[native_datafusion] [Spark SQL Tests] _tmp_metadata_row_index not populated #3317

@andygrove

Summary

Two Spark SQL tests fail because the native_datafusion scan does not populate the _tmp_metadata_row_index metadata column.

Failing Tests

  • FileMetadataStructRowIndexSuite: "reading _tmp_metadata_row_index - not present in a table" — returns 0 instead of expected row indices
  • FileMetadataStructRowIndexSuite: "reading _tmp_metadata_row_index - present in a table" — returns 0 instead of expected row indices

Root Cause

The native_datafusion scan does not generate row index metadata. As a result, the tests get a count of 0 instead of the expected 100 rows.

Possible Fix

In CometScanRule.nativeDataFusionScan(), detect when row index metadata columns are requested in the schema and fall back to native_iceberg_compat.
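
A rough, language-neutral sketch of that check, written in Python only so it is runnable on its own. It is not Comet code: `Field`, `requests_row_index`, and `choose_scan_impl` are hypothetical names, and descending into nested fields is an assumption. The real change would live in CometScanRule.nativeDataFusionScan() and operate on the Spark schema.

```python
from dataclasses import dataclass, field
from typing import List

# Column name as given in this issue.
ROW_INDEX_COLUMN = "_tmp_metadata_row_index"


@dataclass
class Field:
    """Hypothetical stand-in for a schema field (not a Spark or Comet type)."""
    name: str
    children: List["Field"] = field(default_factory=list)


def requests_row_index(fields: List[Field]) -> bool:
    """Return True if any field, at any nesting level, is the row index column.

    Checking nested fields is an assumption; a top-level check may be enough.
    """
    for f in fields:
        if f.name == ROW_INDEX_COLUMN or requests_row_index(f.children):
            return True
    return False


def choose_scan_impl(required_schema: List[Field]) -> str:
    """Pick a scan implementation name for the requested schema."""
    if requests_row_index(required_schema):
        # native_datafusion does not populate the column, so fall back.
        return "native_iceberg_compat"
    return "native_datafusion"
```

For example, `choose_scan_impl([Field("id"), Field("_tmp_metadata_row_index")])` selects the fallback, while a schema without the row index column keeps native_datafusion.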

Related

Discovered in CI for #3307 (enable native_datafusion in auto scan mode).
