kyuubi/extensions
tian bao 60371b5dd5
[KYUUBI #7122] Support ORC hive table pushdown filter
### Why are the changes needed?

Previously, the `HiveScan` class was always used to read data. When a table is determined to be in ORC format, the `ORCScan` from Spark DataSource V2 can be used instead. `ORCScan` supports filter pushdown, whereas `HiveScan` does not yet support it.

In our testing, this change yielded an approximately 2x performance improvement.

The conversion can be controlled by setting `spark.sql.kyuubi.hive.connector.read.convertMetastoreOrc`. When enabled, the data source ORC reader, rather than Hive SerDe, is used to process ORC tables created with the HiveQL syntax.
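
As a rough illustration of the setting described above, the sketch below enables the conversion when building a Spark session and runs a filtered query. The catalog name `hive_catalog` and the table `db.orc_events` are hypothetical placeholders. The catalog class name is an assumption inferred from the connector's package path. The sketch also assumes the Kyuubi Spark Hive connector jar is on the classpath and that the job is launched via `spark-submit`.

```scala
import org.apache.spark.sql.SparkSession

object OrcPushdownSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kshc-orc-pushdown-sketch")
      // Hypothetical catalog name; the class name is assumed from the connector's package.
      .config("spark.sql.catalog.hive_catalog",
        "org.apache.kyuubi.spark.connector.hive.HiveTableCatalog")
      // Ask the connector to read Hive ORC tables with the data source ORC reader.
      .config("spark.sql.kyuubi.hive.connector.read.convertMetastoreOrc", "true")
      .getOrCreate()

    // Hypothetical table; the WHERE predicate is a candidate for pushdown.
    val df = spark.sql("SELECT * FROM hive_catalog.db.orc_events WHERE id > 100")

    // Print the physical plan to inspect which scan was chosen
    // and whether any filters were pushed down.
    df.explain()

    spark.stop()
  }
}
```

Comparing the `explain()` output with the setting enabled and disabled is one way to check whether the ORC scan path is actually being taken for a given table.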

close https://github.com/apache/kyuubi/issues/7122

### How was this patch tested?

added unit test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #7123 from flaming-archer/master_scanbuilder_new.

Closes #7122

c3f412f90 [tian bao] add case _
2be48909f [tian bao] Merge branch 'master_scanbuilder_new' of github.com:flaming-archer/kyuubi into master_scanbuilder_new
c825d0f8c [tian bao] review change
8a26d6a8a [tian bao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/KyuubiHiveConnectorConf.scala
68d41969f [tian bao] review change
bed007fea [tian bao] review change
b89e6e67a [tian bao] Optimize UT
5a8941b2d [tian bao] fix failed ut
dc1ba47e3 [tian bao] orc pushdown version 0

Authored-by: tian bao <2011xuesong@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 13:38:51 +08:00
..
flink/kyuubi-flink-token-provider [KYUUBI #6820] Explicitly disable attach-scaladocs for pure Java modules 2024-11-22 17:12:00 +08:00
server/kyuubi-server-plugin [KYUUBI #6820] Explicitly disable attach-scaladocs for pure Java modules 2024-11-22 17:12:00 +08:00
spark [KYUUBI #7122] Support ORC hive table pushdown filter 2025-07-09 13:38:51 +08:00
README.md

For developers

This folder contains plugins and extensions for the Kyuubi server and the different engine types.

  • ext
    • kyuubi-server
    • spark
    • flink
    • trino
    • hive
    • others
    • ...