kyuubi/externals
Cheng Pan d474768d97
[KYUUBI #6018] Speed up GetTables operation for Spark session catalog
# 🔍 Description
## Issue References 🔗

This pull request aims to speed up the GetTables operation for the Spark session catalog.
As reported in https://github.com/apache/kyuubi/discussions/4956 and https://github.com/apache/kyuubi/discussions/5949, the GetTables operation is quite slow in some cases. In https://github.com/apache/kyuubi/pull/4444, `kyuubi.operation.getTables.ignoreTableProperties` was introduced to speed up the operation for V2 catalogs, but it does not cover the session catalog.

## Describe Your Solution 🔧

Extend the scope of `kyuubi.operation.getTables.ignoreTableProperties` to cover the GetTables operation for the Spark session catalog.

Currently, the basic steps of GetTables in the Spark engine are:
```
val catalog = getCatalog(spark, catalogName)
val databases: Seq[String] = sessionCatalog.listDatabases(schemaPattern)
// for each db in databases:
val identifiers: Seq[TableIdentifier] = sessionCatalog.listTables(db, tablePattern, includeLocalTempViews = false)
val tableObjects: Seq[CatalogTable] = sessionCatalog.getTablesByName(identifiers)
```
Then `tableObjects` is filtered by `tableTypes: Set[String]`.

The cost of `catalog.getTablesByName(identifiers)` is quite high when the number of tables is large, e.g. tens of thousands.

In some cases, tables are listed only to display their names. There, it is worth speeding up the operation at the cost of ignoring some properties (e.g. table comments) and query criteria. Specifically, when `kyuubi.operation.getTables.ignoreTableProperties=true`, the `tableTypes` criterion is ignored, and all tables and views are returned as type TABLE.
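
The trade-off described above can be illustrated with a minimal, self-contained sketch. It does not use Spark or Kyuubi classes: `TableId`, `TableMeta`, `TableRow`, `FakeCatalog` and `getTables` are hypothetical stand-ins invented for this illustration, and the `metadataFetches` counter only models the per-table cost of `getTablesByName`. It is not the code of this patch.

```scala
object GetTablesSketch {
  // Simplified stand-ins, NOT Spark's TableIdentifier / CatalogTable.
  final case class TableId(db: String, name: String)
  final case class TableMeta(id: TableId, tableType: String, comment: String)
  // One row returned by GetTables in this sketch.
  final case class TableRow(db: String, name: String, tableType: String, comment: String)

  // Hypothetical in-memory catalog that counts per-table metadata fetches.
  final class FakeCatalog(tables: Seq[TableMeta]) {
    var metadataFetches: Int = 0

    // Cheap step: returns identifiers only.
    def listTables(db: String): Seq[TableId] =
      tables.collect { case t if t.id.db == db => t.id }

    // Expensive step (assumed): one metadata fetch per identifier.
    def getTablesByName(ids: Seq[TableId]): Seq[TableMeta] = {
      metadataFetches += ids.size
      tables.filter(t => ids.contains(t.id))
    }
  }

  def getTables(
      catalog: FakeCatalog,
      db: String,
      tableTypes: Set[String],
      ignoreTableProperties: Boolean): Seq[TableRow] = {
    val ids = catalog.listTables(db)
    if (ignoreTableProperties) {
      // Fast path: skip the metadata fetch, ignore tableTypes,
      // and report every table or view as TABLE with no comment.
      ids.map(id => TableRow(id.db, id.name, "TABLE", ""))
    } else {
      // Default path: fetch full metadata, then filter by requested types.
      catalog.getTablesByName(ids)
        .filter(t => tableTypes.contains(t.tableType))
        .map(t => TableRow(t.id.db, t.id.name, t.tableType, t.comment))
    }
  }
}
```

In this model, a request for `Set("VIEW")` on the default path fetches metadata for every listed table before filtering, while the fast path performs no fetch at all but can no longer tell views from tables or return comments.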

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6018 from pan3793/fast-get-table.

Closes #6018

058001c6f [Cheng Pan] fix
405b12484 [Cheng Pan] fix
615b7470f [Cheng Pan] Speed up GetTables operation

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-01-29 14:21:09 +08:00
| Name | Latest commit | Date |
| --- | --- | --- |
| kyuubi-chat-engine | [KYUUBI #5870] Directly mapping engine's data type to Java type for TRowSet generation | 2023-12-21 12:16:49 +08:00 |
| kyuubi-download | [KYUUBI #4279] Use new Apache 'closer.lua' syntax for kyuubi-download to obtain engine | 2024-01-24 12:46:53 +08:00 |
| kyuubi-flink-sql-engine | [KYUUBI #5953] [LICENSE] Update NOTICE | 2024-01-10 19:29:01 +08:00 |
| kyuubi-hive-sql-engine | [KYUUBI #5953] [LICENSE] Update NOTICE | 2024-01-10 19:29:01 +08:00 |
| kyuubi-jdbc-engine | [KYUUBI #5906] [JDBC] Rebase Doris dialect implementation | 2023-12-23 00:12:52 +08:00 |
| kyuubi-spark-sql-engine | [KYUUBI #6018] Speed up GetTables operation for Spark session catalog | 2024-01-29 14:21:09 +08:00 |
| kyuubi-trino-engine | [KYUUBI #5968] Support set authentication user for Trino engine | 2024-01-21 14:26:12 +08:00 |