4330 Commits

Author SHA1 Message Date
Wang, Fei
a1a08e7f93
[KYUUBI #7132] Respect kyuubi.session.engine.startup.waitCompletion for wait engine completion
### Why are the changes needed?

We should not fail the batch submission if the submit process is still alive and waiting for engine completion is disabled.

Especially for Spark on Kubernetes, the app might fail with the NOT_FOUND state if the spark-submit process runs longer than the submit timeout.

In this PR, if `kyuubi.session.engine.startup.waitCompletion` is false, the current timestamp is used as the submit time when getting the application info, which prevents the app from failing with the NOT_FOUND state due to the submit timeout.
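
The rule described above can be sketched roughly as follows. This is an illustration in Python, not the Kyuubi implementation: the function `resolve_app_state`, its parameters, and the `PENDING` placeholder state are all hypothetical, and combining the "process alive" and "wait completion disabled" conditions this way is an assumption drawn from the description.

```python
import time
from typing import Optional


def resolve_app_state(
    reported_state: Optional[str],
    submit_time_ms: int,
    submit_timeout_ms: int,
    wait_completion: bool,
    process_alive: bool,
    now_ms: Optional[int] = None,
) -> str:
    """Hypothetical helper: decide which state to report for a batch application."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    if reported_state is not None:
        # The resource manager already knows the application; report its state.
        return reported_state
    if not wait_completion and process_alive:
        # Assumption: treat "now" as the submit time, so the submit timeout
        # cannot have elapsed while the submit process is still running.
        submit_time_ms = now_ms
    if now_ms - submit_time_ms > submit_timeout_ms:
        return "NOT_FOUND"
    # Placeholder state for "not visible yet, still within the timeout".
    return "PENDING"
```

With `wait_completion=True`, an application that stays invisible past the timeout resolves to `NOT_FOUND`; with `wait_completion=False` and a live submit process, the same inputs stay in the placeholder state.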

### How was this patch tested?

Pass current GA and manually testing.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7132 from turboFei/batch_submit.

Closes #7132

efb06db1c [Wang, Fei] refine
7e453c162 [Wang, Fei] refine
7bca1a7aa [Wang, Fei] Prevent potential timeout durartion polling the application info
15529ab85 [Wang, Fei] prevent metadata manager fail
38335f2f9 [Wang, Fei] refine
9b8a9fde4 [Wang, Fei] comments
11f607daa [Wang, Fei] docs
f2f6ba148 [Wang, Fei] revert
2da0705ad [Wang, Fei] wait for if not wait complete
d84963420 [Wang, Fei] revert check in loop
b4cf50a49 [Wang, Fei] comments
8c262b7ec [Wang, Fei] refine
ecf379b86 [Wang, Fei] Revert conf change
60dc1676e [Wang, Fei] enlarge
4d0aa542a [Wang, Fei] Save
4aea96552 [Wang, Fei] refine
2ad75fcbf [Wang, Fei] nit
a71b11df6 [Wang, Fei] Do not fail batch if the process is alive

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-14 01:49:06 +08:00
Cheng Pan
c0d4980dab
[KYUUBI #7135] Fix cannot access /tmp/engine-archives: No such file or directory
### Why are the changes needed?

Fix
```
Run ls -lh /tmp/engine-archives
ls: cannot access '/tmp/engine-archives': No such file or directory
Error: Process completed with exit code 2.
```

### How was this patch tested?

GHA

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7135 from pan3793/gha-cache-fix.

Closes #7135

99ef56082 [Cheng Pan] Fix cannot access /tmp/engine-archives: No such file or directory

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-11 10:52:08 +08:00
Cheng Pan
f9ca96a5c4
[KYUUBI #7131] Print cached engine archives
### Why are the changes needed?

Recently, GHA has been failing frequently because engine downloads fail. This adds logs to display the cached engine archives.

### How was this patch tested?

I will monitor GHA failure after merging.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7131 from pan3793/gha-cache.

Closes #7131

87a38e0d6 [Cheng Pan] Print cached engine archives

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 18:28:40 +08:00
z1131392774
70a2451951
[KYUUBI #7118] docs(client): Add comprehensive docs for Engine Pool feature
Adds documentation for the Engine Pool feature to the `Share Level Of Kyuubi Engines` page. The separate and now-redundant `engine_pool.rst` file has been deleted.

### Why are the changes needed?

The `Share Level Of Kyuubi Engines` documentation previously lacked a section on the Engine Pool feature.

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?
Yes. Generative AI tools were used in several stages of preparing this contribution:

- The initial draft of the documentation was written with the assistance of Claude.
- Revisions and suggestions were provided by Gemini.
- Commit messages and pull request descriptions were refined for clarity using Gemini.

Generated-by: Claude 4, Gemini 2.5 pro

Closes #7118 from z1131392774/docs/engine-pool-feature.

Closes #7118

61892a22c [z1131392774] restore unnecessary style changes
02b4c6729 [z1131392774] ci: Re-trigger CI checks
60154e00a [z1131392774] fix with spotless
9f0700774 [z1131392774] docs: remove redundant engine_pool.md
bf586f700 [z1131392774] update with SERVER_LOCAL and Engine Pool
4d34a5bfd [z1131392774] fix format with spotless
9ed2e8a61 [z1131392774] docs(client): Add comprehensive docs for Engine Pool feature

Lead-authored-by: z1131392774 <z1131392774@gmail.com>
Co-authored-by: z1131392774 <156416015+z1131392774@users.noreply.github.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 17:34:06 +08:00
tian bao
60371b5dd5
[KYUUBI #7122] Support ORC hive table pushdown filter
### Why are the changes needed?

Previously, the `HiveScan` class was used to read data. If the table is determined to be of ORC type, the `ORCScan` from Spark DataSource V2 can be used instead. `ORCScan` supports filter pushdown, but `HiveScan` does not yet support it.

In our testing, we achieved approximately a 2x performance improvement.

The conversion can be controlled by setting `spark.sql.kyuubi.hive.connector.read.convertMetastoreOrc`. When enabled, the data source ORC reader is used to process ORC tables created by using the HiveQL syntax, instead of Hive SerDe.

close https://github.com/apache/kyuubi/issues/7122

### How was this patch tested?

added unit test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #7123 from flaming-archer/master_scanbuilder_new.

Closes #7122

c3f412f90 [tian bao] add case _
2be48909f [tian bao] Merge branch 'master_scanbuilder_new' of github.com:flaming-archer/kyuubi into master_scanbuilder_new
c825d0f8c [tian bao] review change
8a26d6a8a [tian bao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/KyuubiHiveConnectorConf.scala
68d41969f [tian bao] review change
bed007fea [tian bao] review change
b89e6e67a [tian bao] Optimize UT
5a8941b2d [tian bao] fix failed ut
dc1ba47e3 [tian bao] orc pushdown version 0

Authored-by: tian bao <2011xuesong@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 13:38:51 +08:00
wangzhigang
84928184fc
[KYUUBI #7121] Improve operation timeout management with configurable executors
### Why are the changes needed?

The current mechanism for handling operation timeouts in Kyuubi creates a new `ScheduledExecutorService` with a dedicated thread for each operation. In scenarios with a large number of concurrent operations, this results in excessive thread creation, which consumes substantial system resources and may adversely affect server performance and stability.

This PR introduces a shared `ScheduledThreadPool` within the Operation Manager to centrally schedule operation timeouts. This approach avoids the overhead of creating an excessive number of threads, thereby reducing the system load. Additionally, both the pool size and thread keep-alive time are configurable via the `OPERATION_TIMEOUT_POOL_SIZE` and `OPERATION_TIMEOUT_POOL_KEEPALIVE_TIME` parameters.

### How was this patch tested?

A new unit test for `newDaemonScheduledThreadPool` was added to `ThreadUtilsSuite.scala`. Furthermore, a dedicated `TimeoutSchedulerSuite` was introduced to verify operation timeout behavior.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7121 from wangzhigang1999/master.

Closes #7121

df7688dbf [wangzhigang] Refactor timeout management configuration and improve documentation
2b03b1e68 [wangzhigang] Remove deprecated `ThreadPoolTimeoutExecutor` class following refactor of operation timeout management.
52a8a516a [wangzhigang] Refactor operation timeout management to use per-OperationManager scheduler
7e46d47f8 [wangzhigang] Refactor timeout management by introducing ThreadPoolTimeoutExecutor
f7f10881a [wangzhigang] Add operation timeout management with ThreadPoolTimeoutExecutor
d8cd6c7d4 [wangzhigang] Update .gitignore to exclude .bloop and .metals directories

Lead-authored-by: wangzhigang <wangzhigang1999@live.cn>
Co-authored-by: wangzhigang <wzg443064@alibaba-inc.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 10:51:30 +08:00
Cheng Pan
efe9f5552c
[KYUUBI #6876] Fix hadoopConf for autoCreateFileUploadPath
### Why are the changes needed?

This change fixes two issues:
1. `KyuubiHadoopUtils.newHadoopConf` should `loadDefaults`, otherwise `core-site.xml`, `hdfs-site.xml` won't take effect.
2. To make it aware of Hadoop conf hot reload, we should use `KyuubiServer.getHadoopConf()`.

### How was this patch tested?

Manually tested that `core-site.xml` takes effect; previously it did not.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7102 from pan3793/6876-followup.

Closes #6876

24989d688 [Cheng Pan] [KYUUBI #6876] Fix hadoopConf for autoCreateFileUploadPath

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 10:14:15 +08:00
Wenjun Ruan
4e40f9457d
[KYUUBI #7109] Ignore the ? in backticks
### Why are the changes needed?
`KyuubiPreparedStatement` splits the SQL by `?`. However, there is a corner case when `?` appears inside backticks.
For example, the SQL below contains `?`, but it should not be split at that `?`.
```sql
SELECT `(ds|hr)?+.+` FROM sales
```
More details can be found at https://hive.apache.org/docs/latest/languagemanual-select_27362043/#regex-column-specification

Hive upstream fix - HIVE-29060
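
A minimal sketch of the idea, written in Python for illustration rather than taken from the Kyuubi JDBC driver: walk the statement once, track whether the cursor is inside backticks, and split only on a `?` seen outside them. The function name `split_on_placeholders` is hypothetical, and the sketch deliberately ignores string literals, comments, and escaped backticks, which a real driver may also need to handle.

```python
from typing import List


def split_on_placeholders(sql: str) -> List[str]:
    """Split sql at each '?' that is outside a backtick-quoted identifier."""
    parts: List[str] = []
    current: List[str] = []
    in_backticks = False
    for ch in sql:
        if ch == "`":
            # Entering or leaving a backtick-quoted identifier.
            in_backticks = not in_backticks
            current.append(ch)
        elif ch == "?" and not in_backticks:
            # A real parameter placeholder: close the current fragment.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

Under these assumptions the regex column specification from the example comes back as a single fragment, while `WHERE a = ? AND b = ?` is split into three fragments.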

### How was this patch tested?

UT.

### Was this patch authored or co-authored using generative AI tooling?

NO.

Closes #7125 from ruanwenjun/dev_wenjun_fix7109.

Closes #7109

7140980fd [ruanwenjun] [KYUUBI #7109] Ignore the ? in backticks

Lead-authored-by: Wenjun Ruan <wenjun@apache.org>
Co-authored-by: ruanwenjun <zyb@wenjuns-MacBook-Pro-2.local>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-07 20:56:36 +08:00
Cheng Pan
e98ad7bf32
[KYUUBI #7112] Enhance test 'capture error from spark process builder' for Spark 4.0
### Why are the changes needed?

```
- capture error from spark process builder *** FAILED ***
  The code passed to eventually never returned normally. Attempted 167 times over 1.50233072485 minutes. Last failure message: "org.apache.kyuubi.KyuubiSQLException: 	Suppressed: org.apache.spark.util.Utils$OriginalTryStackTraceException: Full stacktrace of original doTryWithCallerStacktrace caller
   See more: /builds/lakehouse/kyuubi/kyuubi-server/target/work/kentyao/kyuubi-spark-sql-engine.log.2
  	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
  	at org.apache.kyuubi.engine.ProcBuilder.$anonfun$start$1(ProcBuilder.scala:234)
  	at java.base/java.lang.Thread.run(Thread.java:840)
  .
  FYI: The last 4096 line(s) of log are:
...
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
  	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1742)
  	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
  	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
  	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
  	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3607)
  	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3659)
  	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3639)
  	at org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$getHive$5(HiveClientImpl.scala:1458)
...
  25/06/19 18:20:08 INFO SparkContext: Successfully stopped SparkContext
  25/06/19 18:20:08 INFO ShutdownHookManager: Shutdown hook called
  25/06/19 18:20:08 INFO ShutdownHookManager: Deleting directory /tmp/spark-791ea5a0-44d2-4750-a549-a3ea232546b2
  25/06/19 18:20:08 INFO ShutdownHookManager: Deleting directory /tmp/spark-1ab9d4a0-707d-4619-bc83-232c29c891f9
  25/06/19 18:20:08 INFO ShutdownHookManager: Deleting directory /builds/lakehouse/kyuubi/kyuubi-server/target/work/kentyao/artifacts/spark-9ee628b1-0c29-4d32-8078-c023d1f812d7" did not contain "org.apache.hadoop.hive.ql.metadata.HiveException:". (SparkProcessBuilderSuite.scala:79)
```

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7112 from pan3793/ut-spark-4.0.

Closes #7112

bd4a24bea [Cheng Pan] Enhance test 'capture error from spark process builder' for Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-02 18:19:28 +08:00
cutiechi
4717987e37
[KYUUBI #7113] Skip Hadoop classpath check if flink-shaded-hadoop jar exists in Flink lib directory
### Why are the changes needed?

This change addresses an issue where the Flink engine in Kyuubi would perform a Hadoop classpath check even when a `flink-shaded-hadoop` jar is already present in the Flink `lib` directory. In such cases, the check is unnecessary and may cause confusion or warnings in environments where the shaded jar is used instead of a full Hadoop classpath. By skipping the check when a `flink-shaded-hadoop` jar exists, we improve compatibility and reduce unnecessary log output.
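
As a rough illustration of the skip condition (in Python, not the Scala code in `FlinkProcessBuilder`), the check can be gated on whether the Flink `lib` directory contains a shaded Hadoop jar. The helper names and the glob pattern `flink-shaded-hadoop*.jar` are assumptions made for this sketch.

```python
from pathlib import Path


def has_shaded_hadoop_jar(flink_lib_dir: Path) -> bool:
    """Return True if a flink-shaded-hadoop jar exists in the given lib directory."""
    # Assumed naming pattern; adjust it to the jars actually shipped.
    return any(p.is_file() for p in flink_lib_dir.glob("flink-shaded-hadoop*.jar"))


def needs_hadoop_classpath_check(flink_lib_dir: Path) -> bool:
    """The Hadoop classpath check is only needed when no shaded jar is present."""
    return not has_shaded_hadoop_jar(flink_lib_dir)
```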

### How was this patch tested?

The patch was tested by deploying Kyuubi with a Flink environment that includes a `flink-shaded-hadoop` jar in the `lib` directory and verifying that the classpath check is correctly skipped. Additional tests ensured that the check still occurs when neither the Hadoop classpath nor the shaded jar is present. Unit tests and manual verification steps were performed to confirm the fix.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7113 from cutiechi/fix/flink-classpath-missing-hadoop-check.

Closes #7113

99a4bf834 [cutiechi] fix(flink): fix process builder suite
7b9998760 [cutiechi] fix(flink): remove hadoop cp add
ea33258a3 [cutiechi] fix(flink): update flink hadoop classpath doc
6bb3b1dfa [cutiechi] fix(flink): optimize hadoop class path messages
c548ed6a1 [cutiechi] fix(flink): simplify classpath detection by merging hasHadoopJar conditions
9c16d5436 [cutiechi] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/flink/FlinkProcessBuilder.scala
0f729dcf9 [cutiechi] fix(flink): skip hadoop classpath check if flink-shaded-hadoop jar exists

Authored-by: cutiechi <superchijinpeng@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-02 17:33:07 +08:00
namaagra
8c5f461dfb
[KYUUBI #6924] Upgrade Spark Ranger plugin to 2.6.0
This pull request fixes #6924

## Describe Your Solution 🔧

Bump ranger version to 2.6.0
Release notes: https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+2.6.0+-+Release+Notes

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #7124 from namanagraw/ranger_upgrade.

Closes #6924

bade24db8 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/README.md
650f27319 [namaagra] [KYUUBI apache#6924] Upgrade Spark Ranger plugin to 2.6.0

Lead-authored-by: namaagra <namaagra@visa.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-02 17:31:21 +08:00
davidyuan
31bbb536f2
[KYUUBI #7100] [#7099] Ranger Support Check Iceberg Alter Table Command & Change Iceberg Test Use Jdbc Catalog
Parent Issue #7040
Support Check Iceberg Alter Table Command
### Why are the changes needed?

- [x] Alter Table Rename To
- [x] Alter Table Set Properties
- [x] Alter Table Unset Properties
- [x] Alter Table Add Column
- [x] Alter Table Rename Column
- [x] Alter Table Alter Column
- [x] Alter Table Drop Column

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

Closes #7100 from davidyuan1223/iceberg_alter_table_check.

Closes #7100

4be2210f1 [davidyuan] update
53eda10eb [davidyuan] update

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-06-26 10:11:43 +08:00
Wang, Fei
aaac07fa55 [KYUUBI #7110] Fix serverOnlyPrefixConfigKeys is iterator issue
### Why are the changes needed?

Followup for #7055
Before this PR, `serverOnlyPrefixConfigKeys` was an iterator.

After being iterated once, it became empty.

In this PR, we convert it to a `Set` to fix this issue.
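
The pitfall is not specific to Scala. The following Python sketch shows the analogous behavior: an iterator yields its elements only once, while a materialized set can be traversed repeatedly. The prefix strings are hypothetical placeholders, not actual Kyuubi configuration keys.

```python
# Hypothetical prefixes, used only to illustrate the behavior.
prefixes = ["example.server.only.a.", "example.server.only.b."]

# An iterator is consumed by the first traversal.
as_iterator = iter(prefixes)
first_pass = [p for p in as_iterator]
second_pass = [p for p in as_iterator]  # nothing left to yield

# A set is a materialized collection and can be traversed any number of times.
as_set = frozenset(prefixes)
first_set_pass = sorted(as_set)
second_set_pass = sorted(as_set)
```

Here `second_pass` ends up empty, whereas both passes over `as_set` see every prefix.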

### How was this patch tested?

UT.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7110 from turboFei/exclude_prefix.

Closes #7110

91a54b6f0 [Wang, Fei] prefix

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-23 19:27:58 -07:00
dnskr
2021574a33
[KYUUBI #7105] [K8S][HELM] Support additional labels for PrometheusRule
### Why are the changes needed?
The change is needed to be able to add additional labels to `PrometheusRule` similar to [podMonitor](523722788f/charts/kyuubi/values.yaml (L321-L330)) and [serviceMonitor](523722788f/charts/kyuubi/values.yaml (L333-L341)).
The PR also includes minor indentation fixes.

### How was this patch tested?
```shell
helm template kyuubi charts/kyuubi --set metrics.prometheusRule.enabled=true --set metrics.prometheusRule.labels.test-label=true -s templates/kyuubi-alert.yaml
---
# Source: kyuubi/templates/kyuubi-alert.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kyuubi
  labels:
    helm.sh/chart: kyuubi-0.1.0
    app.kubernetes.io/name: kyuubi
    app.kubernetes.io/instance: kyuubi
    app.kubernetes.io/version: "1.10.0"
    app.kubernetes.io/managed-by: Helm
    test-label: true
spec:
  groups:
    []
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #7105 from dnskr/helm-prometheusRule-labels.

Closes #7105

234d99da3 [dnskr] [K8S][HELM] Support additional labels for PrometheusRule

Authored-by: dnskr <dnskrv88@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-06-23 23:48:15 +08:00
Cheng Pan
bdeb29451c
[KYUUBI #7076] Update known_translations
### Why are the changes needed?

Routine work.

### How was this patch tested?

Review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7076 from pan3793/minor.

Closes #7076

546fb5196 [Cheng Pan] Update known_translations

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-06-23 23:34:58 +08:00
Frank Bertsch
b49ed02f16
[KYUUBI #7106] Make response.results.columns optional
### Why are the changes needed?
Bugfix. Spark 3.5 is returning `None` for `response.results.columns`, while Spark 3.3 returned actual values.

The response here: https://github.com/apache/kyuubi/blob/master/python/pyhive/hive.py#L507

For a query that does nothing (mine was an `add jar s3://a/b/c.jar`), here are the responses I received.

Spark 3.3:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=[TColumn(boolVal=None, byteVal=None, i16Val=None, i32Val=None, i64Val=None, doubleVal=None, stringVal=TStringColumn(values=[], nulls=b'\x00'), binaryVal=None)], binaryColumns=None, columnCount=None))
```

Spark 3.5:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=None, binaryColumns=None, columnCount=None))
```
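
A defensive way to handle both response shapes is to treat a missing `columns` field as an empty list. The sketch below is illustrative only and is not the patch applied to `pyhive`: `RowSet` is a hypothetical stand-in for the Thrift `TRowSet`, and `column_values` is an invented helper name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowSet:
    """Hypothetical stand-in for the Thrift TRowSet, keeping only `columns`."""
    columns: Optional[List[list]] = None


def column_values(results: RowSet) -> List[list]:
    """Return the columns of a row set, treating None as 'no columns'."""
    return list(results.columns or [])
```

With this guard, a row set whose `columns` is `None` and one whose `columns` is an empty list both yield an empty result instead of raising a `TypeError` during iteration.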

### How was this patch tested?
I tested by applying it locally and running my query against Spark 3.5. I was not able to get any unit tests running, sorry!

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #7107 from fbertsch/spark_3_5_fix.

Closes #7106

13d1440a8 [Frank Bertsch] Make response.results.columns optional

Authored-by: Frank Bertsch <fbertsch@netflix.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-06-23 23:28:18 +08:00
Wang, Fei
e769f42398 [KYUUBI #6884] [FEATURE] Support to reassign the batches to alternative kyuubi instance in case kyuubi instance lost
### Why are the changes needed?

Support reassigning batches to an alternative Kyuubi instance when a Kyuubi instance is lost.
https://github.com/apache/kyuubi/issues/6884

### How was this patch tested?

Unit Test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7037 from George314159/6884.

Closes #6884

8565d4aaa [Wang, Fei] KYUUBI_SESSION_CONNECTION_URL_KEY
22d4539e2 [Wang, Fei] admin
075654cb3 [Wang, Fei] check admin
5654a99f4 [Wang, Fei] log and lock
a19e2edf5 [Wang, Fei] minor comments
a60f23ba3 [George314159] refine
760e10f89 [George314159] Update Based On Comments
75f1ee2a9 [Fei Wang] ping (#1)
f42bcaf9a [George314159] Update Based on Comments
1bea70ed6 [George314159] [KYUUBI-6884] Support to reassign the batches to alternative kyuubi instance in case kyuubi instance lost

Lead-authored-by: Wang, Fei <fwang12@ebay.com>
Co-authored-by: George314159 <hua16732@gmail.com>
Co-authored-by: Fei Wang <cn.feiwang@gmail.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-22 22:36:51 -07:00
Wang, Fei
302b5fa1e6 [KYUUBI #7101] Load the existing pods when initializing kubernetes client to cleanup terminated app pods
### Why are the changes needed?

To prevent terminated app pods from leaking if events are missed during a Kyuubi server restart.

### How was this patch tested?

Manual test.

```
:2025-06-17 17:50:37.275 INFO [main] org.apache.kyuubi.engine.KubernetesApplicationOperation: [KubernetesInfo(Some(28),Some(dls-prod))] Found existing pod kyuubi-xb406fc5-7b0b-4fdf-8531-929ed2ae250d-8998-5b406fc5-7b0b-4fdf-8531-929ed2ae250d-8998-90c0b328-930f-11ed-a1eb-0242ac120002-0-20250423211008-grectg-stm-17da59fe-caf4-41e4-a12f-6c1ed9a293f9-driver with label: kyuubi-unique-tag=17da59fe-caf4-41e4-a12f-6c1ed9a293f9 in app state FINISHED, marking it as terminated
2025-06-17 17:50:37.278 INFO [main] org.apache.kyuubi.engine.KubernetesApplicationOperation: [KubernetesInfo(Some(28),Some(dls-prod))] Found existing pod kyuubi-xb406fc5-7b0b-4fdf-8531-929ed2ae250d-8998-5b406fc5-7b0b-4fdf-8531-929ed2ae250d-8998-90c0b328-930f-11ed-a1eb-0242ac120002-0-20250423212011-gpdtsi-stm-6a23000f-10be-4a42-ae62-4fa2da8fac07-driver with label: kyuubi-unique-tag=6a23000f-10be-4a42-ae62-4fa2da8fac07 in app state FINISHED, marking it as terminated
```
The pods are cleaned up eventually.
<img width="664" alt="image" src="https://github.com/user-attachments/assets/8cf58f61-065f-4fb0-9718-2e3c00e8d2e0" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7101 from turboFei/pod_cleanup.

Closes #7101

7f76cf57c [Wang, Fei] async
11c9db25d [Wang, Fei] cleanup

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-22 22:35:14 -07:00
Cheng Pan
364f062852
[KYUUBI #7103] Bump Delta 4.0.0 and enable Delta tests for Spark 4.0
### Why are the changes needed?

https://github.com/delta-io/delta/releases/tag/v4.0.0

### How was this patch tested?

GHA.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7103 from pan3793/delta-4.0.

Closes #7103

febaa11ab [Cheng Pan] Bump Delta 4.0.0 and enable Delta tests for Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-06-19 13:32:17 +08:00
Cheng Pan
7313042c0f
[KYUUBI #7104] Bump Maven 3.9.10
### Why are the changes needed?

Upgrade Maven to the latest version to speed up `build/mvn` downloading, as the previous versions are not available at https://dlcdn.apache.org/maven/maven-3/

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7104 from pan3793/maven-3.9.10.

Closes #7104

48aa9a232 [Cheng Pan] Bump Maven 3.9.10

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-06-19 12:54:28 +08:00
jasonj
523722788f
[KYUUBI #7098] Add ability to annotate pods and headless service in Kyuubi helm chart
### Why are the changes needed?

Support adding arbitrary annotations to Kyuubi pods and services - for example, those needed for annotation-based auto-discovery via [k8s-monitoring-helm](https://github.com/grafana/k8s-monitoring-helm/blob/main/charts/k8s-monitoring/docs/examples/features/annotation-autodiscovery/default/README.md)

### How was this patch tested?

Helm chart installs with and without annotations added

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7098 from jasonj/master.

Closes #7098

70f740d03 [jasonj] Add ability to annotate pods and headless service

Authored-by: jasonj <jason@interval.xyz>
Signed-off-by: Kent Yao <yao@apache.org>
2025-06-17 14:23:48 +08:00
Wang, Fei
7fbeea66fd [KYUUBI #7072][FOLLOWUP] Fix engine startup permit grafana pannel unit
### Why are the changes needed?

Followup for https://github.com/apache/kyuubi/pull/7072
The metrics unit should not be `ms`.

### How was this patch tested?

<img width="569" alt="image" src="https://github.com/user-attachments/assets/df83b003-762d-4ee2-bbe1-c1af55ae9795" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7091 from turboFei/7072_followup.

Closes #7072

d7c4fe4f9 [Wang, Fei] fix unit

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-12 23:34:59 -07:00
Wang, Fei
bada9c0411 [KYUUBI #7095] Respect terminated app state when building batch info from metadata
### Why are the changes needed?

Respect terminated app state when building batch info from metadata

It is a followup for https://github.com/apache/kyuubi/pull/2911,
9e40e39c39/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala (L128-L142)

1. The Kyuubi instance is unreachable during a maintenance window.
2. The batch app has terminated, and the app state was backfilled by a peer Kyuubi instance, see #2911.
3. The batch state in the metadata table is still PENDING/RUNNING.
4. In such a case, return the terminated batch state instead of `PENDING` or `RUNNING`.
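
The four conditions above can be sketched as a small decision function. This is a Python illustration under stated assumptions, not the logic in `BatchesResource.scala`: the function name, the set of terminal app states, and the mapping from app state to batch state are all hypothetical.

```python
# Assumed mapping from a terminal application state to the batch state to report.
TERMINAL_APP_TO_BATCH_STATE = {
    "FINISHED": "FINISHED",
    "FAILED": "ERROR",
    "KILLED": "CANCELED",
}
ACTIVE_BATCH_STATES = {"PENDING", "RUNNING"}


def effective_batch_state(metadata_state: str, app_state: str) -> str:
    """Prefer a terminated app state over a stale PENDING/RUNNING metadata state."""
    if metadata_state in ACTIVE_BATCH_STATES and app_state in TERMINAL_APP_TO_BATCH_STATE:
        return TERMINAL_APP_TO_BATCH_STATE[app_state]
    return metadata_state
```
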
### How was this patch tested?

GA and IT.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7095 from turboFei/always_respect_appstate.

Closes #7095

ec72666c9 [Wang, Fei] rename
bc74a9c56 [Wang, Fei] if op not terminated
e786c8d9b [Wang, Fei] respect terminated app state when building batch info from metadata

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-12 10:20:44 -07:00
Wang, Fei
00ff464b06 [KYUUBI #7094] Add serverOnly flag for metrics config items
### Why are the changes needed?

MetricsSystem is only used by KyuubiServer, so all the metrics config items are server-only.
### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7094 from turboFei/serverOnly.

Closes #7094

8324419dd [Wang, Fei] Add server only flag for metrics conf

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-11 22:32:46 -07:00
Wang, Fei
a3b21f6f80 [KYUUBI #7097] [INFRA] Write sorted authors for release contributors
### Why are the changes needed?

Write sorted authors to release contributors file
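
The sample output in this commit message is ordered without regard to letter case (for example, `liaoyt` sits between `Kang Wang` and `Mingliang Zhu`). A minimal sketch of such an ordering follows; it is not the code in `pre_gen_release_notes.py`, the helper name `sorted_authors` is hypothetical, and the de-duplication step is an assumption.

```python
from typing import Iterable, List


def sorted_authors(authors: Iterable[str]) -> List[str]:
    """Return unique, non-empty author names ordered case-insensitively."""
    unique = {name.strip() for name in authors if name.strip()}
    return sorted(unique, key=str.casefold)
```

For example, `sorted_authors(["mrtisttt", "Kang Wang", "liaoyt", "Ocean22", "Mingliang Zhu"])` places `liaoyt` after `Kang Wang` and `mrtisttt` before `Ocean22`, matching the ordering style of the listing.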

### How was this patch tested?

```
(base) ➜  kyuubi git:(sort_author) ✗ RELEASE_TAG=v1.8.1 PREVIOUS_RELEASE_TAG=v1.8.0 ./build/release/pre_gen_release_notes.py

(base) ➜  kyuubi git:(sort_author) ✗ cat build/release/contributors-v1.8.1.txt
* Binjie Yang
* Bowen Liang
* Chao Chen
* Cheng Pan
* David Yuan
* Fei Wang
* Flyangz
* Gianluca Principini
* He Zhao
* Junjie Ma
* Kaifei Yi
* Kang Wang
* liaoyt
* Mingliang Zhu
* mrtisttt
* Ocean22
* Paul Lin
* Peiyue Liu
* Pengqi Li
* Senmiao Liu
* Shaoyun Chen
* SwordyZhao
* Tao Wang
* William Tong
* Xiao Liu
* Yi Zhu
* Yifan Zhou
* Yuwei Zhan
* Zeyu Wang
* Zhen Wang
* Zhiming She

```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7097 from turboFei/sort_author.

Closes #7097

45dfb8f1e [Wang, Fei] Write sorted authors

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-11 22:32:14 -07:00
Wang, Fei
9e40e39c39 [KYUUBI #7093] Log the metadata cleanup count
### Why are the changes needed?

To show how many metadata records were cleaned up.
### How was this patch tested?

```
(base) ➜  kyuubi git:(delete_metadata) grep 'Cleaned up' target/unit-tests.log
01:58:17.109 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.124 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.144 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.161 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.180 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.199 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.216 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.236 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.253 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.270 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.290 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.310 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.327 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.348 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.368 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.384 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.400 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.419 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.437 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.456 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.475 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.493 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.513 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.533 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.551 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.569 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.590 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.611 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.631 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.651 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.668 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.688 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.705 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.725 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.744 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.764 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.784 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.801 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.822 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.849 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.870 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.889 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.910 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.929 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.948 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.970 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:17.994 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:18.014 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:18.032 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:18.050 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:18.069 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:18.086 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 0 records older than 1000 ms from metadata.
01:58:18.108 ScalaTest-run-running-JDBCMetadataStoreSuite INFO JDBCMetadataStore: Cleaned up 1 records older than 1000 ms from metadata.
01:58:18.162 ScalaTest-run INFO JDBCMetadataStore: Cleaned up 0 records older than 0 ms from k8s_engine_info.
```
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7093 from turboFei/delete_metadata.

Closes #7093

e0cf300f8 [Wang, Fei] update

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-06-10 06:59:57 -07:00
Lennon Chin
cad5a392f3
[KYUUBI #7072] Expose metrics of engine startup permit state
### Why are the changes needed?

The metrics `kyuubi_operation_state_LaunchEngine_*` cannot reflect the state of the Semaphore once a maximum engine startup limit is configured through `kyuubi.server.limit.engine.startup`. This PR adds metrics that show the relevant permit state.
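
As a rough illustration of the kind of permit state such metrics could report, the following Python sketch wraps a counting limiter and exposes the limit, used, and available permits. The class name `StartupPermitLimiter` and the keys returned by `snapshot()` are hypothetical; they are not the names Kyuubi registers.

```python
import threading


class StartupPermitLimiter:
    """Hypothetical limiter that tracks permit usage so it can be reported as gauges."""

    def __init__(self, max_permits: int) -> None:
        self._max = max_permits
        self._used = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        # Grant a permit only while the configured limit has not been reached.
        with self._lock:
            if self._used >= self._max:
                return False
            self._used += 1
            return True

    def release(self) -> None:
        # Return a permit; never drop below zero.
        with self._lock:
            if self._used > 0:
                self._used -= 1

    def snapshot(self) -> dict:
        # Values a metrics system could poll as gauges (key names are illustrative).
        with self._lock:
            return {
                "permit_limit": self._max,
                "permit_used": self._used,
                "permit_available": self._max - self._used,
            }
```

Polling `snapshot()` shows how many startups are in flight relative to the limit, which is information the operation-state counters alone do not carry.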

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

Closes #7072 from LennonChin/engine_startup_metrics.

Closes #7072

d6bf3696a [Lennon Chin] Expose metrics of engine startup permit status

Authored-by: Lennon Chin <i@coderap.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-29 13:27:42 +08:00
zhaohehuhu
bcaff5a3f1
[KYUUBI #7077] Spark 3.5: Enhance MaxScanStrategy for DSv2
### Why are the changes needed?

Enhance the MaxScanStrategy for Spark's DSv2 so that it only applies to relations that support statistics reporting. This prevents Spark from returning a default value of Long.MaxValue, which leads to some queries failing or behaving unexpectedly.
### How was this patch tested?

Tested locally.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7077 from zhaohehuhu/dev-0527.

Closes #7077

64001c94e [zhaohehuhu] fix MaxScanStrategy for datasource v2

Authored-by: zhaohehuhu <luoyedeyi459@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-29 13:25:55 +08:00
davidyuan
1af6647132
[KYUUBI #7068] Iceberg ranger support check branch and tag ddl
### Why are the changes needed?

Support Ranger checks for Iceberg branch and tag DDL.

### How was this patch tested?

- [x] create branch
- [x] replace branch
- [x] drop branch
- [x] create tag
- [x] replace tag
- [x] drop tag

issue #7068
### Was this patch authored or co-authored using generative AI tooling?

Closes #7069 from davidyuan1223/iceberg_branch_check.

Closes #7068

d060a24e1 [davidyuan] update
1e05018d1 [davidyuan] Merge branch 'master' into iceberg_branch_check
be2684671 [davidyuan] update
231ed3356 [davidyuan] sort spi file
6d2a5bf20 [davidyuan] sort spi file
bc21310cc [davidyuan] update
52ca367f1 [davidyuan] update

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-29 13:04:43 +08:00
Cheng Pan
6d99b20e04
[KYUUBI #6870][FOLLOWUP] Correct file name of grafana/README.md
### Why are the changes needed?

Fix a typo of file name.

### How was this patch tested?

Review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7074 from pan3793/6870-f.

Closes #6870

45915d978 [Cheng Pan] [KYUUBI #6870][FOLLOWUP] Correct file name of grafana/README.md

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-26 12:48:30 +08:00
Cheng Pan
526110dfe7
[KYUUBI #6928] Bump Spark 4.0.0
### Why are the changes needed?

Test Spark 4.0.0 RC1
https://lists.apache.org/thread/3sx86qhnmot1p519lloyprxv9h7nt2xh

### How was this patch tested?

GHA.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6928 from pan3793/spark-4.0.0.

Closes #6928

a910169bd [Cheng Pan] Bump Spark 4.0.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-26 12:23:22 +08:00
Cheng Pan
873503e662
[KYUUBI #7073] Retry 3 times on deploying to nexus
### Why are the changes needed?

Retry on deploying failure to overcome the transient issues.

### How was this patch tested?

Review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7073 from pan3793/deploy-retry.

Closes #7073

f42bd663b [Cheng Pan] Retry 3 times on deploying to nexus

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-23 01:55:40 +08:00
Cheng Pan
fc654cfbd3
[KYUUBI #7063] Bump Kyuubi Shaded 0.5.0
### Why are the changes needed?

https://kyuubi.apache.org/shaded-release/0.5.0.html

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7063 from pan3793/kyuubi-shaded-0.5.0.

Closes #7063

b202a7c83 [Cheng Pan] Update pom.xml
417914529 [Cheng Pan] Bump Kyuubi Shaded 0.5.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-22 17:19:07 +08:00
davidyuan
abf947a7ac
[KYUUBI #7065] [#7066] Iceberg Support add partition field check
#7066
### Why are the changes needed?

Some Iceberg checks are missing; this PR adds the check for adding a partition field.

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

Closes #7065 from davidyuan1223/icerberg_authz.

Closes #7065

be2684671 [davidyuan] update
231ed3356 [davidyuan] sort spi file
6d2a5bf20 [davidyuan] sort spi file
bc21310cc [davidyuan] update
52ca367f1 [davidyuan] update

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-05-20 14:53:52 +08:00
Cheng Pan
e366b0950f
[KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0
### Why are the changes needed?

There were some breaking changes after we fixed compatibility for Spark 4.0.0 RC1 in #6920, but now Spark has reached 4.0.0 RC6, which has less chance to receive more breaking changes.

### How was this patch tested?

Changes are extracted from https://github.com/apache/kyuubi/pull/6928, which passed CI with Spark 4.0.0 RC6

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7061 from pan3793/6920-followup.

Closes #6920

17a1bd9e5 [Cheng Pan] [KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-16 11:47:35 +08:00
Cheng Pan
4df682bf83
[KYUUBI #7062] Bump Delta Lake 3.3.1
### Why are the changes needed?

https://github.com/delta-io/delta/releases/tag/v3.3.1

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7062 from pan3793/delta-3.3.1.

Closes #7062

0fc1df8f9 [Cheng Pan] Bump DeltaLake 3.3.1

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-15 22:29:31 +08:00
Fei Wang
2ec8b46f02
[KYUUBI #7055] Support to filter out server only configs with prefixes
### Why are the changes needed?

To filter out server only configs with prefixes.

Some Kyuubi configs have no corresponding ConfigEntry defined, so we cannot filter them out and have to propagate them to the engine side.

For example:
```
kyuubi.kubernetes.28.master.address=k8s://master
kyuubi.backend.server.event.kafka.broker=localhost:9092
kyuubi.metadata.store.jdbc.driver=com.mysql.cj.jdbc.Driver
kyuubi.metadata.store.jdbc.datasource.maximumPoolSize=600
kyuubi.metadata.store.jdbc.datasource.minimumIdle=100
kyuubi.metadata.store.jdbc.datasource.idleTimeout=60000
```

This PR supports excluding them by setting:
```
kyuubi.config.server.only.prefixes=kyuubi.backend.server.event.kafka.,kyuubi.metadata.store.jdbc.datasource.,kyuubi.kubernetes.28.
```
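
The prefix-based exclusion can be sketched in Python as follows. This is a minimal illustration, not Kyuubi's Scala implementation: the helper names `parse_prefixes` and `drop_server_only` are hypothetical, and the sketch assumes the setting is a comma-separated list of key prefixes.

```python
def parse_prefixes(value: str) -> list:
    # Split the comma-separated setting and ignore empty entries.
    return [p.strip() for p in value.split(",") if p.strip()]


def drop_server_only(configs: dict, prefixes: list) -> dict:
    # Keep only the entries whose key does not start with any server-only prefix.
    prefix_tuple = tuple(prefixes)
    return {k: v for k, v in configs.items() if not k.startswith(prefix_tuple)}


setting = (
    "kyuubi.backend.server.event.kafka.,"
    "kyuubi.metadata.store.jdbc.datasource.,"
    "kyuubi.kubernetes.28."
)
configs = {
    "kyuubi.kubernetes.28.master.address": "k8s://master",
    "kyuubi.backend.server.event.kafka.broker": "localhost:9092",
    "kyuubi.metadata.store.jdbc.datasource.maximumPoolSize": "600",
    "kyuubi.engine.share.level": "USER",
}
engine_configs = drop_server_only(configs, parse_prefixes(setting))
```

In this sketch only `kyuubi.engine.share.level` remains in `engine_configs`, since every other key matches one of the configured prefixes.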

### How was this patch tested?

UT
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7055 from turboFei/server_only_configs.

Closes #7055

6c804ff91 [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
bd391a664 [Wang, Fei] exclude

Lead-authored-by: Fei Wang <fwang12@ebay.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-05-11 22:34:46 -07:00
Igor Khrol
61487acfa0
[KYUUBI #7051] Fix usage without sslTrustStore of JDBC driver
### Why are the changes needed?

If `sslTrustStore` is not provided, the existence of the `org.apache.hadoop.conf.Configuration` class becomes a hard dependency.
This makes the JDBC client too complex to configure: extra Hadoop jars must be provided.

The `hadoopCredentialProviderAvailable` variable is useless in the previous implementation, because either it is always `true` or the code is not reachable.

<img width="898" alt="Screenshot 2025-05-09 at 13 05 12" src="https://github.com/user-attachments/assets/6d202555-38c6-40d2-accb-eb78a3d4184e" />

### How was this patch tested?

Built the jar and used it to connect from DataGrip.
<img width="595" alt="Screenshot 2025-05-09 at 13 01 29" src="https://github.com/user-attachments/assets/c6e4d904-a3dd-4d3f-9bdd-8bb47ed1e834" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7058 from Khrol/master.

Closes #7051

b594757a0 [Igor Khrol] JDBC driver: allow usage without sslTrustStore

Authored-by: Igor Khrol <khroliz@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-12 12:57:27 +08:00
Wang, Fei
2dd33e333e
[KYUUBI #7054] Add server only flag for more server/credentials/frontend/metadata configs
### Why are the changes needed?

Reduce the number of Kyuubi server-side configs that are propagated to the engine side.

### How was this patch tested?

UT and code review.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7054 from turboFei/server_only.

Closes #7054

d5855a5db [Wang, Fei] revert kubernetes
b253c336b [Wang, Fei] init

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-05-11 21:15:16 -07:00
taylor.fan
127c736a8f
[KYUUBI #6926] Add SERVER_LOCAL engine share level
### Why are the changes needed?

As clarified in https://github.com/apache/kyuubi/issues/6926, there are some scenarios where users want to launch an engine on each Kyuubi server. The SERVER_LOCAL engine share level implements this by using the local host address as the subdomain, so that each Kyuubi server's engine is unique.

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7013 from taylor12805/share_level_server_local.

Closes #6926

ba201bb72 [taylor.fan] [KYUUBI #6926] update format
42f0a4f7d [taylor.fan] [KYUUBI #6926] move host address to subdomain
e06de79ad [taylor.fan] [KYUUBI #6926] Add SERVER_LOCAL engine share level

Authored-by: taylor.fan <taylor.fan@vipshop.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-04-29 10:42:50 +08:00
John Zhang
c19d923b85
[KYUUBI #7048] Fix KeyError when parsing unknown Hive type_id in schema inspection
This patch adds a try/except block to prevent a `KeyError` when mapping an unknown `type_id` in Hive schema parsing. Now, if a `type_id` is not recognized, `type_code` is set to `None` instead of raising an exception.

### Why are the changes needed?

Previously, when parsing Hive table schemas, the code attempted to map each `type_id` to a human-readable type name via `ttypes.TTypeId._VALUES_TO_NAMES[type_id]`. If Hive introduced an unknown or custom type (e.g. when using a non-standard version, or when pumping data from a totally different data source like *Oracle* into *Hive* databases), a `KeyError` was raised, interrupting the entire SQL query process. This patch adds a `try/except` block so that unrecognized `type_id`s set `type_code` to `None` instead of raising an error, letting the downstream user decide what to do rather than just receiving an exception. This makes schema inspection more robust and compatible with evolving Hive data types.
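
A minimal sketch of the pattern described above. The `VALUES_TO_NAMES` dictionary is a small hypothetical stand-in for the Thrift-generated `ttypes.TTypeId._VALUES_TO_NAMES` table, and `resolve_type_code` is an illustrative helper name, not a function from the patched module.

```python
# Hypothetical stand-in for the Thrift-generated id-to-name table (illustrative subset).
VALUES_TO_NAMES = {0: "BOOLEAN_TYPE", 3: "INT_TYPE", 7: "STRING_TYPE"}


def resolve_type_code(type_id):
    """Return the type name for a known type_id, or None for an unknown one."""
    try:
        return VALUES_TO_NAMES[type_id]
    except KeyError:
        # Unknown or custom type: let the caller decide how to handle it.
        return None
```

Callers then need to handle a `None` type code explicitly, for example by treating the column as an opaque value.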

### How was this patch tested?

The patch was tested by running schema inspection on tables containing both standard and unknown/custom Hive column types. For known types, parsing behaves as before. For unknown types, the parser sets `type_code` to `None` without raising an exception, and the rest of the process completes successfully. No unit test was added since this is an edge case dependent on unreachable or custom Hive types, but the patch was tested on typical use cases.

### Was this patch authored or co-authored using generative AI tooling?

No. 😂 It's a minor patch.

Closes #7048 from ZsgsDesign/patch-1.

Closes #7048

4d246d0ec [John Zhang] fix: handle KeyError when parsing Hive type_id mapping

Authored-by: John Zhang <zsgsdesign@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-04-29 10:41:16 +08:00
Wang, Fei
ecfca79328
[KYUUBI #7033] Treat YARN/Kubernetes application NOT_FOUND as failed to prevent data quality issue
### Why are the changes needed?

Currently, the NOT_FOUND application state is treated as a terminated but not failed state.

This might cause data quality issues if downstream applications depend on the batch state for data processing.

So, I think we should treat NOT_FOUND as a failed state instead.

Currently, we support 3 types of application manager.
1. [JpsApplicationOperation](https://github.com/apache/kyuubi/blob/master/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/JpsApplicationOperation.scala)
2. [YarnApplicationOperation](https://github.com/apache/kyuubi/blob/master/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/YarnApplicationOperation.scala)
3. [KubernetesApplicationOperation](https://github.com/apache/kyuubi/blob/master/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/KubernetesApplicationOperation.scala)

YarnApplicationOperation and KubernetesApplicationOperation are widely used in production.

And in multiple kyuubi instance mode, the NOT_FOUND case should rarely happen.
1.  7e199d6fdb/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala (L369-L385)

3. https://github.com/apache/kyuubi/pull/7029

So, I think we should treat NOT_FOUND as a failed state for production use cases.
It is better to fail some corner cases than to mistakenly set unsuccessful batches to the finished state.

### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7033 from turboFei/revist_not_found.

Closes #7033

ada4f8822 [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/ApplicationOperation.scala
985e23c24 [Wang, Fei] Refine
f03d61242 [Wang, Fei] comments
b9d6ac203 [Wang, Fei] incase the metadata updated by peer instance
3bd61ca85 [Wang, Fei] add
339df4730 [Wang, Fei] treat NOT_FOUND as failed

Lead-authored-by: Wang, Fei <fwang12@ebay.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-27 21:09:08 +08:00
Wang, Fei
02a6b13d77
[KYUUBI #7028] Persist the kubernetes application terminate state into metastore for app info store fallback
### Why are the changes needed?

1. Persist the Kubernetes application termination info into the metastore to prevent event loss.
2. If the application info cannot be obtained from the informer application info store, fall back to getting it from the metastore instead of returning NOT_FOUND directly.
3. This is critical because returning an incorrect application state might cause data quality issues.
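
The fallback order in the list above can be sketched as follows. This is a simplified Python illustration, not the Kyuubi implementation: both stores are modelled as plain dictionaries keyed by a hypothetical application tag, and `get_application_state` is an illustrative name.

```python
NOT_FOUND = "NOT_FOUND"


def get_application_state(tag: str, informer_store: dict, metadata_store: dict) -> str:
    # 1. Prefer the in-memory state populated from informer events.
    state = informer_store.get(tag)
    if state is not None:
        return state
    # 2. Fall back to the terminal state persisted in the metadata store.
    persisted = metadata_store.get(tag)
    if persisted is not None:
        return persisted
    # 3. Only report NOT_FOUND when neither store knows the application.
    return NOT_FOUND
```

With this ordering, an application whose informer event was lost but whose terminal state was persisted is still reported with its persisted state.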

### How was this patch tested?

UT and IT.

<img width="1917" alt="image" src="https://github.com/user-attachments/assets/306f417c-5037-4869-904d-dcf657ff8f60" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7029 from turboFei/kubernetes_state.

Closes #7028

9f2badef3 [Wang, Fei] generic dialect
186cc690d [Wang, Fei] nit
82ea62669 [Wang, Fei] Add pod name
4c59bebb5 [Wang, Fei] Refine
327a0d594 [Wang, Fei] Remove create_time from k8s engine info
12c24b1d0 [Wang, Fei] do not use MYSQL deprecated VALUES(col)
becf9d1a7 [Wang, Fei] insert or replace
d167623c1 [Wang, Fei] migration

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-27 01:37:27 -07:00
Wang, Fei
70c03ef998
[KYUUBI #7046] Bump dropwizard metrics version to 4.2.30
### Why are the changes needed?

Bump to the latest 4.2.x; the releases seem to be mainly dependency upgrades.

- https://github.com/dropwizard/metrics/releases/tag/v4.2.27
- https://github.com/dropwizard/metrics/releases/tag/v4.2.28
- https://github.com/dropwizard/metrics/releases/tag/v4.2.29
- https://github.com/dropwizard/metrics/releases/tag/v4.2.30

### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7046 from turboFei/bump_codahale.

Closes #7046

51d2a5522 [Wang, Fei] Bump codahale metrics version to 4.2.30

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-27 15:56:27 +08:00
Wang, Fei
4cbff4d192
[KYUUBI #7045] Expose jetty metrics
### Why are the changes needed?

Expose the Jetty metrics to help detect issues.
Refer: https://metrics.dropwizard.io/4.2.0/manual/jetty.html

### How was this patch tested?

<img width="1425" alt="image" src="https://github.com/user-attachments/assets/ac8c9a48-eaa1-48ee-afec-6f33980d4270" />

<img width="1283" alt="image" src="https://github.com/user-attachments/assets/c2fa444b-6337-4662-832b-3d02f206bd13" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7045 from turboFei/metrics_jetty.

Closes #7045

122b93f3d [Wang, Fei] metrics
45a73e7cd [Wang, Fei] metrics

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-25 00:02:56 -07:00
Wang, Fei
3e638b61c3
[KYUUBI #7044] Bump jetty version to 9.4.57.v20241219
### Why are the changes needed?

Bump to latest jetty 9.x, see https://github.com/jetty/jetty.project/releases/tag/jetty-9.4.57.v20241219

<img width="1222" alt="image" src="https://github.com/user-attachments/assets/8d2734ee-52f2-4ba9-b22a-660e3b202b7f" />

### How was this patch tested?

GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7044 from turboFei/jetty_version.

Closes #7044

740560dd0 [Wang, Fei] Upgrade jetty version

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-25 00:01:34 -07:00
Wang, Fei
29b6076319
[KYUUBI #7043] Support to construct the batch info from metadata directly
### Why are the changes needed?

Add an option to construct the batch info from metadata directly instead of redirecting the requests, to reduce the RPC latency.

### How was this patch tested?

Minor change and Existing GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7043 from turboFei/support_no_redirect.

Closes #7043

7f7a2fb80 [Wang, Fei] comments
bb0e324a1 [Wang, Fei] save

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-24 22:42:26 -07:00
Wang, Fei
75891d1a92
[KYUUBI #7034][FOLLOWUP] Decouple the kubernetes pod name and app name
### Why are the changes needed?

Followup for #7034 to fix the SparkOnKubernetesTestsSuite.

Sorry, I forgot that the appInfo name and pod name were tightly coupled before: the appInfo name was used as the pod name and used to delete the pod.

In this PR, we add `podName` to applicationInfo to separate the app name from the pod name.

### How was this patch tested?

GA should pass.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7039 from turboFei/fix_test.

Closes #7034

0ff7018d6 [Wang, Fei] revert
18e48c079 [Wang, Fei] comments
19f34bc83 [Wang, Fei] do not get pod name from appName
c1d308437 [Wang, Fei] reduce interval for test stability
50fad6bc5 [Wang, Fei] fix ut

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-24 22:40:28 -07:00
Wang, Fei
ba854d3c99
[KYUUBI #7035] Close the operation by operation manager to prevent operation leak
### Why are the changes needed?

Fix the operation leak that occurs when session initialization times out.

1. The operationHandle has not been added to the session's `opHandleSet` (super.runOperation).
2. `operation.close()` only closes the operation; it does not remove it from the `handleToOperation` map.
3. Closing the session does not remove the opHandle from `handleToOperation`, because it has not been added to the session's `opHandleSet`.

So here we can resolve the operation leak by invoking `operationManager.closeOperation` to remove the operation handle and close the session.
cc68cb4c85/kyuubi-server/src/main/scala/org/apache/kyuubi/session/KyuubiSessionImpl.scala (L235-L246)

cc68cb4c85/kyuubi-common/src/main/scala/org/apache/kyuubi/session/AbstractSession.scala (L100-L103)

cc68cb4c85/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/OperationManager.scala (L127-L130)

cc68cb4c85/kyuubi-common/src/main/scala/org/apache/kyuubi/session/AbstractSession.scala (L89-L92)

FYI: the operation was added to `handleToOperation` when the operation was created.
cc68cb4c85/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/KyuubiOperationManager.scala (L56-L64)
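
The leak can be illustrated with a simplified Python model of the bookkeeping. The classes below are hypothetical stand-ins and only mirror the `handleToOperation` map described in this PR; they are not Kyuubi's Scala classes.

```python
class Operation:
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # Closing an operation does not touch the manager's map.
        self.closed = True


class OperationManager:
    def __init__(self) -> None:
        # Plays the role of the handleToOperation map.
        self.handle_to_operation = {}

    def new_operation(self, handle: str) -> Operation:
        # The operation is registered as soon as it is created.
        operation = Operation()
        self.handle_to_operation[handle] = operation
        return operation

    def close_operation(self, handle: str) -> None:
        # Remove the handle from the map, then close the operation.
        operation = self.handle_to_operation.pop(handle, None)
        if operation is not None:
            operation.close()
```

In this model, calling `operation.close()` directly leaves the entry in `handle_to_operation`, whereas `close_operation(handle)` both removes the entry and closes the operation.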

### How was this patch tested?

Minor change.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7035 from turboFei/remove_op.

Closes #7035

3c376833a [Wang, Fei] close by op mgr

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-24 14:14:07 +08:00
Wang, Fei
9e8bdf51a2
[KYUUBI #7041][FOLLOWUP] Fix build for SparkOnKubernetesTestsSuite
### Why are the changes needed?

Fix build issue after #7041

### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7042 from turboFei/fix_build.

Closes #7041

d026bf554 [Wang, Fei] fix build

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-23 22:47:15 -07:00