Commit Graph

4348 Commits

Author SHA1 Message Date
Cheng Pan
fc654cfbd3
[KYUUBI #7063] Bump Kyuubi Shaded 0.5.0
### Why are the changes needed?

https://kyuubi.apache.org/shaded-release/0.5.0.html

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7063 from pan3793/kyuubi-shaded-0.5.0.

Closes #7063

b202a7c83 [Cheng Pan] Update pom.xml
417914529 [Cheng Pan] Bump Kyuubi Shaded 0.5.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-22 17:19:07 +08:00
davidyuan
abf947a7ac
[KYUUBI #7065] [#7066] Iceberg Support add partition field check
#7066
### Why are the changes needed?

Iceberg support is missing some checks; this PR adds the check for adding a partition field.

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

Closes #7065 from davidyuan1223/icerberg_authz.

Closes #7065

be2684671 [davidyuan] update
231ed3356 [davidyuan] sort spi file
6d2a5bf20 [davidyuan] sort spi file
bc21310cc [davidyuan] update
52ca367f1 [davidyuan] update

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-05-20 14:53:52 +08:00
Cheng Pan
e366b0950f
[KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0
### Why are the changes needed?

There were some breaking changes after we fixed compatibility for Spark 4.0.0 RC1 in #6920, but Spark has now reached 4.0.0 RC6, which is less likely to receive further breaking changes.

### How was this patch tested?

Changes are extracted from https://github.com/apache/kyuubi/pull/6928, which passed CI with Spark 4.0.0 RC6

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7061 from pan3793/6920-followup.

Closes #6920

17a1bd9e5 [Cheng Pan] [KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-16 11:47:35 +08:00
Cheng Pan
4df682bf83
[KYUUBI #7062] Bump Delta Lake 3.3.1
### Why are the changes needed?

https://github.com/delta-io/delta/releases/tag/v3.3.1

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7062 from pan3793/delta-3.3.1.

Closes #7062

0fc1df8f9 [Cheng Pan] Bump DeltaLake 3.3.1

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-15 22:29:31 +08:00
Fei Wang
2ec8b46f02
[KYUUBI #7055] Support to filter out server only configs with prefixes
### Why are the changes needed?

To filter out server only configs with prefixes.

For some Kyuubi configs, there is no corresponding ConfigEntry defined, so we cannot filter them out and have to pass them on to the engine end.

For example:
```
kyuubi.kubernetes.28.master.address=k8s://master
kyuubi.backend.server.event.kafka.broker=localhost:9092
kyuubi.metadata.store.jdbc.driver=com.mysql.cj.jdbc.Driver
kyuubi.metadata.store.jdbc.datasource.maximumPoolSize=600
kyuubi.metadata.store.jdbc.datasource.minimumIdle=100
kyuubi.metadata.store.jdbc.datasource.idleTimeout=60000
```

This PR supports excluding them by setting:
```
kyuubi.config.server.only.prefixes=kyuubi.backend.server.event.kafka.,kyuubi.metadata.store.jdbc.datasource.,kyuubi.kubernetes.28.
```
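
The prefix-based exclusion described above can be sketched as follows. This is a minimal Python illustration, not Kyuubi's Scala implementation; the helper name `filter_engine_configs`, the in-memory dict, and the `kyuubi.engine.share.level` sample entry are assumptions made for the example.

```python
def filter_engine_configs(configs, server_only_prefixes):
    """Return the configs that may be passed on to the engine.

    Hypothetical helper: any key starting with one of the comma-separated
    server-only prefixes is dropped; everything else is kept unchanged.
    """
    prefixes = tuple(
        p.strip() for p in server_only_prefixes.split(",") if p.strip()
    )
    return {k: v for k, v in configs.items() if not k.startswith(prefixes)}


# Sample input assumed for illustration only.
configs = {
    "kyuubi.kubernetes.28.master.address": "k8s://master",
    "kyuubi.backend.server.event.kafka.broker": "localhost:9092",
    "kyuubi.metadata.store.jdbc.datasource.maximumPoolSize": "600",
    "kyuubi.engine.share.level": "USER",
}
prefixes = (
    "kyuubi.backend.server.event.kafka.,"
    "kyuubi.metadata.store.jdbc.datasource.,"
    "kyuubi.kubernetes.28."
)
engine_configs = filter_engine_configs(configs, prefixes)
```

Only the key that matches none of the prefixes remains in `engine_configs`.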

### How was this patch tested?

UT
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7055 from turboFei/server_only_configs.

Closes #7055

6c804ff91 [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
bd391a664 [Wang, Fei] exclude

Lead-authored-by: Fei Wang <fwang12@ebay.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-05-11 22:34:46 -07:00
Igor Khrol
61487acfa0
[KYUUBI #7051] Fix usage without sslTrustStore of JDBC driver
### Why are the changes needed?

If `sslTrustStore` is not provided `org.apache.hadoop.conf.Configuration` class existence becomes a hard dependency.
This makes jdbc client too complex to configure: extra Hadoop jars should be provided.

`hadoopCredentialProviderAvailable` variable is useless in the previous implementation logic because it's always `true` or the code is not reachable.

<img width="898" alt="Screenshot 2025-05-09 at 13 05 12" src="https://github.com/user-attachments/assets/6d202555-38c6-40d2-accb-eb78a3d4184e" />

### How was this patch tested?

Build jar and used it to connect from DataGrip.
<img width="595" alt="Screenshot 2025-05-09 at 13 01 29" src="https://github.com/user-attachments/assets/c6e4d904-a3dd-4d3f-9bdd-8bb47ed1e834" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7058 from Khrol/master.

Closes #7051

b594757a0 [Igor Khrol] JDBC driver: allow usage without sslTrustStore

Authored-by: Igor Khrol <khroliz@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-12 12:57:27 +08:00
Wang, Fei
2dd33e333e
[KYUUBI #7054] Add server only flag for more server/credentials/frontend/metadata configs
### Why are the changes needed?

Reduce the number of Kyuubi server-side configs passed to the engine end.

### How was this patch tested?

UT and code review.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7054 from turboFei/server_only.

Closes #7054

d5855a5db [Wang, Fei] revert kubernetes
b253c336b [Wang, Fei] init

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-05-11 21:15:16 -07:00
taylor.fan
127c736a8f
[KYUUBI #6926] Add SERVER_LOCAL engine share level
### Why are the changes needed?

As clarified in https://github.com/apache/kyuubi/issues/6926, there are scenarios in which users want to launch an engine on each Kyuubi server. The SERVER_LOCAL engine share level implements this by using the local host address as the subdomain, so that each Kyuubi server's engine is unique.

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7013 from taylor12805/share_level_server_local.

Closes #6926

ba201bb72 [taylor.fan] [KYUUBI #6926] update format
42f0a4f7d [taylor.fan] [KYUUBI #6926] move host address to subdomain
e06de79ad [taylor.fan] [KYUUBI #6926] Add SERVER_LOCAL engine share level

Authored-by: taylor.fan <taylor.fan@vipshop.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-04-29 10:42:50 +08:00
John Zhang
c19d923b85
[KYUUBI #7048] Fix KeyError when parsing unknown Hive type_id in schema inspection
This patch adds a `try/except` block to prevent a `KeyError` when mapping an unknown `type_id` during Hive schema parsing. Now, if a `type_id` is not recognized, `type_code` is set to `None` instead of raising an exception.

### Why are the changes needed?

Previously, when parsing Hive table schemas, the code attempted to map each `type_id` to a human-readable type name via `ttypes.TTypeId._VALUES_TO_NAMES[type_id]`. If Hive introduced an unknown or custom type (e.g. when using a non-standard version, or when data is pumped from a totally different data source such as *Oracle* into *Hive* databases), a `KeyError` was raised, interrupting the entire SQL query process. This patch adds a `try/except` block so that an unrecognized `type_id` sets `type_code` to `None` instead of raising an error, letting the downstream user decide what to do rather than just receiving an exception. This makes schema inspection more robust and compatible with evolving Hive data types.
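
The lookup-with-fallback described above can be sketched like this. The `VALUES_TO_NAMES` dict is a small hypothetical stand-in for `ttypes.TTypeId._VALUES_TO_NAMES`, and `resolve_type_code` is a name invented for the example; this is not the patched function itself.

```python
# Hypothetical stand-in for the generated Thrift mapping; the real table
# has many more entries.
VALUES_TO_NAMES = {0: "BOOLEAN_TYPE", 3: "INT_TYPE", 7: "STRING_TYPE"}


def resolve_type_code(type_id):
    """Map a numeric type_id to its name, or None if the id is unknown."""
    try:
        return VALUES_TO_NAMES[type_id]
    except KeyError:
        # Unknown or custom type: let the caller decide what to do.
        return None


known = resolve_type_code(3)
unknown = resolve_type_code(999)
```

Returning `None` keeps the rest of the schema usable; a caller that needs strictness can still raise on `None` itself.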

### How was this patch tested?

The patch was tested by running schema inspection on tables containing both standard and unknown/custom Hive column types. For known types, parsing behaves as before. For unknown types, the parser sets `type_code` to `None` without raising an exception, and the rest of the process completes successfully. No unit test was added since this is an edge case dependent on unreachable or custom Hive types, but was tested on typical use cases.

### Was this patch authored or co-authored using generative AI tooling?

No. 😂 It's a minor patch.

Closes #7048 from ZsgsDesign/patch-1.

Closes #7048

4d246d0ec [John Zhang] fix: handle KeyError when parsing Hive type_id mapping

Authored-by: John Zhang <zsgsdesign@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-04-29 10:41:16 +08:00
Wang, Fei
ecfca79328
[KYUUBI #7033] Treat YARN/Kubernetes application NOT_FOUND as failed to prevent data quality issue
### Why are the changes needed?

Currently, the NOT_FOUND application state is treated as a terminated but not failed state.

It might cause data quality issues if a downstream application depends on the batch state for data processing.

So, I think we should treat NOT_FOUND as a failed state instead.

Currently, we support 3 types of application manager.
1. [JpsApplicationOperation](https://github.com/apache/kyuubi/blob/master/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/JpsApplicationOperation.scala)
2. [YarnApplicationOperation](https://github.com/apache/kyuubi/blob/master/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/YarnApplicationOperation.scala)
3. [KubernetesApplicationOperation](https://github.com/apache/kyuubi/blob/master/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/KubernetesApplicationOperation.scala)

YarnApplicationOperation and KubernetesApplicationOperation are widely used in production.

And in multiple kyuubi instance mode, the NOT_FOUND case should rarely happen.
1.  7e199d6fdb/kyuubi-server/src/main/scala/org/apache/kyuubi/server/api/v1/BatchesResource.scala (L369-L385)

3. https://github.com/apache/kyuubi/pull/7029

So, I think we should treat NOT_FOUND as a failed state in production use case.
It is better to fail some corner cases than to mistakenly set unsuccessful batches to the finished state.
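
The effect of the reclassification can be sketched as below. The `AppState` enum is a simplified, hypothetical subset of the application states, and the helper names are assumptions for illustration; the sketch only shows why a consumer that checks "terminated and not failed" stops treating NOT_FOUND as success.

```python
from enum import Enum


class AppState(Enum):
    # Simplified, hypothetical subset of application states.
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    NOT_FOUND = "NOT_FOUND"


TERMINATED_STATES = {AppState.FINISHED, AppState.FAILED, AppState.NOT_FOUND}
# After the change described above, NOT_FOUND is counted as a failure.
FAILED_STATES = {AppState.FAILED, AppState.NOT_FOUND}


def is_terminated(state):
    return state in TERMINATED_STATES


def is_failed(state):
    return state in FAILED_STATES


def batch_succeeded(state):
    """A batch counts as successful only if it terminated without failing."""
    return is_terminated(state) and not is_failed(state)
```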

### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7033 from turboFei/revist_not_found.

Closes #7033

ada4f8822 [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/ApplicationOperation.scala
985e23c24 [Wang, Fei] Refine
f03d61242 [Wang, Fei] comments
b9d6ac203 [Wang, Fei] incase the metadata updated by peer instance
3bd61ca85 [Wang, Fei] add
339df4730 [Wang, Fei] treat NOT_FOUND as failed

Lead-authored-by: Wang, Fei <fwang12@ebay.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-27 21:09:08 +08:00
Wang, Fei
02a6b13d77
[KYUUBI #7028] Persist the kubernetes application terminate state into metastore for app info store fallback
### Why are the changes needed?

1. Persist the Kubernetes application termination info into the metastore to prevent event loss.
2. If the application info cannot be found in the informer application info store, fall back to the metastore instead of returning NOT_FOUND directly.
3. This is critical because returning a wrong application state might cause data quality issues.
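
The fallback order in the list above can be sketched as follows. Both stores are modeled as plain dicts keyed by a tag, and `get_application_state` is a hypothetical name; the real stores and their APIs are not shown in this message.

```python
def get_application_state(tag, informer_store, metadata_store):
    """Look up the state in the informer store, then in the metastore.

    Hypothetical sketch: NOT_FOUND is returned only when neither store
    knows the application.
    """
    state = informer_store.get(tag)
    if state is not None:
        return state
    persisted = metadata_store.get(tag)
    if persisted is not None:
        return persisted
    return "NOT_FOUND"


informer_store = {"tag-live": "RUNNING"}
metadata_store = {"tag-gone": "FINISHED"}
```

With these sample stores, `tag-gone` resolves to the persisted `FINISHED` state even though the informer no longer tracks it.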

### How was this patch tested?

UT and IT.

<img width="1917" alt="image" src="https://github.com/user-attachments/assets/306f417c-5037-4869-904d-dcf657ff8f60" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7029 from turboFei/kubernetes_state.

Closes #7028

9f2badef3 [Wang, Fei] generic dialect
186cc690d [Wang, Fei] nit
82ea62669 [Wang, Fei] Add pod name
4c59bebb5 [Wang, Fei] Refine
327a0d594 [Wang, Fei] Remove create_time from k8s engine info
12c24b1d0 [Wang, Fei] do not use MYSQL deprecated VALUES(col)
becf9d1a7 [Wang, Fei] insert or replace
d167623c1 [Wang, Fei] migration

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-27 01:37:27 -07:00
Wang, Fei
70c03ef998
[KYUUBI #7046] Bump dropwizard metrics version to 4.2.30
### Why are the changes needed?

Bump to use the latest 4.2.x, seems mainly for dependency upgrading.

- https://github.com/dropwizard/metrics/releases/tag/v4.2.27
- https://github.com/dropwizard/metrics/releases/tag/v4.2.28
- https://github.com/dropwizard/metrics/releases/tag/v4.2.29
- https://github.com/dropwizard/metrics/releases/tag/v4.2.30

### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7046 from turboFei/bump_codahale.

Closes #7046

51d2a5522 [Wang, Fei] Bump codahale metrics version to 4.2.30

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-27 15:56:27 +08:00
Wang, Fei
4cbff4d192
[KYUUBI #7045] Expose jetty metrics
### Why are the changes needed?

Expose the Jetty metrics to help detect issues.
Refer: https://metrics.dropwizard.io/4.2.0/manual/jetty.html

### How was this patch tested?

<img width="1425" alt="image" src="https://github.com/user-attachments/assets/ac8c9a48-eaa1-48ee-afec-6f33980d4270" />

<img width="1283" alt="image" src="https://github.com/user-attachments/assets/c2fa444b-6337-4662-832b-3d02f206bd13" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7045 from turboFei/metrics_jetty.

Closes #7045

122b93f3d [Wang, Fei] metrics
45a73e7cd [Wang, Fei] metrics

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-25 00:02:56 -07:00
Wang, Fei
3e638b61c3
[KYUUBI #7044] Bump jetty version to 9.4.57.v20241219
### Why are the changes needed?

Bump to latest jetty 9.x, see https://github.com/jetty/jetty.project/releases/tag/jetty-9.4.57.v20241219

<img width="1222" alt="image" src="https://github.com/user-attachments/assets/8d2734ee-52f2-4ba9-b22a-660e3b202b7f" />

### How was this patch tested?

GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7044 from turboFei/jetty_version.

Closes #7044

740560dd0 [Wang, Fei] Upgrade jetty version

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-25 00:01:34 -07:00
Wang, Fei
29b6076319
[KYUUBI #7043] Support to construct the batch info from metadata directly
### Why are the changes needed?

Add an option to construct the batch info from metadata directly instead of redirecting the requests, to reduce the RPC latency.

### How was this patch tested?

Minor change and Existing GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7043 from turboFei/support_no_redirect.

Closes #7043

7f7a2fb80 [Wang, Fei] comments
bb0e324a1 [Wang, Fei] save

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-24 22:42:26 -07:00
Wang, Fei
75891d1a92
[KYUUBI #7034][FOLLOWUP] Decouple the kubernetes pod name and app name
### Why are the changes needed?

Followup for #7034 to fix the SparkOnKubernetesTestsSuite.

Sorry, I forgot that the appInfo name and pod name were tightly coupled before: the appInfo name was used as the pod name and used to delete the pod.

In this PR, we add `podName` into applicationInfo to separate app name and pod name.

### How was this patch tested?

GA should pass.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7039 from turboFei/fix_test.

Closes #7034

0ff7018d6 [Wang, Fei] revert
18e48c079 [Wang, Fei] comments
19f34bc83 [Wang, Fei] do not get pod name from appName
c1d308437 [Wang, Fei] reduce interval for test stability
50fad6bc5 [Wang, Fei] fix ut

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-24 22:40:28 -07:00
Wang, Fei
ba854d3c99
[KYUUBI #7035] Close the operation by operation manager to prevent operation leak
### Why are the changes needed?

To fix the operation leak that occurs when session initialization times out.

1. The operationHandle has not been added to the session's `opHandleSet` (super.runOperation).
2. `operation.close()` only closes the operation; it does not remove it from the `handleToOperation` map.
3. Closing the session does not remove the opHandle from `handleToOperation`, because it was never added to the session's `opHandleSet`.

So here we can resolve the operation leak by invoking `operationManager.closeOperation` to remove the operation handle and close the session.
cc68cb4c85/kyuubi-server/src/main/scala/org/apache/kyuubi/session/KyuubiSessionImpl.scala (L235-L246)

cc68cb4c85/kyuubi-common/src/main/scala/org/apache/kyuubi/session/AbstractSession.scala (L100-L103)

cc68cb4c85/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/OperationManager.scala (L127-L130)

cc68cb4c85/kyuubi-common/src/main/scala/org/apache/kyuubi/session/AbstractSession.scala (L89-L92)

FYI: the operation is added to `handleToOperation` when the new operation is created.
cc68cb4c85/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/KyuubiOperationManager.scala (L56-L64)
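
The leak described above can be reproduced in miniature. The classes below are hypothetical Python stand-ins, not the Kyuubi classes linked above; they only show the difference between closing an operation directly and closing it through the manager that registered it.

```python
class Operation:
    def __init__(self, handle):
        self.handle = handle
        self.closed = False

    def close(self):
        # Closing releases the operation itself but knows nothing about
        # any registry that still references it.
        self.closed = True


class OperationManager:
    def __init__(self):
        self.handle_to_operation = {}

    def new_operation(self, handle):
        # The operation is registered as soon as it is created.
        op = Operation(handle)
        self.handle_to_operation[handle] = op
        return op

    def close_operation(self, handle):
        # Remove the registry entry first, then close the operation.
        op = self.handle_to_operation.pop(handle, None)
        if op is not None:
            op.close()


manager = OperationManager()
leaked = manager.new_operation("op-1")
leaked.close()                       # entry "op-1" is still registered
cleaned = manager.new_operation("op-2")
manager.close_operation("op-2")      # entry "op-2" is removed
```

After this runs, `op-1` is closed but still held by the map, which is the leak; `op-2` is both closed and unregistered.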

### How was this patch tested?

Minor change.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7035 from turboFei/remove_op.

Closes #7035

3c376833a [Wang, Fei] close by op mgr

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-24 14:14:07 +08:00
Wang, Fei
9e8bdf51a2
[KYUUBI #7041][FOLLOWUP] Fix build for SparkOnKubernetesTestsSuite
### Why are the changes needed?

Fix build issue after #7041

### How was this patch tested?

GA.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7042 from turboFei/fix_build.

Closes #7041

d026bf554 [Wang, Fei] fix build

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-23 22:47:15 -07:00
Wang, Fei
ee677a6feb
[KYUUBI #7041] Fix NPE when getting metadtamanager in KubernetesApplicationOperation
### Why are the changes needed?

To fix NPE.

Previously, we used the method below to get `metadataManager`.
```
private def metadataManager = KyuubiServer.kyuubiServer.backendService
    .sessionManager.asInstanceOf[KyuubiSessionManager].metadataManager
```
But before the Kyuubi server has fully restarted, `KyuubiServer.kyuubiServer` is null and might cause an NPE during the batch recovery phase.

For example:

```
:2025-04-23 14:06:24.040 ERROR [KyuubiSessionManager-exec-pool: Thread-231] org.apache.kyuubi.engine.KubernetesApplicationOperation: Failed to get application by label: kyuubi-unique-tag=95116703-4240-4cc1-9886-ccae3a2ac879, due to Cannot invoke "org.apache.kyuubi.server.KyuubiServer.backendService()" because the return value of "org.apache.kyuubi.server.KyuubiServer$.kyuubiServer()" is null
```

### How was this patch tested?

Existing GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7041 from turboFei/fix_NPE.

Closes #7041

064d88707 [Wang, Fei] Fix NPE

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-23 20:20:39 -07:00
Wang, Fei
f0c31e2f78
[KYUUBI #6828][FOLLOWUP] Fix NPE in KyuubiBaseResultSet::getBigDecimal
### Why are the changes needed?

It was missed in #6828
733d4f0901/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java (L151-L159)

### How was this patch tested?

Minor change.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7038 from turboFei/fixNPE.

Closes #6828

2785e97be [Wang, Fei] Fix NPE

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-22 20:46:43 -07:00
Cheng Pan
6da0e62baf
[KYUUBI #7036] [DOCS] Improve docs for kyuubi-extension-spark-jdbc-dialect
### Why are the changes needed?

This PR removes the page https://kyuubi.readthedocs.io/en/v1.10.1/client/python/pyspark.html and merges most of its content into https://kyuubi.readthedocs.io/en/v1.10.1/extensions/engines/spark/jdbc-dialect.html; some original content of the latter is also modified.

The current docs are misleading. I have been asked several times by users why accessing data stored in a Hive warehouse is so slow when they follow the [Kyuubi PySpark docs](https://kyuubi.readthedocs.io/en/v1.10.1/client/python/pyspark.html).

Actually, accessing HiveServer2/STS from the Spark JDBC data source is discouraged by the Spark community, see [SPARK-47482](https://github.com/apache/spark/pull/45609), even though it is technically feasible.

### How was this patch tested?

It's a docs-only change, review is required.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7036 from pan3793/jdbc-ds-docs.

Closes #7036

c00ce0706 [Cheng Pan] style
f2676bd23 [Cheng Pan] [DOCS] Improve docs for kyuubi-extension-spark-jdbc-dialect

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-23 11:09:29 +08:00
Wang, Fei
cc68cb4c85
[KYUUBI #7034] [KUBERNETES] Prefer to use pod spark-app-name label as application name than pod name
### Why are the changes needed?

After https://github.com/apache/spark/pull/34460 (Since Spark 3.3.0), the `spark-app-name` is available.

We shall use it as the application name if it exists.

### How was this patch tested?

Minor change.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7034 from turboFei/k8s_app_name.

Closes #7034

bfa88a436 [Wang, Fei] Get pod app name

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-16 19:28:04 -07:00
Wang, Fei
7e199d6fdb
[KYUUBI #7025] [KYUUBI #6686][FOLLOWUP] Prefer terminated container app state than terminated pod state
### Why are the changes needed?

I found the following for a Kyuubi batch on Kubernetes:

1. It had `FINISHED`.
2. I then deleted the pod manually and checked the k8s-audit.log; the appState had become `FAILED`.

```
2025-04-15 11:16:30.453 INFO [-675216314-pool-44-thread-839] org.apache.kyuubi.engine.KubernetesApplicationAuditLogger: label=61e7d8c1-e5a9-46cd-83e7-c611003f0224     context=97      namespace=dls-prod      pod=kyuubi-spark-61e7d8c1-e5a9-46cd-83e7-c611003f0224-driver podState=Running        containers=[microvault->ContainerState(running=ContainerStateRunning(startedAt=2025-04-15T18:13:48Z, additionalProperties={}), terminated=null, waiting=null, additionalProperties={}),spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=containerd://72704f8e7ccb5e877c8f6b10bf6ad810d0c019e07e0cb5975be733e79762c1ec, exitCode=0, finishedAt=2025-04-15T18:14:22Z, message=null, reason=Completed, signal=null, startedAt=2025-04-15T18:13:49Z, additionalProperties={}), waiting=null, additionalProperties={})]   appId=spark-228c62e0dc37402bacac189d01b871e4    appState=FINISHED       appError=''
:2025-04-15 11:16:30.854 INFO [-675216314-pool-44-thread-840] org.apache.kyuubi.engine.KubernetesApplicationAuditLogger: label=61e7d8c1-e5a9-46cd-83e7-c611003f0224     context=97      namespace=dls-prod      pod=kyuubi-spark-61e7d8c1-e5a9-46cd-83e7-c611003f0224-driver podState=Failed containers=[microvault->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=containerd://91654e3ee74e2c31218e14be201b50a4a604c2ad15d3afd84dc6f620e59894b7, exitCode=2, finishedAt=2025-04-15T18:16:30Z, message=null, reason=Error, signal=null, startedAt=2025-04-15T18:13:48Z, additionalProperties={}), waiting=null, additionalProperties={}),spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=containerd://72704f8e7ccb5e877c8f6b10bf6ad810d0c019e07e0cb5975be733e79762c1ec, exitCode=0, finishedAt=2025-04-15T18:14:22Z, message=null, reason=Completed, signal=null, startedAt=2025-04-15T18:13:49Z, additionalProperties={}), waiting=null, additionalProperties={})]    appId=spark-228c62e0dc37402bacac189d01b871e4    appState=FAILED appError='{
```

This PR is a followup for #6690, which ignores the container state if the pod is terminated.

It is more reasonable to respect the terminated container state than the terminated pod state.
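
The preference described above can be sketched as a small decision function. The function name, the phase-to-state table, and the use of the exit code alone to decide FINISHED versus FAILED are assumptions for illustration; the actual mapping in `KubernetesApplicationOperation` may consider more fields.

```python
# Assumed mapping from pod phase to application state, used only when the
# application container has not terminated yet.
POD_PHASE_TO_STATE = {
    "Pending": "PENDING",
    "Running": "RUNNING",
    "Succeeded": "FINISHED",
    "Failed": "FAILED",
}


def app_state(pod_phase, app_container_exit_code=None):
    """Prefer the terminated app container's result over the pod phase.

    `app_container_exit_code` is None while the container is not terminated.
    """
    if app_container_exit_code is not None:
        return "FINISHED" if app_container_exit_code == 0 else "FAILED"
    return POD_PHASE_TO_STATE.get(pod_phase, "UNKNOWN")
```

This matches the scenario in the logs above: the pod ends up `Failed` because a sidecar exited with an error, but the driver container completed with exit code 0, so the application is still reported as `FINISHED`.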

### How was this patch tested?

Integration testing.

```
:2025-04-15 13:53:24.551 INFO [-1077768163-pool-36-thread-3] org.apache.kyuubi.engine.KubernetesApplicationAuditLogger: eventType=DELETE	label=e0eb4580-3cfa-43bf-bdcc-efeabcabc93c	context=97	namespace=dls-prod	pod=kyuubi-spark-e0eb4580-3cfa-43bf-bdcc-efeabcabc93c-driver	podState=Failed	containers=[microvault->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=containerd://66c42206730950bd422774e3c1b0f426d7879731788cea609bbfe0daab24a763, exitCode=2, finishedAt=2025-04-15T20:53:22Z, message=null, reason=Error, signal=null, startedAt=2025-04-15T20:52:00Z, additionalProperties={}), waiting=null, additionalProperties={}),spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=containerd://9179a73d9d9e148dcd9c13ee6cc29dc3e257f95a33609065e061866bb611cb3b, exitCode=0, finishedAt=2025-04-15T20:52:28Z, message=null, reason=Completed, signal=null, startedAt=2025-04-15T20:52:01Z, additionalProperties={}), waiting=null, additionalProperties={})]	appId=spark-578df0facbfd4958a07f8d1ae79107dc	appState=FINISHED	appError=''
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7025 from turboFei/container_terminated.

Closes #7025

Closes #6686

a3b2a5a56 [Wang, Fei] comments
4356d1bc9 [Wang, Fei] fix the app state logical

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-16 10:12:10 -07:00
Kent Yao
0ae158ecb1
[KYUUBI #7032] Remove Umbrella/Subtask issue template
### Why are the changes needed?

GitHub now provides sub-task creation and auto-linking features, and they are more advanced.

### How was this patch tested?
https://github.com/apache/kyuubi/issues/7030

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #7032 from yaooqinn/sb.

Closes #7032

3fd6ccd90 [Kent Yao] remove more
cbf691033 [Kent Yao] Remove Umbrella/Subtask issue template

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
2025-04-16 16:30:11 +08:00
Wang, Fei
82e1673cae
[KYUUBI #7026] Audit the kubernetes pod event type and fix DELETE event process logical
### Why are the changes needed?

1. Audit the kubernetes resource event type.
2. Fix the processing logic for the DELETE event.

Before this pr:

I deleted the pod manually and saw that Kyuubi reported `appState=PENDING`.
```
:2025-04-15 13:58:20.320 INFO [-1077768163-pool-36-thread-7] org.apache.kyuubi.engine.KubernetesApplicationAuditLogger: eventType=DELETE	label=3c58e9fd-cf8c-4cc3-a9aa-82ae40e200d8	context=97	namespace=dls-prod	pod=kyuubi-spark-3c58e9fd-cf8c-4cc3-a9aa-82ae40e200d8-driver	podState=Pending	containers=[]	appId=spark-cd125bbd9fc84ffcae6d6b5d41d4d8ad	appState=PENDING	appError=''
```

It seems that the pod status in the event is a snapshot taken before the pod was deleted.

After that we receive no further events for this pod, and the batch finally ends up FINISHED with application `NOT_FOUND`.

<img width="1389" alt="image" src="https://github.com/user-attachments/assets/5df03db6-0924-4a58-9538-b196fbf87f32" />

It seems we need to handle the DELETE event specially:

1. Get the app state from the pod/container states.
2. If the resulting applicationState is terminated, return it directly.
3. Otherwise, the applicationState should be FAILED, as the pod has been deleted.
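
The three steps can be sketched as follows. The set of terminated states and the function name are assumptions for illustration; step 1 (deriving the state from the pod and container statuses) is taken as an input here.

```python
# Assumed set of terminal application states for this sketch.
TERMINATED = {"FINISHED", "FAILED", "KILLED"}


def state_on_delete(state_from_pod):
    """Resolve the application state when a DELETE event arrives.

    `state_from_pod` is the state derived from the pod/container statuses
    carried by the event (step 1). A terminal state is kept as-is (step 2);
    anything else becomes FAILED because the pod no longer exists (step 3).
    """
    if state_from_pod in TERMINATED:
        return state_from_pod
    return "FAILED"
```

For the audit log line shown earlier, where the deleted pod was still `PENDING`, this yields `FAILED` instead of leaving the batch waiting for events that never arrive.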

### How was this patch tested?

<img width="1614" alt="image" src="https://github.com/user-attachments/assets/11e64c6f-ad53-4485-b8d2-a351bb23e8ca" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7026 from turboFei/k8s_audit.

Closes #7026

4e5695d34 [Wang, Fei] for delete
c16757218 [Wang, Fei] audit the pod event type

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-15 22:37:12 -07:00
Wang, Fei
4fc201e85d
[KYUUBI #7027] Support to initialize kubernetes clients on kyuubi server startup
### Why are the changes needed?

This ensures the Kyuubi server is promptly informed of any Kubernetes resource changes after startup. It is highly recommended to enable it in multiple Kyuubi instances mode.

### How was this patch tested?

Existing GA and Integration testing.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7027 from turboFei/k8s_client_init.

Closes #7027

393b9960a [Wang, Fei] server only
a640278c4 [Wang, Fei] refresh

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-15 22:36:16 -07:00
Wang, Fei
fa99183354
[KYUUBI #7023] Upgrade kubernetes client version to 6.13.5
### Why are the changes needed?

Upgrade the kubernetes client, https://github.com/fabric8io/kubernetes-client/releases/tag/v6.13.5

### How was this patch tested?

GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7023 from turboFei/k8s_client.

Closes #7023

3e3ac634f [Wang, Fei] 6.16.5
df5aa011f [Wang, Fei] upgrade

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-15 11:47:13 -07:00
Cheng Pan
647a7139ce
[KYUUBI #7022] Update announcement mail template to contain download links
### Why are the changes needed?

To satisfy ASF policy

https://lists.apache.org/thread/89jb1kp77wcv16tph8qlbf5k0fscyz9l

### How was this patch tested?

Review

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7022 from pan3793/ann-tmpl.

Closes #7022

7fa64b163 [Cheng Pan] Update announcement mail template to contain download links

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-14 19:56:09 +08:00
Wang, Fei
ec074fd202
[KYUUBI #7015] Record the session disconnected info into kyuubi session event
### Why are the changes needed?

Currently, if the connection between the client and the Kyuubi session is dropped without being closed properly, it is difficult to debug, and we have to check the Kyuubi server log.

It would be better to record this kind of information in the Kyuubi session event.
### How was this patch tested?

IT.

<img width="1264" alt="image" src="https://github.com/user-attachments/assets/d2c5b6d0-6298-46ec-9b73-ce648551120c" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7015 from turboFei/disconnect.

Closes #7015

c95709284 [Wang, Fei] do not post
e46521410 [Wang, Fei] nit
bca7f9b7e [Wang, Fei] post
1cf6f8f49 [Wang, Fei] disconnect

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-10 00:09:33 -07:00
Wang, Fei
0cc52d035c
[KYUUBI #7017] Using mutable JettyServer uri to prevent batch kyuubi instance mismatch
### Why are the changes needed?

To fix the issue where the batch Kyuubi instance port is negative.
<img width="697" alt="image" src="https://github.com/user-attachments/assets/ef992390-8d20-44b3-8640-35496caff85d" />

It happens after I stop the Kyuubi service.
We should use a variable instead of a function for the Jetty server's serverUri.
After the server connector is stopped, the localPort would be `-2`.

![image](https://github.com/user-attachments/assets/5152293d-9c2c-4979-bdcb-322f02928813)
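
The difference between a stored URI and one recomputed on every call can be sketched as below. `FakeConnector` and `Server` are hypothetical stand-ins, not the Jetty or Kyuubi classes; the only behavior taken from the description above is that the connector reports `-2` as its local port once stopped.

```python
class FakeConnector:
    """Hypothetical connector that reports -2 as its port once stopped."""

    def __init__(self, port):
        self._port = port
        self._running = False

    def start(self):
        self._running = True

    def stop(self):
        self._running = False

    @property
    def local_port(self):
        return self._port if self._running else -2


class Server:
    def __init__(self, host, connector):
        self.host = host
        self.connector = connector
        self.server_uri = None  # mutable field, set once at start

    def start(self):
        self.connector.start()
        # Capture the URI while the connector is still bound.
        self.server_uri = f"{self.host}:{self.connector.local_port}"

    def stop(self):
        self.connector.stop()

    def computed_uri(self):
        # Recomputed on every call, so it changes after stop().
        return f"{self.host}:{self.connector.local_port}"


server = Server("localhost", FakeConnector(10099))
server.start()
server.stop()
```

After `stop()`, the stored `server_uri` still holds the address that was actually served, while `computed_uri()` reflects the stopped connector.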

### How was this patch tested?

Existing UT.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7017 from turboFei/server_port_negative.

Closes #7017

3d34c4031 [Wang, Fei] warn
e58298646 [Wang, Fei] mutable server uri
2cbaf772a [Wang, Fei] Revert "hard code the server uri"
b64d91b32 [Wang, Fei] hard code the server uri

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-09 10:27:09 -07:00
Wang, Fei
d6f07a6b64
[KYUUBI #7011] Set kyuubi session engine client after opening engine session successfully
### Why are the changes needed?

Since https://github.com/apache/kyuubi/pull/3618
Kyuubi server could retry opening the engine when encountering a special error.
1937dd93f9/kyuubi-server/src/main/scala/org/apache/kyuubi/session/KyuubiSessionImpl.scala (L177-L212)

The `_client` might be reset and closed.

So we should set `_client` only after the engine session has been opened successfully, as the `client` method is a public method.
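
The idea of publishing the client only after a successful open can be sketched as below. `EngineSession` and `FlakyClient` are hypothetical names, and retrying on `ConnectionError` is an assumption standing in for the "special error" mentioned above; this is not the `KyuubiSessionImpl` code.

```python
class EngineSession:
    """Hypothetical session that publishes its client only after success."""

    def __init__(self, new_client, max_attempts=3):
        self._new_client = new_client      # factory returning a fresh client
        self._max_attempts = max_attempts
        self._client = None                # stays None until an open succeeds

    @property
    def client(self):
        return self._client

    def open_engine_session(self):
        last_error = None
        for _ in range(self._max_attempts):
            candidate = self._new_client()
            try:
                candidate.open()
            except ConnectionError as err:
                # Discard the failed candidate; it was never published.
                candidate.close()
                last_error = err
                continue
            self._client = candidate
            return
        raise last_error


class FlakyClient:
    """Test double: the first instance fails to open, later ones succeed."""

    remaining_failures = 1

    def __init__(self):
        self.closed = False

    def open(self):
        if FlakyClient.remaining_failures > 0:
            FlakyClient.remaining_failures -= 1
            raise ConnectionError("engine not ready")

    def close(self):
        self.closed = True


session = EngineSession(FlakyClient)
session.open_engine_session()
```

Callers of `session.client` therefore never observe a client that was closed during a retry.
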
### How was this patch tested?

Existing UT.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7011 from turboFei/client_ready.

Closes #7011

3ad57ee91 [Wang, Fei] fix npe
b956394fa [Wang, Fei] close internal engine client
523b48a4d [Wang, Fei] internal client
5baeedec1 [Wang, Fei] Revert "method"
84c808cfb [Wang, Fei] method
8efaa52f6 [Wang, Fei] check engine launched

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-04-03 09:41:27 -07:00
Wang, Fei
1937dd93f9
[KYUUBI #7009] Backport HIVE-26723: Configurable canonical name checking.
### Why are the changes needed?

Backport https://github.com/apache/hive/pull/3749

It is not possible to create SSL connection with Kerberos authentication when the server certificate is not issued to the canonical host name but to an alternative domain name.

See details about the exception and steps for reproducing in the [HIVE-26723](https://issues.apache.org/jira/browse/HIVE-26723)

The Hive JDBC client validates the host name by its canonical name by default. This behaviour leads to an `SSLHandshakeException` when trying to connect using an alias name with Kerberos authentication. To solve this issue, a new connection property is introduced to allow disabling the canonical host name check: `enableCanonicalHostnameCheck`, with a default value of `true`.

When the property is not given in the connection string (or its value is `true`), the original behaviour applies, i.e. the canonical host name is checked.

### How was this patch tested?

There are no new unit tests because the fix is in the `HiveConnection` constructor, which contains a lot of logic and also builds new SSL connections.
IMO it would have been far too much effort to mock the whole environment to create unit tests for this tiny change. :(

There wasn't any existing test for `HiveConnection` that could be extended with this new feature/bugfix. Misleadingly, there is a class named `TestHiveConnection`, but it contains no tests that exercise the `HiveConnection` class itself.

It was tested manually: after this fix, when the steps in the JIRA are executed again using the new JARs, the SSL connection is created successfully and I was able to execute queries.

### Does this PR introduce any user-facing change?
A new JDBC connection URL property has been introduced: `enableCanonicalHostnameCheck`, which can turn off the canonical host name checking. Its default value is `true`, so if it is not set, the canonical host name is checked when building the SSL connection.

To turn off the canonical host name checking, add this property to the connection string, e.g.:

```
./beeline -u "jdbc:hive2://hs2.subdomain.example.com:443/default;transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;retries=1;principal=myhiveprincipal/mydomain.example.com;enableCanonicalHostnameCheck=false;"
```
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7009 from turboFei/kerberos_can.

Closes #7009

40cd48814 [Wang, Fei] Backport HIVE-26723: Configurable canonical name checking.

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-01 13:51:16 +08:00
Wang, Fei
2fdf440562
[KYUUBI #7008] Backport HIVE-27817: Disable ssl hostname verification for 127.0.0.1
### Why are the changes needed?

Backport https://github.com/apache/hive/pull/4823

We need to set up a tunnel to production because we can't connect to the production environment directly:

```
ssh -fN -o ServerAliveInterval=60 -o ServerAliveCountMax=3 -L 127.0.0.1:10001:hiveserver2.prod.company.com:10001 bastion.company.com

JDBC url: jdbc:hive2://127.0.0.1:10001/default;ssl=true
```

But it throws an exception after [HIVE-15025](https://issues.apache.org/jira/browse/HIVE-15025):

```
Exception in thread "main" java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10001/default;ssl=true: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching localhost found.
	at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:224)
	at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:247)
	at org.apache.spark.sql.TestJDBC$.main(TestJDBC.scala:47)
	at org.apache.spark.sql.TestJDBC.main(TestJDBC.scala)
Caused by: org.apache.hive.org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching localhost found.
	at org.apache.hive.org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
	at org.apache.hive.org.apache.thrift.transport.TSaslTransport.sendSaslMessage(TSaslTransport.java:166)
	at org.apache.hive.org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:100)
	at org.apache.hive.org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
	at org.apache.hive.org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
	at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:311)
	at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:196)
	... 5 more
```
This PR disables SSL hostname verification for 127.0.0.1 to work around this issue.
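
The idea can be sketched with Python's standard `ssl` module. This is an illustration of the technique only, not the Hive JDBC implementation: `client_ssl_context` is a hypothetical helper, and it is an assumption of the sketch that only the hostname check is relaxed while certificate chain validation stays on.

```python
import ssl


def client_ssl_context(host):
    """Build a client TLS context, skipping hostname matching for 127.0.0.1."""
    ctx = ssl.create_default_context()  # hostname check and cert validation on
    if host == "127.0.0.1":
        # A tunnelled connection presents the remote server's certificate,
        # which cannot match the local tunnel address.
        ctx.check_hostname = False
    return ctx


tunnel_ctx = client_ssl_context("127.0.0.1")
direct_ctx = client_ssl_context("hiveserver2.prod.company.com")
```

With this sketch, `tunnel_ctx` still requires a valid certificate chain (`verify_mode` remains `ssl.CERT_REQUIRED`); it just no longer compares the certificate's names against `127.0.0.1`.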

### How was this patch tested?

Manual test.
### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7008 from turboFei/ssl.

Closes #7008

6ae1b7b82 [Wang, Fei] Backport HIVE-27817: Disable ssl hostname verification for 127.0.0.1

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-04-01 13:47:55 +08:00
dnskr
e2efe934e1 [KYUUBI #7005] [DOC] Remove empty page "Getting Started with Jupyter Lap"
### Why are the changes needed?

The PR resolves the following warning message:
```
../kyuubi/docs/quick_start/quick_start_with_jupyter.md: WARNING: document isn't included in any toctree
```
It removes the empty page `Getting Started with Jupyter Lap`, which is also not present in the documentation menu.

### How was this patch tested?

Built the documentation locally and checked that the warning message no longer appears.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7005 from dnskr/remove-empty-getting-started-with-jupyter-lap.

Closes #7005

030fb3598 [dnskr] [DOC] Remove empty page "Getting Started with Jupyter Lap"

Authored-by: dnskr <dnskrv88@gmail.com>
Signed-off-by: dnskr <dnskrv88@gmail.com>
2025-03-29 19:18:45 +01:00
Cheng Pan
d7f20e8431
[KYUUBI #7004] Include FastXML Jackson into authZ shaded jar
### Why are the changes needed?

RANGER-4225 (2.5.0) upgrades Jackson from 1.x to 2.x, which causes a `NoClassDefFoundError` when users use `kyuubi-spark-authz-shaded_2.12-1.10.1.jar` (built with Ranger 2.5.0):

```
java.lang.NoClassDefFoundError: com/fasterxml/jackson/jaxrs/base/ProviderBase
 at java.lang.ClassLoader.defineClass1(Native Method)
 at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
 at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
 at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:208)
 at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:191)
 at org.apache.ranger.plugin.util.RangerRESTClient.get(RangerRESTClient.java:465)
 at org.apache.ranger.admin.client.RangerAdminRESTClient.getRangerRolesDownloadResponse(RangerAdminRESTClient.java:1321)
 at org.apache.ranger.admin.client.RangerAdminRESTClient.getRolesIfUpdatedWithCred(RangerAdminRESTClient.java:1183)
 at org.apache.ranger.admin.client.RangerAdminRESTClient.getRolesIfUpdated(RangerAdminRESTClient.java:148)
 at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRolesFromAdmin(RangerRolesProvider.java:172)
 at org.apache.ranger.plugin.util.RangerRolesProvider.loadUserGroupRoles(RangerRolesProvider.java:112)
 at org.apache.ranger.plugin.util.PolicyRefresher.loadRoles(PolicyRefresher.java:563)
 at org.apache.ranger.plugin.util.PolicyRefresher.startRefresher(PolicyRefresher.java:138)
 at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:254)
 at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.initialize(SparkRangerAdminPlugin.scala:68)
 at org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension.<init>(RangerSparkExtension.scala:44)
```

### How was this patch tested?

```
$ jar tf kyuubi-spark-authz-shaded_2.12-1.11.0-SNAPSHOT.jar | grep org/apache/kyuubi/shade/com/fasterxml
org/apache/kyuubi/shade/com/fasterxml/
org/apache/kyuubi/shade/com/fasterxml/jackson/
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/AbstractTypeResolver.class
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/AnnotationIntrospector$ReferenceProperty$Type.class
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/AnnotationIntrospector$ReferenceProperty.class
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/AnnotationIntrospector$XmlExtensions.class
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/AnnotationIntrospector.class
org/apache/kyuubi/shade/com/fasterxml/jackson/databind/BeanDescription.class
...
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7004 from pan3793/authz-jackson.

Closes #7004

cbf870516 [Cheng Pan] fix
4312d9fe5 [Cheng Pan] Include FastXML Jackson into authZ shaded jar

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-26 20:31:32 +08:00
davidyuan
06df5e5dc3
[KYUUBI #6979] Support check paimon system producers
### Why are the changes needed?

Currently, the Ranger check does not cover Paimon system procedure commands; these commands need to be supported:
1. create_tag
2. delete_tag
3. rollback

#6979

PS: There is a caveat with Paimon: Paimon's SparkCatalog requires the session's current catalog to be the Paimon catalog, and using the default `spark_catalog` throws an exception. Maybe we should add this hint to the documentation, such as:
If you want to support procedure checks with Paimon, you need to run the SQL `use $paimon_catalog` to ensure the session's current catalog is the Paimon catalog.

PS: paimon-spark-3.3:0.8.2 has some compatibility issues; upgrading the Paimon version is suggested.

### How was this patch tested?

Procedure test cases:
1. create_tag
2. delete_tag
3. rollback

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6980 from davidyuan1223/paimon_producers.

Closes #6979

90f367c6a [davidyuan] update
c0503cb5f [davidyuan] Merge remote-tracking branch 'origin/paimon_producers' into paimon_producers
993d1dcb8 [davidyuan] Merge branch 'master' into paimon_producers
f68edef41 [davidyuan] producers
58224191b [davidyuan] Merge branch 'master' into paimon_producers
57aac600b [davidyuan] update
cbcdd8dbf [davidyuan] producers

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-26 14:13:59 +08:00
Cheng Pan
176bc293fc
[KYUUBI #7003] Cut out JNA dependencies for authZ plugin
### Why are the changes needed?

This PR provides an alternative for RANGER-4125 to cut out JNA dependencies for authZ plugin.

### How was this patch tested?

Pass GHA, and I checked the content of authz-shaded jar

```
$ jar tf extensions/spark/kyuubi-spark-authz-shaded/target/kyuubi-spark-authz-shaded_2.12-1.11.0-SNAPSHOT.jar | grep Hostname
org/apache/kyuubi/shade/com/kstruct/gethostname4j/Hostname.class
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7003 from pan3793/authz-hostname.

Closes #7003

42e246856 [Cheng Pan] Cut out JNA dependencies for authz plugin

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-26 11:19:58 +08:00
dependabot[bot]
560e0fbee1
Bump nanoid from 3.3.6 to 3.3.11 in /kyuubi-server/web-ui (#7001) 2025-03-25 14:44:57 +00:00
Cheng Pan
c6bb57c685
[KYUUBI #7000] Exclude aws-java-sdk-logs from kyuubi-spark-authz-shaded
### Why are the changes needed?

RANGER-4831 (2.5.0) switches from aws-java-sdk-bundle to aws-java-sdk-logs

### How was this patch tested?

I checked the packaged jar content

```
$ build/mvn clean install -DskipTests -pl :kyuubi-spark-authz-shaded_2.12 -am
$ jar -tf extensions/spark/kyuubi-spark-authz-shaded/target/kyuubi-spark-authz-shaded_2.12-1.11.0-SNAPSHOT.jar \
  | grep -v 'org/apache/ranger/' \
  | grep -v 'org/apache/kyuubi/' \
  | grep -v 'com/sun/jna/' \
  | grep -v 'META-INF/services/' \
  | grep -v 'service-defs/ranger-servicedef-'
META-INF/
META-INF/MANIFEST.MF
META-INF/LICENSE
META-INF/NOTICE
database_command_spec.json
function_command_spec.json
org/
org/apache/
scan_command_spec.json
service-defs/
table_command_spec.json
org/apache/hadoop/
org/apache/hadoop/security/
org/apache/hadoop/security/SecureClientLogin.class
etc/
etc/ranger/
etc/ranger/geo/
etc/ranger/geo/geo.txt
org/apache/hadoop/security/SecureClientLoginConfiguration.class
etc/ranger/geo/geo_long.txt
resourcenamemap.properties
org/apache/hadoop/security/KrbPasswordSaverLoginModule.class
META-INF/jersey-module-version
com/
com/sun/
META-INF/persistence.xml
```

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7000 from pan3793/authz-aws-logs.

Closes #7000

a22ca807a [Cheng Pan] Exclude aws-java-sdk-logs from kyuubi-spark-authz-shaded
447d450fc [Cheng Pan] Exclude aws-java-sdk-logs from kyuubi-spark-authz-shaded

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-25 18:18:56 +08:00
Cheng Pan
1f1bc3cb16
[KYUUBI #6999] Keep JNA in authz-shaded with Scala 2.13
### Why are the changes needed?

In Scala [v2.13.16](https://github.com/scala/scala/releases/tag/v2.13.16)

> JNA is no longer a dependency of `scala-compiler.jar`

Since Spark 4.0 upgrades to Scala 2.13.16, JNA deps have gone too.

```
$ spark-3.5.5-bin-hadoop3-scala2.13 cat RELEASE
Spark 3.5.5 (git revision 7c29c664cdc) built for Hadoop 3.3.4
Build flags: -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.13 -Phadoop-3 -Phive -Phive-thriftserver
$ spark-3.5.5-bin-hadoop3-scala2.13 ls jars | grep jna
jna-5.9.0.jar
```

```
$ spark-4.0.0-bin-hadoop3 cat RELEASE
Spark 4.0.0 (git revision ca56e9ce591) built for Hadoop 3.4.1
Build flags: -B -Pyarn -Pkubernetes -Psparkr -Phadoop-3 -Phive -Phive-thriftserver
$ spark-4.0.0-bin-hadoop3 ls jars | grep jna
<no-output>
```

It's rare to use the non-default Scala version with Spark in practice, so we should follow Spark 4's dependencies for the Scala 2.13 case.

### How was this patch tested?

Review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6999 from pan3793/authz-scala213.

Closes #6999

18230a2d7 [Cheng Pan] Keep JNA in authz-shaded with Scala 2.13

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-25 17:30:24 +08:00
Cheng Pan
6459680d89
[KYUUBI #6998] [TEST] Harness SparkProcessBuilderSuite
### Why are the changes needed?

Fix the missing `assert` in `SparkProcessBuilderSuite - spark process builder`.

Fix the flaky test `SparkProcessBuilderSuite - capture error from spark process builder` by increasing `kyuubi.session.engine.startup.maxLogLines` from 10 to 4096. The test fails easily, especially with Spark 4.0, because of its longer error stack traces. For example, https://github.com/apache/kyuubi/actions/runs/13974413470/job/39290129824

```
SparkProcessBuilderSuite:
- spark process builder
- capture error from spark process builder *** FAILED ***
  The code passed to eventually never returned normally. Attempted 167 times over 1.5007926256666668 minutes. Last failure message: "org.apache.kyuubi.KyuubiSQLException: 	Suppressed: org.apache.spark.util.Utils$OriginalTryStackTraceException: Full stacktrace of original doTryWithCallerStacktrace caller
   See more: /home/runner/work/kyuubi/kyuubi/kyuubi-server/target/work/kentyao/kyuubi-spark-sql-engine.log.2
  	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
  	at org.apache.kyuubi.engine.ProcBuilder.$anonfun$start$1(ProcBuilder.scala:239)
  	at java.base/java.lang.Thread.run(Thread.java:1583)
  .
  FYI: The last 10 line(s) of log are:
  25/03/24 12:53:39 INFO MemoryStore: MemoryStore started with capacity 434.4 MiB
  25/03/24 12:53:39 INFO MemoryStore: MemoryStore cleared
  25/03/24 12:53:39 INFO BlockManager: BlockManager stopped
  25/03/24 12:53:39 INFO BlockManagerMaster: BlockManagerMaster stopped
  25/03/24 12:53:39 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
  25/03/24 12:53:39 INFO SparkContext: Successfully stopped SparkContext
  25/03/24 12:53:39 INFO ShutdownHookManager: Shutdown hook called
  25/03/24 12:53:39 INFO ShutdownHookManager: Deleting directory /tmp/spark-18455622-344e-48ac-92eb-4b368c35e697
  25/03/24 12:53:39 INFO ShutdownHookManager: Deleting directory /home/runner/work/kyuubi/kyuubi/kyuubi-server/target/work/kentyao/artifacts/spark-7479249b-44a2-4fe5-aa0f-544074f9c356
  25/03/24 12:53:39 INFO ShutdownHookManager: Deleting directory /tmp/spark-5ba8250f-1ff2-4e0d-a365-27d7518308e1" did not contain "org.apache.hadoop.hive.ql.metadata.HiveException:". (SparkProcessBuilderSuite.scala:77)
```
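
Why a short tail makes the assertion flaky can be sketched as follows. This is a simplified model, not the Kyuubi `ProcBuilder` logic: `last_lines` and the synthetic log are hypothetical, and the only thing taken from the failure above is the error marker the test looks for.

```python
from collections import deque


def last_lines(lines, max_lines):
    """Keep only the final max_lines entries of a log."""
    return list(deque(lines, maxlen=max_lines))


ERROR_MARKER = "org.apache.hadoop.hive.ql.metadata.HiveException:"

# Synthetic log: the real error comes first, followed by shutdown noise.
engine_log = [f"{ERROR_MARKER} (hypothetical message)"]
engine_log += [f"INFO shutdown step {i}" for i in range(10)]

found_with_10 = any(ERROR_MARKER in line for line in last_lines(engine_log, 10))
found_with_4096 = any(ERROR_MARKER in line for line in last_lines(engine_log, 4096))

print(found_with_10)    # False
print(found_with_4096)  # True
```

Once enough lines are logged after the error, a 10-line tail no longer contains it, so the check depends on how verbose the shutdown happens to be.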

### How was this patch tested?

Pass GHA, and verified locally with Spark 4.0.0 RC3 by running tests 10 times with constant success.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6998 from pan3793/spark-pb-ut.

Closes #6998

a4290b413 [Cheng Pan] harness SparkProcessBuilderSuite

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-25 17:28:58 +08:00
Wang, Fei
196b47e32a [KYUUBI #6997] Get the latest batch app info after submit process terminated to prevent batch ERROR due to engine submit timeout
### Why are the changes needed?

We hit the following issue.
For Spark on YARN:
```
spark.yarn.submit.waitAppCompletion=false
kyuubi.engine.yarn.submit.timeout=PT10M
```

Due to a network issue, the application submission was very slow.

It was submitted after 15 minutes.
<img width="1430" alt="image" src="https://github.com/user-attachments/assets/a326c3d1-4d39-42da-b6aa-cad5f8e7fc4b" />

<img width="1350" alt="image" src="https://github.com/user-attachments/assets/8e20056a-bd71-4515-a5e3-f881509a34b2" />

Then the batch went from the PENDING state directly to the ERROR state, because the application state was NOT_FOUND (the submission exceeded `kyuubi.engine.yarn.submit.timeout`).

a54ee39ab3/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/ApplicationOperation.scala (L99-L106)

<img width="1727" alt="image" src="https://github.com/user-attachments/assets/20a2987c-675c-4136-a107-001f30b1b217" />

Here is the operation event:
<img width="1727" alt="image" src="https://github.com/user-attachments/assets/e2bab9c3-a959-4e2b-a207-813ae6489b30" />

But from the batch log, the current application status should be `PENDING`.
```
:2025-03-21 17:36:19.350 INFO [KyuubiSessionManager-exec-pool: Thread-176922] org.apache.kyuubi.operation.BatchJobSubmission: Batch report for bbba09c8-3704-4a87-8394-9bcbbd39cc34, Some(ApplicationInfo(application_1741747369441_2258235,6042072c-e8fa-425d-a6a3-3d5bbb4ec1e3-275732_6042072c-e8fa-425d-a6a3-3d5bbb4ec1e3-275732.e3a34b86-7fc7-43ea-b4a5-1b6f27df54b5.0_20250322002147.stm,PENDING,Some(https://apollo-rno-rm-2.vip.hadoop.ebay.com:50030/proxy/application_1741747369441_2258235/),Some()))
```

So, after the submission process terminates and before checking whether the application has failed, we should retrieve the batch application info again to get the current application state. This prevents the following corner case:
1. the application submission time exceeds `kyuubi.engine.yarn.submit.timeout` and the app state is NOT_FOUND
2. the application report cannot be retrieved before the submission process terminates
3. the batch state then goes from PENDING to ERROR directly.

Conclusion:

The application state transition was:

UNKNOWN(before submit timeout) -> NOT_FOUND(reach submit timeout) -> processExit -> batchOpError -> PENDING(updateApplicationInfoMetadataIfNeeded) -> UNKNOWN(batchError but app not terminated)

After this PR, it should be:

UNKNOWN(before submit timeout) -> NOT_FOUND(reach submit timeout) ->  processExit-> PENDING(after process terminated) -> ....
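
The ordering change can be sketched like this. It is a deliberately simplified model, not the `BatchJobSubmission` code: `resolve_batch_state` and `fetch_app_state` are hypothetical, and treating only `NOT_FOUND` as the failing state is an assumption of the sketch.

```python
def resolve_batch_state(polled_state, process_exited, fetch_app_state):
    """Decide the batch state once the submit process has finished.

    polled_state: last application state seen while the process was running.
    fetch_app_state: hypothetical callback returning the current application state.
    """
    if process_exited:
        # Refresh first: the application may have been accepted after the
        # submit timeout was reached.
        polled_state = fetch_app_state()
    if polled_state == "NOT_FOUND":
        return "ERROR"
    return polled_state


# Slow submission: the app shows up only after the submit process exits.
late_but_alive = resolve_batch_state("NOT_FOUND", True, lambda: "PENDING")
# The app really never appeared.
never_submitted = resolve_batch_state("NOT_FOUND", True, lambda: "NOT_FOUND")

print(late_but_alive)   # PENDING
print(never_submitted)  # ERROR
```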

### How was this patch tested?

Existing GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6997 from turboFei/app_not_found_v2.

Closes #6997

370cf49e9 [Wang, Fei] v2
912ec28ca [Wang, Fei] nit
3c376f922 [Wang, Fei] log the op ex
d9cbdb87d [Wang, Fei] fix app not found

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-03-24 12:53:22 -07:00
wuziyi
2080c2186c
[KYUUBI #6990] Add rebalance before InsertIntoHiveDirCommand and InsertIntoDataSourceDirCommand to align with behaviors of hive
### Why are the changes needed?

When users switch from Hive to Spark, for sql like INSERT OVERWRITE DIRECTORY AS SELECT, it would be great if small files could be automatically merged through simple configuration, just like in Hive.

### How was this patch tested?

UnitTest

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6991 from Z1Wu/feat/add_insert_dir_rebalance_support.

Closes #6990

2820bb2d2 [wuziyi] [fix] nit
a69c04191 [wuziyi] [fix] nit
951a7738f [wuziyi] [fix] nit
f75dfcb3a [wuziyi] [Feat] add rebalance before InsertIntoHiveDirCommand and InsertIntoDataSourceDirCommand to align with behaviors of hive

Authored-by: wuziyi <wuziyi02@corp.netease.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-25 00:52:55 +08:00
Wang, Fei
338206e8a7 [KYUUBI #6785] Shutdown the executor service in KubernetesApplicationOperation and prevent NPE
# 🔍 Description
## Issue References 🔗

As title.

Fix an NPE that occurs because `cleanupTerminatedAppInfoTrigger` is set to `null`.
d3520ddbce/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/KubernetesApplicationOperation.scala (L269)

Also shut down the ExecutorService when KubernetesApplicationOperation is stopped.
## Describe Your Solution 🔧

Shut down the thread executor service and add a null check.
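
The same pattern, sketched with Python's `concurrent.futures` rather than the actual Scala code. `AppOperation` and `_cleanup_executor` are hypothetical names, and `None` plays the role of `null`.

```python
from concurrent.futures import ThreadPoolExecutor


class AppOperation:
    def __init__(self):
        self._cleanup_executor = None  # created lazily in initialize()

    def initialize(self):
        self._cleanup_executor = ThreadPoolExecutor(max_workers=1)

    def stop(self):
        # Take the reference and clear the field, then guard against None so
        # that stop() is safe before initialize() or when called twice.
        executor, self._cleanup_executor = self._cleanup_executor, None
        if executor is not None:
            executor.shutdown(wait=False)


op = AppOperation()
op.stop()          # safe: nothing to shut down yet
op.initialize()
executor = op._cleanup_executor
op.stop()          # shuts the executor down
op.stop()          # still safe: field is None again
```

After `stop()`, submitting to the old executor raises `RuntimeError`, which confirms that it was actually shut down rather than just dereferenced.
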
## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6785 from turboFei/npe_k8s.

Closes #6785

6afd052e6 [Wang, Fei] comments
f0c3e3134 [Wang, Fei] prevent npe
9dffe0125 [Wang, Fei] shutdown

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-03-23 13:19:22 -07:00
Reese Feng
a54ee39ab3 [KYUUBI #6984] Fix ValueError when rendering MapType data
[[KYUUBI #6984] Fix ValueError when rendering MapType data](https://github.com/apache/kyuubi/issues/6984)

### Why are the changes needed?
The issue was caused by an incorrect iteration of MapType data in the `%table` magic command. When iterating over a `MapType` column, the code used `for k, v in m` directly, which raises a `ValueError` because iterating a map directly yields only its keys, so the entries are not unpacked into key/value pairs.
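
A minimal reproduction in plain Python, independent of the actual magic command code. The variable names are illustrative, and the sample map is taken from the test data below.

```python
m = {"key": "value"}

# Iterating a dict yields its keys, so each key string is what gets unpacked.
error = None
try:
    for k, v in m:
        pass
except ValueError as exc:
    error = exc

# Iterating items() yields (key, value) pairs, which unpack as intended.
pairs = [(k, v) for k, v in m.items()]

print(type(error).__name__)  # ValueError
print(pairs)                 # [('key', 'value')]
```

Note that a key of exactly two characters would unpack without an error and silently produce wrong pairs, so the failure depends on the data.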

### How was this patch tested?
- [x] Manual testing:
  Executed a query with a `MapType` column and confirmed that the `%table` command now renders it without errors.
```python
 from pyspark.sql import SparkSession
 from pyspark.sql.types import MapType, StringType, IntegerType
 spark = SparkSession.builder \
     .appName("MapFieldExample") \
     .getOrCreate()

 data = [
     (1, {"a": "1", "b": "2"}),
     (2, {"x": "10"}),
     (3, {"key": "value"})
 ]

 schema = "id INT, map_col MAP<STRING, STRING>"
 df = spark.createDataFrame(data, schema=schema)
 df.printSchema()
 df2=df.collect()
```
Render the table using `%table`:
```python
 %table df2
```

Result:
```python
{'application/vnd.livy.table.v1+json': {'headers': [{'name': 'id', 'type': 'INT_TYPE'}, {'name': 'map_col', 'type': 'MAP_TYPE'}], 'data': [[1, {'a': '1', 'b': '2'}], [2, {'x': '10'}], [3, {'key': 'value'}]]}}

```

### Was this patch authored or co-authored using generative AI tooling?
No

**notice** This PR was co-authored by DeepSeek-R1.

Closes #6985 from JustFeng/patch-1.

Closes #6984

e0911ba94 [Reese Feng] Update PySparkTests for magic cmd
bc3ce1a49 [Reese Feng] Update PySparkTests for magic cmd
200d7ad9b [Reese Feng] Fix syntax error in dict iteration in magic_table_convert_map

Authored-by: Reese Feng <10377945+JustFeng@users.noreply.github.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
2025-03-19 21:18:35 -07:00
wforget
cb36e748ed
[KYUUBI #6989] Calculate expected join partitions based on scanned table size
### Why are the changes needed?

Avoid an unstable test case caused by table size changes, which are likely to happen when upgrading Parquet/ORC/Spark.

### How was this patch tested?

unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6989 from wForget/minor_fix.

Closes #6989

9cdd36973 [wforget] address comments
f79fcca0d [wforget] Calculate expected join partitions based on scanned table size

Authored-by: wforget <643348094@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-18 20:23:35 +08:00
Cheng Pan
86d7b3b348
[KYUUBI #6988] [INFRA] Foward GitHub discussions to ASF mailing list
### Why are the changes needed?

To satisfy the ASF requirements.

> An error occurred while processing the github feature in .asf.yaml:
>
> GitHub discussions can only be enabled if a mailing list target exists for it.

### How was this patch tested?

Review and monitor the master branch after merging.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6988 from pan3793/discussion.

Closes #6988

dd788c054 [Cheng Pan] [INFRA] Foward GitHub discussions to ASF mailing list

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-17 20:24:57 +08:00
davidyuan
d8f031a9fc
[KYUUBI #6941] Test Add new Column for paimon
### Why are the changes needed?

Currently, the Ranger check test cases do not cover the Paimon add-new-column command; this PR adds it.
#6941

### How was this patch tested?

Tested adding a new column for Paimon with Ranger.

### Was this patch authored or co-authored using generative AI tooling?

No

This patch had conflicts when merged, resolved by
Committer: Cheng Pan <chengpan@apache.org>

Closes #6945 from davidyuan1223/test_add_new_column_for_paimon.

Closes #6941

f865e132a [davidyuan] test add new column

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-17 16:25:26 +08:00
davidyuan
e1393247f7
[KYUUBI #6951] Test changing column type
### Why are the changes needed?

The Ranger check test cases do not cover the Paimon changing-column-type command; this PR adds the test case.
#6951

### How was this patch tested?

Tested the Ranger check for the Paimon changing-column-type command.

### Was this patch authored or co-authored using generative AI tooling?

No

This patch had conflicts when merged, resolved by
Committer: Cheng Pan <chengpan@apache.org>

Closes #6956 from davidyuan1223/test_changing_column_type.

Closes #6951

9d5140e81 [davidyuan] Merge branch 'master' into test_changing_column_type
e4f8974d8 [davidyuan] test changing column type

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-17 16:20:33 +08:00
dnskr
3641d9fb0a
[KYUUBI #6986] [DOC] Fix multiple Pygments lexer name issues
### Why are the changes needed?

The PR fixes multiple `Pygments lexer name` issues and resolves the following warnings during the documentation build process:
```
../kyuubi/docs/client/advanced/kerberos.md:37: WARNING: Pygments lexer name 'cmd' is not known
../kyuubi/docs/client/bi_tools/hue.md:26: WARNING: Lexing literal_block "Welcome to\n  __  __                           __\n /\\ \\/\\ \\                         /\\ \\      __\n \\ \\ \\/'/'  __  __  __  __  __  __\\ \\ \\____/\\_\\\n  \\ \\ , <  /\\ \\/\\ \\/\\ \\/\\ \\/\\ \\/\\ \\\\ \\ '__`\\/\\ \\\n   \\ \\ \\\\`\\\\ \\ \\_\\ \\ \\ \\_\\ \\ \\ \\_\\ \\\\ \\ \\L\\ \\ \\ \\\n    \\ \\_\\ \\_\\/`____ \\ \\____/\\ \\____/ \\ \\_,__/\\ \\_\\\n     \\/_/\\/_/`/___/> \\/___/  \\/___/   \\/___/  \\/_/\n                /\\___/\n                \\/__/" as "bash" resulted in an error at token: "'". Retrying in relaxed mode. [misc.highlighting_failure]
../kyuubi/docs/client/jdbc/hive_jdbc.md:27: WARNING: Pygments lexer name 'gradle' is not known
../kyuubi/docs/client/jdbc/kyuubi_jdbc.rst:111: WARNING: Pygments lexer name 'jdbc' is not known
../kyuubi/docs/client/jdbc/kyuubi_jdbc.rst:134: WARNING: Pygments lexer name 'jdbc' is not known
../kyuubi/docs/client/jdbc/kyuubi_jdbc.rst:143: WARNING: Pygments lexer name 'jdbc' is not known
../kyuubi/docs/client/jdbc/kyuubi_jdbc.rst:163: WARNING: Pygments lexer name 'jdbc' is not known
../kyuubi/docs/connector/spark/delta_lake_with_azure_blob.rst:191: WARNING: Pygments lexer name 'log' is not known
../kyuubi/docs/deployment/hive_metastore.md:38: WARNING: Pygments lexer name 'shell script' is not known
../kyuubi/docs/deployment/hive_metastore.md:207: WARNING: Lexing literal_block "Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'get_table_req'\n\tat org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)\n\tat org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:1567)\n\tat org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:1554)\n\tat org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1350)\n\tat org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:127)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)\n\tat com.sun.proxy.$Proxy37.getTable(Unknown Source)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2336)\n\tat com.sun.proxy.$Proxy37.getTable(Unknown Source)\n\tat org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1274)\n\t... 93 more" as "java" resulted in an error at token: "'". Retrying in relaxed mode. [misc.highlighting_failure]
../kyuubi/docs/extensions/server/authentication.rst:75: WARNING: Pygments lexer name 'property' is not known
../kyuubi/docs/extensions/server/events.rst:76: WARNING: Pygments lexer name 'property' is not known
../kyuubi/docs/monitor/logging.md:38: WARNING: Pygments lexer name 'log' is not known
../kyuubi/docs/monitor/logging.md:86: WARNING: Pygments lexer name 'log' is not known
../kyuubi/docs/monitor/logging.md:222: WARNING: Pygments lexer name 'log' is not known
../kyuubi/docs/security/kerberos.rst:104: WARNING: Pygments lexer name 'property' is not known
../kyuubi/docs/security/ldap.md:24: WARNING: Pygments lexer name 'properties example' is not known
../kyuubi/docs/security/ldap.md:40: WARNING: Pygments lexer name 'properties example' is not known

```

Supported languages: [Pygments lexers](https://pygments.org/docs/lexers) and [highlightjs](https://github.com/highlightjs/highlight.js/blob/main/SUPPORTED_LANGUAGES.md).

### How was this patch tested?

Built the documentation locally and checked that there are no related warnings.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6986 from dnskr/fix-unknown-Pygments-lexer-name.

Closes #6986

f5b62f52d [dnskr] [DOC] Fix multiple Pygments lexer name issues

Authored-by: dnskr <dnskrv88@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-17 16:06:08 +08:00