### Why are the changes needed?
By default, HDFS sets the permission of a file or directory created with `FileSystem.create` or `FileSystem.mkdirs` to `P & ~umask`, where `P` is the permission passed in the API call and `umask` is a system value configured by `fs.permissions.umask-mode`, which defaults to `0022`. This means that, with default settings, a directory created by `mkdirs` can have at most `755` permissions.
The same issue was also reported in [SPARK-30860](https://issues.apache.org/jira/browse/SPARK-30860).
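To make the arithmetic concrete, here is a minimal sketch in plain Java of the masking rule described above (the requested permission AND-ed with the complement of the umask). It has no Hadoop dependency, and `UmaskDemo` and `effectivePermission` are hypothetical names used only for illustration, not Hadoop APIs.

```java
// Illustrative only: reproduces the permission masking arithmetic.
// UmaskDemo and effectivePermission are hypothetical names, not Hadoop APIs.
final class UmaskDemo {

  /** Returns the permission bits that remain after clearing the umask bits. */
  static int effectivePermission(int requested, int umask) {
    return requested & ~umask;
  }

  public static void main(String[] args) {
    int umask = 0022; // assumed value of fs.permissions.umask-mode
    // Requesting 777 under umask 022 leaves 755.
    System.out.println(Integer.toOctalString(effectivePermission(0777, umask)));
    // Requesting 700 is unchanged: the umask only clears group/other write bits.
    System.out.println(Integer.toOctalString(effectivePermission(0700, umask)));
  }
}
```

Because masking can only clear bits, a caller that needs wider permissions than the umask allows has to apply them explicitly after creation, for example with `FileSystem.setPermission`.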
### How was this patch tested?
Manual test.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #7148 from pan3793/fs-mkdirs.
Closes #7148
7527060ac [Cheng Pan] fix
f64913277 [Cheng Pan] Fix spark.kubernetes.file.upload.path permission
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Followup for #7034 to fix the SparkOnKubernetesTestsSuite.
I overlooked that the appInfo name and the pod name were tightly coupled before: the appInfo name was used as the pod name, including when deleting the pod.
This PR adds `podName` to applicationInfo to separate the app name from the pod name.
### How was this patch tested?
GA should pass.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #7039 from turboFei/fix_test.
Closes #7034
0ff7018d6 [Wang, Fei] revert
18e48c079 [Wang, Fei] comments
19f34bc83 [Wang, Fei] do not get pod name from appName
c1d308437 [Wang, Fei] reduce interval for test stability
50fad6bc5 [Wang, Fei] fix ut
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
### Why are the changes needed?
Fix build issue after #7041
### How was this patch tested?
GA.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #7042 from turboFei/fix_build.
Closes #7041
d026bf554 [Wang, Fei] fix build
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
### Why are the changes needed?
Test Spark 3.5.5 Release Notes
https://spark.apache.org/releases/spark-release-3-5-5.html
### How was this patch tested?
Pass GHA.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #6939 from pan3793/spark-3.5.5.
Closes #6939
8c0288ae5 [Cheng Pan] ga
78b0e72db [Cheng Pan] nit
686a7b0a9 [Cheng Pan] fix
d40cc5bba [Cheng Pan] Bump Spark 3.5.5
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
The ClickHouse integration test failed in GHA. After some investigation, the root cause turned out to be https://github.com/testcontainers/testcontainers-java/pull/9942
```
/entrypoint.sh: neither CLICKHOUSE_USER nor CLICKHOUSE_PASSWORD is set, disabling network access for user 'default'
```
In short, recent ClickHouse Docker images do not allow the `default` user to connect without a password. Unfortunately, `testcontainers-scala-clickhouse` does not expose an API to set `CLICKHOUSE_USER` and `CLICKHOUSE_PASSWORD`. As a workaround, this PR pins `clickhouse-server:24.3.15` (the latest version without this restriction) until a fixed version of Testcontainers is available.
This PR also switches the classifier of `clickhouse-jdbc` from `http` to `shaded`, because the `http` jar does not ship Apache HttpClient 5. Previously, it happened to work because `iceberg-runtime-spark3.5_2.12` packaged un-relocated Apache HttpClient 5 classes. That was fixed in Iceberg 1.8.0, after which `clickhouse-jdbc:http` stopped working:
```
java.lang.NoClassDefFoundError: org/apache/hc/core5/http/HttpRequest
```
Additionally, this PR bumps `clickhouse-jdbc` from 0.6.0 to 0.6.5.
### How was this patch tested?
Pass GHA.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #6915 from pan3793/fix-ch-test.
Closes #6915
996f095e0 [Cheng Pan] Pin clickhouse-server:24.3.15
d633df07c [Cheng Pan] Bump clickhouse-jdbc 0.6.5
214c8a227 [Cheng Pan] Fix ClickHouse integration tests
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Spark 3.5.4 is released https://spark.apache.org/releases/spark-release-3-5-4.html
### How was this patch tested?
Pass GHA.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #6842 from pan3793/spark-3.5.4.
Closes #6842
0fb7ad8a0 [Cheng Pan] ga
8eacc9c97 [Cheng Pan] Spark 3.5.4 RC2
0721fa401 [Cheng Pan] fix
49e98a201 [Cheng Pan] maven repo
951db0c82 [Cheng Pan] Spark 3.5.4
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
## Describe Your Solution 🔧
Preparing v1.11.0-SNAPSHOT after branch-1.10 cut
```shell
build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT"
(cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT")
```
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6769 from bowenliang123/bump-1.11.
Closes #6769
6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name
465276204 [Bowen Liang] update package.json
81f2865e5 [Bowen Liang] bump
Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
# 🔍 Description
Spark 3.5.2 was released recently.
Release notes are available at https://spark.apache.org/releases/spark-release-3-5-2.html
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6609 from pan3793/spark-3.5.2.
Closes #6609
587cf1dd3 [Cheng Pan] Bump Spark 3.5.2
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
This PR rewrites some utility methods in Java, specifically,
```
Utils.isWindows
Utils.isMac
Utils.findLocalInetAddress
```
and moves them from the `Utils` class in `kyuubi-common` to the `JavaUtils` class in `kyuubi-util`, so that they can be used in other modules that do not depend on `kyuubi-common`.
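For readers unfamiliar with this kind of helper, the following is a minimal sketch of OS detection in plain Java based on the `os.name` system property. It is an assumption about how such checks are commonly written, not the actual `JavaUtils` code; the class name `OsDetectSketch` and the choice to pass the OS name as a parameter (for easy testing) are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch; not the actual JavaUtils implementation in kyuubi-util.
final class OsDetectSketch {

  /** True when the given os.name value denotes a Windows system. */
  static boolean isWindows(String osName) {
    return osName.toLowerCase(Locale.ROOT).startsWith("windows");
  }

  /** True when the given os.name value denotes macOS. */
  static boolean isMac(String osName) {
    return osName.toLowerCase(Locale.ROOT).startsWith("mac");
  }

  public static void main(String[] args) {
    // os.name is a standard JVM system property.
    String osName = System.getProperty("os.name", "");
    System.out.println("windows=" + isWindows(osName) + ", mac=" + isMac(osName));
  }
}
```

Keeping such helpers free of Scala and of `kyuubi-common` dependencies is what allows lightweight modules to reuse them.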
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6499 from pan3793/javautils.
Closes #6499
565936def [Cheng Pan] fix
f06a85e9f [Cheng Pan] Move some untiliy methods in Java
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
[`jcl-over-slf4j`](https://www.slf4j.org/legacy.html#jcl-over-slf4j) is a drop-in replacement for `commons-logging`. The latter should not be present in the final classpath; otherwise, class conflicts may occur.
The current dependency check is also problematic, so this PR changes it to always perform "install" to fix the false negative report.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Simply delete `commons-logging-1.1.3.jar` from `apache-kyuubi-1.9.1-bin.tgz` and everything goes well.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6458 from pan3793/commons-logging.
Closes #6458
114ec766a [Cheng Pan] fix
79d4121a1 [Cheng Pan] fix
6633e83ee [Cheng Pan] fix
21127ed0b [Cheng Pan] always perform install on dep check
98b13dfcf [Cheng Pan] Remove commons-logging from binary release
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #6253
## Describe Your Solution 🔧
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
<img width="1251" alt="image" src="https://github.com/apache/kyuubi/assets/18713676/b654a300-8c79-4461-9fba-4ad1c913accc">
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6275 from lsm1/branch-jdbc-engine-on-yarn.
Closes #6253
5ed4af041 [senmiaoliu] fix style
86e032688 [senmiaoliu] fix style
b3e114445 [senmiaoliu] fix style
bf33de300 [senmiaoliu] fix style
c38404918 [senmiaoliu] jdbc engine on yarn
Authored-by: senmiaoliu <senmiaoliu@trip.com>
Signed-off-by: Shaoyun Chen <csy@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #5374
## Describe Your Solution 🔧
JDBC Engine supports ClickHouse
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6225 from lsm1/branch-support-clickhouse.
Closes #5374
0ce4f6f0b [senmiaoliu] fix style
f6ab022b6 [senmiaoliu] use ck jdbc http jar
dee6a6bdc [senmiaoliu] add it test
aed6b33a9 [senmiaoliu] init clickhouse engine
Authored-by: senmiaoliu <senmiaoliu@trip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
BouncyCastle has stopped patching the JDK 1.5 jars that Hadoop uses (see [HADOOP-18540](https://issues.apache.org/jira/browse/HADOOP-18540)).
The new artifacts have similar names, such as `bcprov-jdk18on` instead of `bcprov-jdk15on`.
CVE-2023-33201 is an example of a security issue that appears to be fixed only in the JDK 1.8 artifacts (i.e., no JDK 1.5 jar has the fix).
The latest release is 1.77 (https://www.bouncycastle.org/releasenotes.html#r1rv77), but the CVE was already fixed in 1.74.
To be clear, Kyuubi only uses BouncyCastle for testing, so the CVE does not affect the Kyuubi distribution.
## Describe Your Solution 🔧
Bump BouncyCastle from 1.67 to 1.77, and change the artifactId from `*-jdk15on` to `*-jdk18on`.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6177 from pan3793/bouncycastle.
Closes #6177
8595b98c1 [Cheng Pan] Bump BouncyCastle from 1.67 to 1.77
b9e7123f6 [Cheng Pan] Bump bouncycastle from 1.67 to 1.77
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
`kyuubi-gluten-it` is always enabled and uses a different Spark version, which breaks code navigation in IntelliJ IDEA.
## Describe Your Solution 🔧
- conditionally enable the `kyuubi-gluten-it` module via the `gluten-it` profile
- refactor the POM and GHA workflow to reduce duplicated definitions
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Review, and wait for daily GHA results.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6178 from pan3793/gluten-it.
Closes #6178
77b7bb809 [Cheng Pan] fix
c62ce40f8 [Cheng Pan] Improve Gluten intergartion test
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
Kyuubi now fully supports Spark 3.5, so this pull request sets the default Spark version to 3.5 in Kyuubi 1.9.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6163 from pan3793/spark-3.5-default.
Closes #6163
f386aeb7a [Cheng Pan] Set default Spark version to 3.5
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
`getCodeSourceLocation` lives in `kyuubi-common`, which is not reachable from modules like `kyuubi-hive-beeline`.
## Describe Your Solution 🔧
Move it to `kyuubi-util`.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6140 from pan3793/utils.
Closes #6140
e680d0516 [Cheng Pan] nit
1f79705d8 [Cheng Pan] fix
9420c0f81 [Cheng Pan] fix
f845a7f39 [Cheng Pan] Move getCodeSourceLocation to kyuubi-util
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
`build/dist` fails without the `-Pjdbc-shaded` profile after bumping to a new version.
## Describe Your Solution 🔧
Make the JDBC IT module always depend on `kyuubi-hive-jdbc-shaded`, to ensure that the shaded JDBC jar is packaged before the JDBC IT module copies jars.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Bump to a new version, then build the distribution.
```
build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.9.1-SNAPSHOT"
build/dist --spark-provided --hive-provided --flink-provided
```
Before this change, the build failed with
```
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.6.1:copy (copy) on project kyuubi-jdbc-it_2.12: Unable to find/resolve artifact.: Could not find artifact org.apache.kyuubi:kyuubi-hive-jdbc-shaded:jar:1.9.1-SNAPSHOT in aliyun-apache-snapshots (https://maven.aliyun.com/repository/apache-snapshots/) -> [Help 1]
```
After this change, the build succeeds.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6135 from pan3793/impala-folowup.
Closes #5509
12f600d2d [Cheng Pan] [KYUUBI #5509][FOLLOWUP] JDBC IT should always depends on kyuubi-hive-jdbc-shaded
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #5509
## Describe Your Solution 🔧
Added [Apache Impala](https://impala.apache.org) support in the form of a JDBC engine dialect. Slightly modified the Kyuubi Hive JDBC driver so that it can be used as the driver for the Impala dialect instead of the original Hive driver.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Related Unit Tests
- `org.apache.kyuubi.engine.jdbc.impala.OperationWithImpalaEngineSuite`
- `org.apache.kyuubi.engine.jdbc.impala.SessionSuite`
- `org.apache.kyuubi.engine.jdbc.impala.StatementSuite`
#### Related Integration Tests
- `org.apache.kyuubi.it.jdbc.impala.OperationWithServerSuite`
- `org.apache.kyuubi.it.jdbc.impala.SessionWithServerSuite`
- `org.apache.kyuubi.it.jdbc.impala.StatementWithServerSuite`
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6104 from tigrulya-exe/feature/5509-support-impala-jdbc-dialect.
Closes #5509
32ae6d846 [Tigran Manasyan] Codestyle fixes
985212561 [Tigran Manasyan] fix review comments
ecb0d7dca [Tigran Manasyan] copy impala compose file to integration tests resources
5ea347430 [Tigran Manasyan] fix order in services file
2c63a7003 [Tigran Manasyan] Add Apache Impala JDBC engine dialect
Authored-by: Tigran Manasyan <t.manasyan@arenadata.io>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #6043
## Describe Your Solution 🔧
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6055 from Kwafoor/kyuubi_6043.
Closes #6043
0b5c15361 [wangjunbo] [KYUUBI #6043][BUG][TEST][GLUTEN] Gluten-it gluten package add arch suffix
Authored-by: wangjunbo <wangjunbo@qiyi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
While enabling batch implementation V2 with the following configurations
```
kyuubi.batch.impl.version=2
kyuubi.batch.submitter.enabled=true
kyuubi.batch.submitter.threads=48
spark.master=yarn
spark.submit.deployMode=cluster
spark.yarn.submit.waitAppCompletion=false
```
I found that batch jobs are blocked in the DB queue once a YARN queue has no resources. As a result, subsequent batch jobs that target another YARN queue are also queued in the DB rather than in the YARN queue.
```
mysql> select state, engine_state, count(1) from metadata where state in ('INITIALIZED', 'PENDING', 'RUNNING') group by state, engine_state;
+-------------+--------------+----------+
| state       | engine_state | count(1) |
+-------------+--------------+----------+
| INITIALIZED | NULL         |      166 |
| PENDING     | NULL         |        1 |
| RUNNING     | PENDING      |      148 |
| RUNNING     | RUNNING      |      415 |
+-------------+--------------+----------+
```
## Describe Your Solution 🔧
The submitter queue, whose size is controlled by `kyuubi.batch.submitter.threads`, is designed to limit the concurrency of `spark-submit` processes: too many of them may exhaust the CPU/memory of the Kyuubi server node and eventually crash the service. In Spark YARN cluster mode with `spark.yarn.submit.waitAppCompletion=false`, the local `spark-submit` process exits as soon as the application reaches the ACCEPTED state, even if no resources can be allocated for the AM container. Such cases should not block the submitter queue.
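The idea can be illustrated with a small sketch, under the assumption that each running `spark-submit` process occupies one slot out of `kyuubi.batch.submitter.threads`. The class and method names below are hypothetical and do not correspond to Kyuubi classes; the point is only that a slot is freed when the local process exits, not when the YARN application obtains resources or finishes.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of a bounded submitter; not Kyuubi's implementation.
final class SubmitterSlotsSketch {
  private final Semaphore slots;

  SubmitterSlotsSketch(int submitterThreads) {
    this.slots = new Semaphore(submitterThreads);
  }

  /** Tries to reserve a slot for a new local spark-submit process. */
  boolean tryStartSubmit() {
    return slots.tryAcquire();
  }

  /** Frees the slot as soon as the local spark-submit process has exited. */
  void onSubmitProcessExited() {
    slots.release();
  }

  /** Number of slots currently available. */
  int freeSlots() {
    return slots.availablePermits();
  }

  public static void main(String[] args) {
    SubmitterSlotsSketch submitter = new SubmitterSlotsSketch(1);
    System.out.println(submitter.tryStartSubmit()); // slot taken
    System.out.println(submitter.tryStartSubmit()); // no slot left
    submitter.onSubmitProcessExited();              // process exited, app may still be ACCEPTED
    System.out.println(submitter.tryStartSubmit()); // slot available again
  }
}
```

With this model, an application waiting in a starved YARN queue no longer prevents jobs for other YARN queues from being submitted.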
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA, and rolled out to an internal cluster.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6028 from pan3793/batch-submit.
Closes #6028
05fcc758f [Cheng Pan] Exited spark-submit process should not block batch submit queue
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
HiveServer2 has a configuration `hive.server2.enable.doAs` that controls whether statements are executed as the session user or as the server user. Kyuubi's CONNECTION and USER share levels always behave as if doAs were enabled. In CDH 5/6, doAs is disabled by default, so users who want to migrate from CDH to Kyuubi may encounter permission issues with the current implementation.
## Describe Your Solution 🔧
This pull request introduces a new configuration `kyuubi.engine.doAs.enabled` to enable or disable user impersonation when launching the engine. For security reasons, it cannot be overridden by session conf.
The change in this PR has certain limitations:
- only supports Spark engine
- only supports interactive mode; specifically, it does not yet take effect in Spark batch mode.
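As a rough illustration of what disabling impersonation means at launch time, the sketch below builds a `spark-submit` argument list and adds `--proxy-user` only when doAs is enabled. The class `ProxyUserSketch` and its method are hypothetical and greatly simplified; they are not the engine launcher code of this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real engine launcher builds a much longer command.
final class ProxyUserSketch {

  /** Builds the leading spark-submit arguments for the given session user. */
  static List<String> submitArgs(String sessionUser, boolean doAsEnabled) {
    List<String> args = new ArrayList<>();
    args.add("spark-submit");
    if (doAsEnabled) {
      // Impersonate the session user; otherwise the engine runs as the server user.
      args.add("--proxy-user");
      args.add(sessionUser);
    }
    return args;
  }

  public static void main(String[] args) {
    System.out.println(submitArgs("spark", true));
    System.out.println(submitArgs("spark", false));
  }
}
```

When the flag is off, the session user is still known to the server, but the engine process itself runs as the server user.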
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
First, all existing UTs pass when `kyuubi.engine.doAs.enabled=true`.
Tested in an internal Kerberized environment with `kyuubi.engine.share.level=CONNECTION` and `kyuubi.engine.doAs.enabled=false`: connecting as user `spark`, the engine was submitted without `--proxy-user spark` and was therefore launched by the server user `hive`. Running `select session_user(), current_user()` then returns
```
+-----------------+-----------------+
| session_user()  | current_user()  |
+-----------------+-----------------+
| spark           | hive            |
+-----------------+-----------------+
```
I also checked that `spark.app.name` and the registered path on ZooKeeper are as expected.
```
+-----------------+--------------------------------------------------------------------------+
| key             | value                                                                    |
+-----------------+--------------------------------------------------------------------------+
| spark.app.name  | kyuubi_USER_SPARK_SQL_spark_default_51a416e5-6023-4bac-a964-cd9605f17c61 |
+-----------------+--------------------------------------------------------------------------+
```
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #6003 from pan3793/doas.
Closes #6003
c4002fef5 [Cheng Pan] grammar
add20fd57 [Cheng Pan] nit
8711c2265 [Cheng Pan] address comment
033a32252 [Cheng Pan] 1.9.0
9273b9426 [Cheng Pan] fix
a1563e1ca [Cheng Pan] HadoopCredentialsManager
e982e2364 [Cheng Pan] Allow disable user impersonation on launching engine
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request canonicalizes Trino IT in GitHub Action workflow.
`mvn -am` means "also make", so `mvn -pl integration-tests/kyuubi-trino-it -am` automatically triggers compilation of the `kyuubi-server`, `externals/kyuubi-spark-sql-engine`, and `externals/kyuubi-download` modules.
## Describe Your Solution 🔧
The CI part of the change in 6688b3dacf is unnecessary.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #5978 from pan3793/trino-ci.
Closes #5978
1b8ef94d1 [Cheng Pan] set KYUUBI_HOME in trino-it
18e72d355 [Cheng Pan] Canonicalize Trino IT in GitHub Action workflow
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This is a regular dependency upgrade.
## Describe Your Solution 🔧
Upgrade `trino-client` from 363 to 411. Version 411 is the latest one that uses okhttp 3.x, so it has no Kotlin runtime dependencies.
This PR also updates the docs, especially the Trino cluster version requirement.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #5975 from pan3793/trino-411.
Closes #5975
2b57df34d [Cheng Pan] fix
c498a5bb3 [Cheng Pan] fix
21948ca4f [Cheng Pan] Fix compile
e4f1397cc [Cheng Pan] license
66583ca16 [Cheng Pan] Bump trino-client 411
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #5939
## Describe Your Solution 🔧
Closes #5939 and upgrades the Gluten version to `1.2.0-SNAPSHOT`.
CI run reproducing the bug (expected to fail): https://github.com/Kwafoor/incubator-kyuubi/actions/runs/7408085340
```
Caused by: org.apache.maven.project.DependencyResolutionException: Could not resolve dependencies for project org.apache.kyuubi:kyuubi-gluten-it_2.12:jar:1.9.0-SNAPSHOT: Could not find artifact io.glutenproject:gluten-velox-bundle-spark3.3_2.12-ubuntu_22.04:jar:1.1.0-SNAPSHOT at specified path /home/runner/work/incubator-kyuubi/incubator-kyuubi/integration-tests/kyuubi-gluten-it/../../gluten/package/target/gluten-velox-bundle-spark3.3_2.12-ubuntu_22.04-1.1.0-SNAPSHOT.jar
```
After upgrading the Gluten version to `1.2.0-SNAPSHOT`, CI succeeds: https://github.com/Kwafoor/incubator-kyuubi/actions/runs/7411523117
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes #5947 from Kwafoor/kyuubi_5939.
Closes #5939
1007890c6 [wangjunbo] [KYUUBI #5939] remove test branch
3b1890d4c [wangjunbo] upgrade Gluten version
fb7fcfdcc [wangjunbo] [KYUUBI #5939] test
0f40b5749 [wangjunbo] [KYUUBI #5939] fix gluten cache
Lead-authored-by: wangjunbo <wangjunbo@qiyi.com>
Co-authored-by: wangjunbo <junbo.w@outlook.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This PR adds support for running the Hive engine in YARN mode. It closes https://github.com/apache/kyuubi/issues/5867
## Describe Your Solution 🔧
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklists
## 📝 Author Self Checklist
- [ ] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [ ] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested
**Be nice. Be informative.**
Closes #5868 from Yikf/hive-on-yarn.
Closes #5867
44f7287f5 [yikaifei] fix
3c17d2c4a [yikaifei] fix test
5474ebfba [yikaifei] parse classpath
6b97c4213 [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/engine/deploy/yarn/EngineYarnModeSubmitter.scala
34a67b452 [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/engine/deploy/yarn/EngineYarnModeSubmitter.scala
5e5045e66 [yikaifei] fix app type
d1eb5aea7 [yikaifei] fix
d89d09cfe [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/engine/deploy/yarn/EngineYarnModeSubmitter.scala
1fa18ba1b [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/engine/deploy/yarn/ApplicationMaster.scala
1b0b77f4d [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/engine/deploy/yarn/ApplicationMaster.scala
2ed1d4492 [Cheng Pan] Update kyuubi-common/src/main/scala/org/apache/kyuubi/engine/deploy/yarn/EngineYarnModeSubmitter.scala
98ff19ce6 [yikaifei] HiveEngine support run on YARN mode
Lead-authored-by: yikaifei <yikaifei@apache.org>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #5856
## Describe Your Solution 🔧
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested
**Be nice. Be informative.**
Closes#5859 from zml1206/KYUUBI-5856.
Closes#5856
872fd06d2 [zml1206] Revert changes in SparkProcessBuilderSuite
bc4996f90 [zml1206] Bump spark 3.4.2
Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request implements https://github.com/apache/kyuubi/issues/5865; it supports getting SQL keywords from the Hive engine through the API.
## Describe Your Solution 🔧
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested
**Be nice. Be informative.**
Closes#5866 from Yikf/hive-keywords.
Closes#5865
e54945d94 [yikaifei] Hive engine CLI_ODBC_KEYWORDS
Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
As described.
## Describe Your Solution 🔧
- replace the usage of `ForAllTestContainer` with `TestContainerForAll`, simplifying the lifecycle for starting / stopping the containers and fetching the configs from the containers
- use `testcontainers-scala-postgresql` for testing with PostgreSQL containers
- add version 16 as a PostgreSQL image tag
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
No behaviour changes.
#### Behavior With This Pull Request 🎉
No behaviour changes.
#### Related Unit Tests
JDBC Engine IT.
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested
**Be nice. Be informative.**
Closes#5862 from bowenliang123/jdbc-container.
Closes#5862
29e85121c [Bowen Liang] TestContainerForAll
Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
# 🔍 Description
## Issue References 🔗
TL;DR there are some issues with shading Thrift RPC classes during the engine packaging phase, see details in the PR description of https://github.com/apache/kyuubi-shaded/pull/20.
## Describe Your Solution 🔧
This PR aims to migrate from vanilla `hive-service-rpc`, `libfb303`, `libthrift` to `kyuubi-relocated-hive-service-rpc` introduced in https://github.com/apache/kyuubi-shaded/pull/20. The detailed work is:
- replace imported deps in `pom.xml` and rename the package prefix in all modules, except for
  - `kyuubi-server`: a few places use vanilla Thrift classes to access HMS to get the token
  - `kyuubi-hive-sql-engine`: Hive method invocation
- update relocation rules in modules that create shaded jars
- introduce `HiveRpcUtils` in the `kyuubi-hive-sql-engine` module for object conversion.
As part of the whole change, this PR upgrades Kyuubi Shaded from 0.1.0 to 0.2.0, which changes the jar names. See https://kyuubi.apache.org/shaded-release/0.2.0.html
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
Pass all Hive UT with Hive 3.1.3, and IT with Hive 3.1.3 and 2.3.9 (also tested with 2.1.1-cdh6.3.2)
---
# Checklists
## 📝 Author Self Checklist
- [ ] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested
**Be nice. Be informative.**
Closes#5783 from pan3793/rpc-shaded.
Closes#5783
b45d4deaa [Cheng Pan] remove staging repo
890076a20 [Cheng Pan] Kyuubi Shaded 0.2.0 RC0
071945d45 [Cheng Pan] Rebase
199794ed9 [Cheng Pan] fix
fc128b170 [Cheng Pan] fix
26d313896 [Cheng Pan] fix
632984c92 [Cheng Pan] fix
428305589 [Cheng Pan] fix
6301e28fd [Cheng Pan] fix
955cdb33b [Cheng Pan] Switch to kyuubi-shaded-hive-service-rpc
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #5467
## Describe Your Solution 🔧
1. Add Gluten UTs.
2. Setup CI for Gluten testing
3. Write docs to guide users in setting up Kyuubi with Spark plus Gluten.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
GitHub Actions CI tests: [Gluten Test CI](https://github.com/Kwafoor/incubator-kyuubi/actions/runs/7111586978)
---
# Checklists
## 📝 Author Self Checklist
- [ ] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested
**Be nice. Be informative.**
Closes#5800 from Kwafoor/kyuubi_5467.
Closes#5800
c6dd26f93 [wangjunbo] fix
7818ae0c5 [wangjunbo] fix Scala Test
296f08c8c [wangjunbo] remove spark-3.2 gluten test
5a704675d [wangjunbo] [KYUUBI#5467] Integrate Intel Gluten with Spark engine
Authored-by: wangjunbo <wangjunbo@qiyi.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
# 🔍 Description
## Issue References 🔗
This PR aims to support `CLI_ODBC_KEYWORDS` on the Flink engine to avoid https://github.com/apache/kyuubi/issues/2637
## Describe Your Solution 🔧
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
Adjusted the existing test `org.apache.kyuubi.it.flink.operation.FlinkOperationSuite`
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested
**Be nice. Be informative.**
Closes#5782 from Yikf/flink-CLI_ODBC_KEYWORDS.
Closes#5782
ef0dc049a [yikaifei] Flink GetInfo support CLI_ODBC_KEYWORDS
Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
To close https://github.com/apache/kyuubi/issues/5464.
To support using the MySQL dialect in the JDBC engine (`kyuubi.engine.jdbc.type=mysql`).
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No.
Closes#5588 from Kwafoor/kyuubi_5464.
Closes#5464
1019a6118 [wangjunbo] [KYUUBI #5464]rename function name `getProviderClass` to `getDriverClass`
9901bbad4 [wangjunbo] [KYUUBI #5464]handle properly to keep compatiblity
b33d79ed2 [wangjunbo] [KYUUBI #5464]handle properly to keep compatiblity
86e6ee2b3 [wangjunbo] [KYUUBI #5464]handle properly to keep compatiblity
d76cb3275 [wangjunbo] [KYUUBI #5464]update the docs
4a1acffd1 [wangjunbo] [KYUUBI #5464]update the docs
1aff55ecd [wangjunbo] [KYUUBI #5464]update the docs of kyuubi.engine.type
84202ea0c [wangjunbo] [KYUUBI #5464] update the docs of kyuubi.engine.type
e3c1e94db [wangjunbo] [KYUUBI #5464] fix check
cdf820da0 [wangjunbo] [KYUUBI #5464] fix check
ff0f30ad8 [wangjunbo] [KYUUBI #5464] fix check
01321dc44 [wangjunbo] [KYUUBI #5464] JDBC Engine supports MySQL
756f5303c [wangjunbo] [KYUUBI #5464] JDBC Engine supports MySQL
Authored-by: wangjunbo <wangjunbo@qiyi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Close #5375
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No
Closes#5416 from ymZhao1001/jdbc-pg-dialect.
Closes#5375
d81988d84 [zhaoyangming] postgreSQL
915f9fb0a [yangming] change to like
d8da12af5 [yangming] reformat
29c63e38e [zhaoyangming] add postgresql dependency
ec328ad93 [zhaoyangming] add postgresql dependency
a8944fed5 [zhaoyangming] update postgresql to postgreSQL
cf7b69107 [zhaoyangming] Merge remote-tracking branch 'origin/jdbc-pg-dialect' into jdbc-pg-dialect
c127aa3d3 [zhaoyangming] update postgresql to postgreSQL
a693d6c34 [yangming] reformat
0d12a6ceb [zhaoyangming] add postgresql dependency
c7d3fa3da [yangming] fix conflict
dde1564b6 [zhaoyangming] add test info
2a49b338a [zhaoyangming] style
c8ce15f29 [zhaoyangming] StringBuilder is redundant.
5d70173cf [yangming] JDBC Engine supports PostgreSQL
Lead-authored-by: zhaoyangming <zhaoyangming@deepexi.com>
Co-authored-by: yangming <261635393@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Improve Kyuubi On Kubernetes IT
Done:
1. Copy the spark-submit engine log from the Kyuubi pod to the local machine and upload it when a test fails.
2. Pre-install the Spark image into minikube to avoid image pull errors.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No
Closes#5437 from zwangsheng/KYUUBI#5435.
Closes#5435
0cbbafce7 [zwangsheng] add comment
1f1336c59 [zwangsheng] ready
e1c10a6ea [zwangsheng] debug
32759015c [zwangsheng] debug
8e2f1eaf1 [zwangsheng] debug
80eaae30a [zwangsheng] [KYUUBI #5435][NOT_MERGE][TEST] Improve Kyuubi On Kubernetes IT
Authored-by: zwangsheng <binjieyang@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
The Apache Spark Community found a performance regression with log4j2. See https://github.com/apache/spark/pull/36747.
This PR fixes the performance issue on our side.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No.
Closes#5400 from ITzhangqiang/KYUUBI_5365.
Closes#5365
dbb9d8b32 [ITzhangqiang] [KYUUBI #5365] Don't use Log4j2's extended throwable conversion pattern in default logging configurations
Authored-by: ITzhangqiang <itzhangqiang@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
- enable CI test on Scala-2.13 for all modules except Flink SQL engine
- For testing, choose available Spark engine home in `download` module by `SCALA_COMPILE_VERSION` of Kyuubi server
- Choose the Scala version of Spark engine main resource Jar in the following order:
  1. `SPARK_SCALA_VERSION` system env
  2. Extract Scala version from Spark home's `spark-core` jar filename
- Fixed 1 assertion error of kyuubi-spark-lineage module, as Spark on Scala 2.12 and 2.13 show different order of column lineage output in `MergeIntoTable` UT
```
SparkSQLLineageParserHelperSuite:
- columns lineage extract - MergeIntoTable *** FAILED ***
inputTables(List(v2_catalog.db.source_t))
outputTables(List(v2_catalog.db.target_t))
columnLineage(List(ColumnLineage(v2_catalog.db.target_t.name,Set(v2_catalog.db.source_t.name)), ColumnLineage(v2_catalog.db.target_t.price,Set(v2_catalog.db.source_t.price)), ColumnLineage(v2_catalog.db.target_t.id,Set(v2_catalog.db.source_t.id)))) did not equal inputTables(List(v2_catalog.db.source_t))
outputTables(List(v2_catalog.db.target_t))
columnLineage(List(ColumnLineage(v2_catalog.db.target_t.id,Set(v2_catalog.db.source_t.id)), ColumnLineage(v2_catalog.db.target_t.name,Set(v2_catalog.db.source_t.name)), ColumnLineage(v2_catalog.db.target_t.price,Set(v2_catalog.db.source_t.price)))) (SparkSQLLineageParserHelperSuite.scala:182)
```
- Fixed other tests relying on Scala scripting results
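The Scala version selection order listed above can be sketched as follows. This is a minimal illustration, not the code from this PR: the object name `SparkScalaVersion`, the `resolve` signature, and the assumed jar naming scheme `spark-core_<scala>-<spark>.jar` are all hypothetical.

```scala
object SparkScalaVersion {
  // Compiled once and reused. Assumes jar names such as spark-core_2.12-3.4.1.jar.
  private val SparkCoreJar = """^spark-core_(\d+\.\d+)-.*\.jar$""".r

  /**
   * Hypothetical helper: `env` stands in for the process environment and
   * `jarNames` for the file names found under the Spark home's jars directory.
   */
  def resolve(env: Map[String, String], jarNames: Seq[String]): Option[String] =
    env.get("SPARK_SCALA_VERSION") // 1. an explicit system env wins
      .orElse(jarNames.collectFirst { case SparkCoreJar(v) => v }) // 2. else read it from the jar filename
}
```

Passing the environment and the jar names in as plain values keeps the sketch easy to exercise without a real Spark installation.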
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
Closes#5196 from bowenliang123/scala213-test.
Closes#5196
97fafacd3 [liangbowen] prevent repeated compilation for regrex pattern
76b99d423 [Bowen Liang] test on scala-2.13
Lead-authored-by: Bowen Liang <liangbowen@gf.com.cn>
Co-authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
### _Why are the changes needed?_
Replace string literal with constant variable
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No
Closes#5339 from cxzl25/use_engine_init_timeout_key.
Closes#5339
bef2eaa4a [sychen] fix
Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
The name `recoveryMetadata` is no longer accurate now that the batch impl has been introduced. This PR proposes to rename `recoveryMetadata` to `metadata` and introduce a dedicated flag `fromRecovery` to indicate whether the metadata comes from recovery.
This PR also partially reverts #4798, by removing unnecessary constructor parameters `shouldRunAsync` and `batchConf`.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No.
Closes#5243 from pan3793/meta-recov.
Closes#5243
0718fbefe [Cheng Pan] nit
b8358464c [Cheng Pan] simplify
a2d6519c6 [Cheng Pan] fix test
2dad868bd [Cheng Pan] refactor
f83d2a602 [Cheng Pan] Distinguish batch impl v2 metadata from recovery
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Kyuubi fully supports Spark 3.4 now, it's time to move forward.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
Closes#5202 from pan3793/default-3.4.
Closes#5202
a0efccdbf [Cheng Pan] nit
30456dbb9 [Cheng Pan] nit
1cc83c871 [Cheng Pan] enable lineage test
d8ca7c7d8 [Cheng Pan] Switch to Spark 3.4 by default
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
As titled.
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
Closes#5134 from link3280/KYUUBI-4806.
Closes#4806
a1b74783c [Paul Lin] Optimize code style
546cfdf5b [Paul Lin] Update externals/kyuubi-flink-sql-engine/src/main/scala/org/apache/kyuubi/engine/flink/operation/FlinkOperation.scala
b6eb7af4f [Paul Lin] Update externals/kyuubi-flink-sql-engine/src/main/scala/org/apache/kyuubi/engine/flink/result/ResultSet.scala
1563fa98b [Paul Lin] Remove explicit StartRowOffset for Flink
4e61a348c [Paul Lin] Add comments
c93294650 [Paul Lin] Improve code style
6bd0c8e69 [Paul Lin] Use dedicated thread pool
15412db3a [Paul Lin] Improve logging
d6a2a9cff [Paul Lin] [KYUUBI #4806][FLINK] Implement incremental result fetching
Authored-by: Paul Lin <paullin3280@gmail.com>
Signed-off-by: Paul Lin <paullin3280@gmail.com>
### _Why are the changes needed?_
https://spark.apache.org/news/spark-3-3-3-released.html
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
Closes#5150 from pan3793/spark-3.3.3.
Closes#5150
61583609b [Cheng Pan] image
3021dd80b [Cheng Pan] remove staging repo
71b8aa843 [Cheng Pan] Revert "tgz"
d9125e63e [Cheng Pan] tgz
ebe3107c9 [Cheng Pan] Bump Spark 3.3.3
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
- Change hardcoded Scala's version 2.12 in Maven module's `artifactId` to placeholder `scala.binary.version` which is defined in project parent pom as 2.12
- Preparation for Scala 2.13/3.x support in the future
- No impact on using or building Maven modules
- Some ignorable warning messages for unstable artifactId will be thrown by Maven.
```
Warning: Some problems were encountered while building the effective model for org.apache.kyuubi:kyuubi-server_2.12:jar:1.8.0-SNAPSHOT
Warning: 'artifactId' contains an expression but should be a constant
```
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
### _Was this patch authored or co-authored using generative AI tooling?_
No.
Closes#5175 from bowenliang123/artifactId-scala.
Closes#5177
2eba29cfa [liangbowen] use placeholder of scala binary version for artifactId
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Close #4940
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request
Closes#5110 from lsm1/features/kyuubi_4940.
Closes#4940
6c0a9a37f [senmiaoliu] add kdf for hive engine
Authored-by: senmiaoliu <senmiaoliu@trip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Close #4843
Support submitting Kyuubi engines/batches to multiple Kubernetes contexts and namespaces.
In this PR, the user can configure the Kubernetes conf for a specified Kubernetes context and namespace as follows:
```
kyuubi.kubernetes.<context>.master.address
kyuubi.kubernetes.<context>.<namespace>.authenticate.oauthTokenFile
```
For example:
```
kyuubi.kubernetes.28.master.address=k8s://master
kyuubi.kubernetes.28.ns1.authenticate.oauthTokenFile=/var/run/secrets/kubernetes.io/token.ns1
kyuubi.kubernetes.28.ns2.authenticate.oauthTokenFile=/var/run/secrets/kubernetes.io/token.ns2
```
For k8s context=28, namespace=ns1, the resulting Kubernetes config is:
```
kyuubi.kubernetes.master.address=k8s://master
kyuubi.kubernetes.authenticate.oauthTokenFile=/var/run/secrets/kubernetes.io/token.ns1
```
For k8s context=28, namespace=ns2, the resulting Kubernetes config is:
```
kyuubi.kubernetes.master.address=k8s://master
kyuubi.kubernetes.authenticate.oauthTokenFile=/var/run/secrets/kubernetes.io/token.ns2
```
This way, the Kyuubi server can build a Kubernetes client for each context and namespace.
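One way to picture the resolution described above is a most-specific-first lookup. The sketch below is an illustration only, not the implementation in this PR: `KubernetesConfLookup` and `lookup` are hypothetical names, and the final fallback to the unprefixed `kyuubi.kubernetes.<key>` setting is an assumption.

```scala
object KubernetesConfLookup {
  private val Prefix = "kyuubi.kubernetes."

  /**
   * Hypothetical helper: resolves one setting (for example "master.address")
   * for a given context and namespace, trying the most specific key first.
   */
  def lookup(
      conf: Map[String, String],
      context: String,
      namespace: String,
      key: String): Option[String] = {
    val candidates = Seq(
      s"$Prefix$context.$namespace.$key", // context + namespace level
      s"$Prefix$context.$key", // context level
      s"$Prefix$key") // assumed global default
    candidates.collectFirst { case k if conf.contains(k) => conf(k) }
  }
}
```

With the example settings above, looking up `master.address` for context `28` and namespace `ns1` would yield `k8s://master`, and `authenticate.oauthTokenFile` would yield the `token.ns1` path.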
### _How was this patch tested?_
Existing kubernetes integration testing.
Closes#4984 from turboFei/k8s_client_yaml.
Closes#4843
f8ffaeeb9 [fwang12] nit
d25774288 [fwang12] comments
5ae7c8433 [fwang12] save into request conf
fd6c363db [fwang12] save
ff004a529 [fwang12] procebuilder method
6b9520bfd [fwang12] save
58850387e [fwang12] save
98df67e5f [fwang12] ut
da811697c [fwang12] fix
aa568aaa4 [fwang12] save
89656f463 [fwang12] check init
a0ef6894b [fwang12] code style
00abb6568 [fwang12] default namespace
295512987 [fwang12] k8s context namespace
Authored-by: fwang12 <fwang12@ebay.com>
Signed-off-by: fwang12 <fwang12@ebay.com>
### _Why are the changes needed?_
- To improve Scala code with corrections, simplifications, Scala style fixes, and redundancy cleanup. No feature changes introduced.
Corrections:
- Class doesn't correspond to file name (SparkListenerExtensionTest)
- Correct package name in ResultSetUtil and PySparkTests
Improvements:
- 'var' could be a 'val'
- GetOrElse(null) to orNull
Cleanup & Simplification:
- Redundant cast inspection
- Redundant collection conversion
- Simplify boolean expression
- Redundant new on case class
- Redundant return
- Unnecessary parentheses
- Unnecessary partial function
- Simplifiable empty check
- Anonymous function convertible to a method value
Scala Style:
- Constructing range for seq indices
- Get and getOrElse to getOrElse
- Convert expression to Single Abstract Method (SAM)
- Scala unnecessary semicolon inspection
- Map and getOrElse(false) to exists
- Map and flatten to flatMap
- Null initializer can be replaced by _
- scaladoc link to method
Other Improvements:
- Replace map and getOrElse(true) with forall
- Unit return type in the argument of map
- Size to length on arrays and strings
- Type check can be pattern matching
- Java mutator method accessed as parameterless
- Procedure syntax in method definition
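A few of the rewrites listed above, shown as before/after pairs. The values are made up for illustration and are not taken from the Kyuubi code base; each pair evaluates to the same result.

```scala
object ScalaIdioms {
  val user: Option[String] = Some("kyuubi")
  val lines: Seq[String] = Seq("a b", "c")

  // getOrElse(null) -> orNull
  val nameBefore: String = user.getOrElse(null)
  val nameAfter: String = user.orNull

  // map + getOrElse(false) -> exists
  val nonEmptyBefore: Boolean = user.map(_.nonEmpty).getOrElse(false)
  val nonEmptyAfter: Boolean = user.exists(_.nonEmpty)

  // map + getOrElse(true) -> forall
  val shortBefore: Boolean = user.map(_.length < 3).getOrElse(true)
  val shortAfter: Boolean = user.forall(_.length < 3)

  // map + flatten -> flatMap
  val tokensBefore: Seq[String] = lines.map(_.split(" ").toSeq).flatten
  val tokensAfter: Seq[String] = lines.flatMap(_.split(" ").toSeq)
}
```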
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4959 from bowenliang123/scala-Improve.
Closes#4959
2d36ff351 [liangbowen] code improvement for Scala
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
### _Why are the changes needed?_
Close #4681
Set the `CreateSparkTimeoutChecker` thread in `SparkSQLEngine` as a daemon.
Exit when Spark session initialization fails.
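The idea of a daemon timeout checker can be sketched as below. This is an assumption-laden illustration rather than the actual `SparkSQLEngine` code: `InitTimeoutChecker`, its `start` signature, and the use of a `CountDownLatch` to signal successful initialization are hypothetical; only the thread name comes from the description above.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

object InitTimeoutChecker {
  /**
   * Hypothetical helper: starts a daemon thread that invokes `onTimeout`
   * if `initialized` is not released within `timeoutMs` milliseconds.
   */
  def start(initialized: CountDownLatch, timeoutMs: Long)(onTimeout: () => Unit): Thread = {
    val checker = new Thread(
      new Runnable {
        override def run(): Unit = {
          // await returns false when the timeout elapses before the latch is released
          if (!initialized.await(timeoutMs, TimeUnit.MILLISECONDS)) onTimeout()
        }
      },
      "CreateSparkTimeoutChecker")
    checker.setDaemon(true) // a daemon thread does not keep the JVM alive on its own
    checker.start()
    checker
  }
}
```

In an engine, the callback might stop the process, while the initialization path would call `countDown()` on the latch once the Spark session is ready. Because the checker is a daemon, it cannot block JVM shutdown if initialization fails for another reason first.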
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4682 from zwangsheng/KYUUBI_4681.
Closes#4681
1928a67ec [zwangsheng] Add thread name
57f1914e4 [zwangsheng] Add thread name
71ff31a2b [zwangsheng] revert
4e8a619b2 [zwangsheng] DEBUG
ea23fae11 [zwangsheng] Change Init Timeout => 10M
3a89acc64 [zwangsheng] fix comments
565d1c90a [zwangsheng] [KYUUBI #4681][Engine] Set thread daemon
Authored-by: zwangsheng <2213335496@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>