### Why are the changes needed?
Currently , ranger check missing paimon insert table command, add test cases
#6958
### How was this patch tested?
1. Test INSERT INTO:
1.1 table1OnlyUserForNs could select table1, try to insert table1
1.2 someone has no any permission, try to insert table1
2. Test INSERT OVERWRITE:
2.1 table1OnlyUserForNs could select table1, try to insert table2
2.2 someone has no any permiession, try select table1 then insert table2
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6959 from davidyuan1223/test_insert.
Closes#6958
d1f41ba81 [davidyuan] Merge branch 'master' into test_insert
b56e701d4 [davidyuan] Test Insert Table
8306210ee [davidyuan] update
Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Test Authz Support paimon rename table name command privilege check
#6936
### How was this patch tested?
Test Authz Support paimon rename table name command privilege check
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6937 from davidyuan1223/check_authz_paimon_rename_table.
Closes#6936
797d1c489 [davidyuan] Merge branch 'master' into check_authz_paimon_rename_table
bc3c823a3 [davidyuan] Merge remote-tracking branch 'origin/master' into check_authz_paimon_rename_table
6205670d2 [davidyuan] add renameTable to command_spec.json
e4b241ef5 [davidyuan] Merge branch 'master' into check_authz_paimon_rename_table
5fec3bcb7 [davidyuan] test paimon rename table name command
30d09418c [davidyuan] test paimon rename table name command
Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
### Why are the changes needed?
Ranger check test case missing paimon adding column position command, add the test case
#6949
### How was this patch tested?
Test ranger check with paimon adding column position command
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6954 from davidyuan1223/test_adding_column_position.
Closes#6949
262ecaaca [davidyuan] Merge remote-tracking branch 'origin/master' into test_adding_column_position
154765fc3 [davidyuan] Merge branch 'master' into test_adding_column_position
4ebf985a9 [davidyuan] test adding column position
Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
### Why are the changes needed?
Currently range check missing check UnsetTableProperties command, we need add it to the range check.
#6940
### How was this patch tested?
Use paimon removing table properties to test this command
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6944 from davidyuan1223/test_remove_table_properties.
Closes#6940
4f24d7d6a [davidyuan] Merge branch 'master' into test_remove_table_properties
11d3773ed [davidyuan] test unset table properties command
Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
### Why are the changes needed?
Ranger check test case missing paimon changing column position command, add the test case
#6950
### How was this patch tested?
Test ranger check with paimon changing column position command
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6955 from davidyuan1223/test_changing_column_position.
Closes#6950
520b5377f [davidyuan] Merge branch 'master' into test_changing_column_position
1eed87346 [davidyuan] test changing column position
Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
### Why are the changes needed?
Test Spark 3.5.5 Release Notes
https://spark.apache.org/releases/spark-release-3-5-5.html
### How was this patch tested?
Pass GHA.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes#6939 from pan3793/spark-3.5.5.
Closes#6939
8c0288ae5 [Cheng Pan] ga
78b0e72db [Cheng Pan] nit
686a7b0a9 [Cheng Pan] fix
d40cc5bba [Cheng Pan] Bump Spark 3.5.5
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Currently, ranger check for paimon missing rename column name command, add the test case
#6942
### How was this patch tested?
Test Paimon Rename column name with ranger
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6946 from davidyuan1223/test_rename_column_name.
Closes#6942
8e49eb0ab [davidyuan] test rename column name
Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
### Why are the changes needed?
AUTHZ Test Create Partitioned Table for PAIMON, check that has support the command
#6923
### How was this patch tested?
est Authz for paimon with create partitioned table command. Check the permission
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6931 from davidyuan1223/support_create_with_parition_for_paimon.
Closes#6923
61f7560d3 [Cheng Pan] Merge branch 'master' into support_create_with_parition_for_paimon
ffb79376f [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
b0829795a [Bowen Liang] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
4b160d720 [davidyuan] support create partition table as for paimon
Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
AUTHZ Test Add/Change Table properties for PAIMON, check that has support the command
https://github.com/apache/kyuubi/issues/6932
### How was this patch tested?
Test Add/Change properties SQL
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6933 from davidyuan1223/test_alter_tableproperties_for_paimin.
Closes#6932
4d64fbf23 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
c861a778b [davidyuan] support add/change table properties for paimon
Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
AUTHZ Test CTAS for Paimon to check it support this command, the related issue is https://github.com/apache/kyuubi/issues/6921
### How was this patch tested?
Test Authz for paimon with create table as command. Check the permission.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6922 from davidyuan1223/support_create_table_as_for_paimon_check.
Closes#6921
7bfd6ad49 [david yuan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
a9ce20cc4 [davidyuan] support create table as for paimon
Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: david yuan <davidyuan1223@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Paimon does not seem to support Scala 2.13
### How was this patch tested?
Pass GHA.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes#6925 from pan3793/authz-paimon-scala212.
Closes#6925
865a7dd72 [Cheng Pan] fix
971d23273 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
499f10ab0 [Cheng Pan] Only run Paimon authz tests with Scala 2.12
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# Why are the changes needed?
## Issue reference:
https://github.com/apache/kyuubi/issues/6912
## How to reproduce the issue?
The changes in this PR will avoid a wrong result when generating the instance of org.apache.kyuubi.plugin.lineage.Lineage, in the certain case as follows:
step 1: create a temporary view from a file
step 2: insert into a table by selecting from the temporary view in step 1
step 3: generate the lineage when executing the insert statement in step 2
In detail, please see the UT code submission in this patch.
## The issue analysis
Let's see the current code when getting the Lineage object by resolving a LogicalPlan object:
<img width="694" alt="image" src="https://github.com/user-attachments/assets/65256a0d-320d-4271-968f-59eafb74de9f" />
According to the above logic, a None org.apache.kyuubi.plugin.lineage.Lineage object will be generated due to "try-catch" self-protection, in this certain case. This None object will lead to problems in the following 2 scenes:
### Unit Test Environment
In Unit Test, when the code runs here a "None.get" exception will be raised:
<img width="682" alt="image" src="https://github.com/user-attachments/assets/102dc9bd-294f-4b1e-b1c6-01b6fee50fed" />
Here's the runtime exception stack:
```
None.get
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:529)
at scala.None$.get(Option.scala:527)
at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.extractLineageWithoutExecuting(SparkSQLLineageParserHelperSuite.scala:1485)
at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.$anonfun$new$83(SparkSQLLineageParserHelperSuite.scala:1465)
```
### Production Environment
This Lineage object cannot be used in the production environment because it has a None value which lacks some necessary lineage information. The right content of the Lineage instance in the above case should be:
```
inputTables(List())
outputTables(List(spark_catalog.test_db.test_table_from_dir))
columnLineage(List(ColumnLineage(spark_catalog.test_db.test_table_from_dir.a0,Set()), ColumnLineage(spark_catalog.test_db.test_table_from_dir.b0,Set())))
```
a newly added test case(test directory to table) passed after this issue is fixed.
# How to fix the issue?
Add a "Empty judgment" logic. In detail, please see the code submission in this patch.
# How was this patch tested?
1. by adding a new test case in UT code and make sure it passes
2. by submitting a Spark application including the SQL of this case in the production environment, and make sure a right Lineage instance is generated, instead of a None object
# Was this patch authored or co-authored using generative AI tooling?
No
Closes#6911 from xglv1985/fix_spark_lineage_runtime_exception.
Closes#6912
13a71075d [Cheng Pan] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParserHelperSuite.scala
4e89b95cd [Cheng Pan] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParserHelperSuite.scala
59b350bfb [xglv1985] fix a runtime exception when generate column lineage tuple--more readable code
52bc0288d [xglv1985] fix a runtime exception when generate column lineage tuple--spotless sytle
fea6bbc0d [xglv1985] fix a runtime exception when generate column lineage tuple--remove tab from UT code
901879095 [xglv1985] fix a runtime exception when generate column lineage tuple--unit test
fbb4df879 [xglv1985] fix a runtime exception when generate column lineage tuple
Lead-authored-by: xglv1985 <xglv1985@gmail.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Added a service definition for spark which in turn enables the creation of a default policy for the spark service.
Default policy will block access until another policy is downloaded from Apache Ranger.
### How was this patch tested?
Tested manually.
Configure Kyuubi Authz plugin. Do not start Apache Ranger, it must not be reachable.
Make sure that policy cache is empty.
Start Kyuubi engine and try to query any tables. The default policy should not allow any access.
Previously the access was not restricted because there wasn't a default policy defined.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes#6902 from developster/master.
Closes#6901
feb6ebf61 [Octavian Ciubotaru] Default policy for spark
Authored-by: Octavian Ciubotaru <ociubotaru@developmentgateway.org>
Signed-off-by: Kent Yao <yao@apache.org>
### Why are the changes needed?
Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.3, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.3
### How was this patch tested?
Add some UTs
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6862 from zhaohehuhu/dev-1225.
Closes#6862
c745eda14 [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.3
Authored-by: zhaohehuhu <luoyedeyi459@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed?
Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.4, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.4
### How was this patch tested?
Add some UTs
### Was this patch authored or co-authored using generative AI tooling?
No
Closes#6857 from zhaohehuhu/dev-1224.
Closes#6857
c72c62984 [zhaohehuhu] remove the import
dfbf2bc2d [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.4
Authored-by: zhaohehuhu <luoyedeyi459@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
```
export JAVA_HOME=/path/of/openjdk-17
build/mvn clean install -DskipTests -Dmaven.scaladoc.skip=false
```
```
[INFO] --- scala-maven-plugin:4.9.2:doc-jar (attach-scaladocs) kyuubi-server-plugin ---
[INFO] compiler plugin: BasicArtifact(com.github.ghik,silencer-plugin_2.12.20,1.7.19,null)
error: fatal error: object scala in compiler mirror not found.
```
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Successfully run the build command
```
export JAVA_HOME=/path/of/openjdk-17
build/mvn clean install -DskipTests -Dmaven.scaladoc.skip=false
```
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6820 from pan3793/scaladoc.
Closes#6820
f5cee3429 [Cheng Pan] Explicitly disable attach-scaladocs for pure Java modules
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #
## Describe Your Solution 🔧
Preparing v1.11.0-SNAPSHOT after branch-1.10 cut
```shell
build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT"
(cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT")
```
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6769 from bowenliang123/bump-1.11.
Closes#6769
6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name
465276204 [Bowen Liang] update package.json
81f2865e5 [Bowen Liang] bump
Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6754
## Describe Your Solution 🔧
Right now in RuleAuthorization we use an ArrayBuffer to collect access requests, which is very slow because each new PrivilegeObject needs to be compared with all access requests.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
Add benchmark
Before
```sh
Java HotSpot(TM) 64-Bit Server VM 17.0.12+8-LTS-286 on Mac OS X 14.6
Apple M3
Collecting files ranger access request: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
50000 files benchmark 181863 189434 NaN -0.0 -181863368958.0 1.0X
````
#### Behavior With This Pull Request 🎉
After
```sh
Java HotSpot(TM) 64-Bit Server VM 17.0.12+8-LTS-286 on Mac OS X 14.6
Apple M3
Collecting files ranger access request: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
50000 files benchmark 1281 1310 33 -0.0 -1280563000.0 1.0X
```
#### Related Unit Tests
Exists UT
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6758 from wankunde/ranger2.
Closes#6754
9d7d1964b [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
88b9c049b [wankun] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/spark/sql/RuleAuthorizationBenchmark.scala
20c55fbeb [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml
f5a3b6ca5 [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
9793249de [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
d86b01f9c [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
b904b491b [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
aad08a6bb [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
1374604bc [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
01e15c149 [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml
805e8a9c0 [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml
e19817943 [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
Lead-authored-by: wankunde <wankunde@163.com>
Co-authored-by: wankun <wankun@apache.org>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
# 🔍 Description
## Issue References 🔗
Fix a ClassNotFound issue.
```
java.lang.NoClassDefFoundError: org/apache/kyuubi/shade/javax/ws/rs/core/Cookie
```
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Verified manually.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6723 from pan3793/6638-followup.
Closes#6638
56e9842e0 [Cheng Pan] [KYUUBI #6638] authz shaded should include jsr311-api
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6666
## Describe Your Solution 🔧
Bump ranger version to 2.5.0
Release notes: https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+2.5.0+-+Release+Notes
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6692 from Madhukar525722/ranger_upgrade.
Closes#6666
88e1e12c5 [madlnu] [KYUUBI #6666] Upgrade spark ranger plugin to 2.5.0
Authored-by: madlnu <madlnu@visa.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
Spark 4.0.0-preview2 RC1 passed the vote
https://lists.apache.org/thread/4ctj2mlgs4q2yb4hdw2jy4z34p5yw2b1
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6699 from pan3793/spark-4.0.0-preview2.
Closes#6699
2db1f645d [Cheng Pan] 4.0.0-preview2
42055bb1e [Cheng Pan] fix
d29c0ef83 [Cheng Pan] disable delta test
98d323b95 [Cheng Pan] fix
2e782c00b [Cheng Pan] log4j-slf4j2-impl
fde4bb6ba [Cheng Pan] spark-4.0.0-preview2
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #
## Describe Your Solution 🔧
- Apache Commons Lang2 is no longer actively maintained and not used by Kyuubi modules
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6672 from bowenliang123/remove-commonlang2.
Closes#6672
34cda170a [liangbowen] remove common lang2
Lead-authored-by: Bowen Liang <liangbowen@gf.com.cn>
Co-authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
# 🔍 Description
## Issue References 🔗
This pull request fixes#5402
## Describe Your Solution 🔧
When facing out-of-control memory management in Spark engine, we typically use JVMkill as a remedy by killing the process and generating a heap dump for post-analysis. However, even with jvmkill protection, we may still encounter issues caused by JVM running out of memory, such as repeated execution of Full GC without performing any useful work during the pause time. Since the JVM does not exhaust 100% of resources, JVMkill will not be triggered.
So introducing JVMQuake provides more granular monitoring of GC behavior, enabling early detection of memory management issues and facilitating fast failure.
You can use the following configuration to enable jvmQuake plugins:
```
spark.plugins=org.apache.spark.kyuubi.jvm.quake.KyuubiJVMQuakePlugin
```
| configuration | default | comment |
| ---- | ---- | ---- |
| spark.driver.jvmQuake.enabled | false | when true, enable driver jvmQuake |
| spark.executor.jvmQuake.enabled | false | when true, enable executor jvmQuake |
| spark.driver.jvmQuake.heapDump.enabled | false | when true, enable jvm heap dump when jvmQuake rearch the threshold |
| spark.executor.jvmQuake.heapDump.enabled | false | when true, enable jvm heap dump when jvmQuake rearch the threshold |
| spark.jvmQuake.dumpThreshold | 100 | The number of seconds to dump memory |
| spark.jvmQuake.killThreshold | 200 | The number of seconds to kill process |
| spark.jvmQuake.exitCode | 502 | The exit code of kill process |
| spark.jvmQuake.heapDumpPath | /tmp/kyuubi_jvm_quake/apps | The path of heap dump |
| spark.jvmQuake.checkInterval | 3 | The number of seconds to check jvmQuake |
| spark.jvmQuake.runTimeWeight | 1.0 | The weight of rum time |
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6572 from yoock/features/kyuubi-jvm-quake.
Closes#5402
84361ce8f [王龙] add jvm quake
Authored-by: 王龙 <wanglong16@xiaomi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
SPARK-46257 (Spark 4.0.0) moves to Derby 10.16, `org.apache.derby.jdbc.AutoloadedDriver` has been moved to `org.apache.derby.iapi.jdbc.AutoloadedDriver`
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Manually tested with Spark 4.0.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6640 from pan3793/authz-derby.
Closes#6640
46edb32be [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/util/AuthZUtils.scala
7eee47f0d [Cheng Pan] Adapt Derby 10.16 new JDBC driver package name
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
I faced the following error when trying to run authz with Spark 4.0
```
Cause: java.lang.NoClassDefFoundError: javax/ws/rs/core/Cookie
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:375)
at org.apache.ranger.plugin.policyengine.RangerPluginContext.createAdminClient(RangerPluginContext.java:96)
at org.apache.ranger.plugin.util.PolicyRefresher.<init>(PolicyRefresher.java:90)
at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:251)
at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.initialize(SparkRangerAdminPlugin.scala:68)
```
The `javax.ws.rs:jsr311-api` is the transitive dep of `jersey-client`, we should shade and relocate it correctly.
Why does it work with Spark 3? Spark 3 provides `jakarta.ws.rs:jakarta.ws.rs-api:2.1.6` which provides `java.ws.rs.*` classes, but Spark 4 upgrades to `jakarta.ws.rs:jakarta.ws.rs-api:3.0.0` which changed package name to`jakarta.ws.rs.*`.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA and manually tested with Spark 4
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6638 from pan3793/jsr311.
Closes#6638
5699200cf [Cheng Pan] Shade jsr311-api in Authz
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6581
## Describe Your Solution 🔧
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
I modified `KyuubiSparkSQLAstBuilder#visitMultipartIdentifier` and implemented `KyuubiSparkSQLAstBuilder#visitQuotedIdentifier` to process the quoted identifiers.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
```
extensions/spark/kyuubi-extension-spark-3-3/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala
test("optimize sort by backquoted column name")
```
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6582 from XorSum/features/zorder-backquote.
Closes#6582
16ffa1238 [xorsum] zorder by support quote
Authored-by: xorsum <xorsum@outlook.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6564
## Describe Your Solution 🔧
Remove the `columnDesc` for `InsertIntoHadoopFsRelationCommand ` and `InsertIntoHiveTable ` in `table_command_spec.json`
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
Insert into table will check the privilege of columns.
#### Behavior With This Pull Request 🎉
Insert into table will check the privilege of table.
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6570 from liujiayi771/insert-permission.
Closes#6564
d956aa916 [joey.ljy] Fix ut
d282f8ec5 [joey.ljy] insert into table check the privilege of table
Authored-by: joey.ljy <joey.ljy@alibaba-inc.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6541
## Describe Your Solution 🔧
Fix an issue where DataSourceV2RelationTableExtractor#table could not fetch the ‘database’ attribute causing the Ranger checks to fail when using the Paimon Catalog.
If the database attribute is not resolved, use DataSourceV2RelationTableExtractor#identifier to complete it.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6544 from promising-forever/issues/6541.
Closes#6541
6549f8528 [caoyu] Fix test failure, paimon-spark run on Scala 2.12.
c1a09214a [caoyu] Optimising the 'database' capture logic
69fb0bc7e [caoyu] PolicyJsonFileGenerator#genPolicies add paimonNamespace
c89c70bad [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan.
77f121b0d [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan.
9cfb5847b [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan.
Authored-by: caoyu <caoy.5@jifenn.com>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6554
## Describe Your Solution 🔧
- Delete `/kyuubi/extensions/spark/kyuubi-extension-spark-3-x/src/main/scala/org/apache/kyuubi/sql/zorder/InsertZorderBeforeWritingBase.scala` file
- Rename `InsertZorderBeforeWriting33.scala` to `InsertZorderBeforeWriting.scala`
- Rename `InsertZorderHelper33, InsertZorderBeforeWritingDatasource33, InsertZorderBeforeWritingHive33, ZorderSuiteSpark33` to `InsertZorderHelper, InsertZorderBeforeWritingDatasource, InsertZorderBeforeWritingHive, ZorderSuiteSpark`
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6555 from huangxiaopingRD/6554.
Closes#6554
26de4fa09 [huangxiaoping] [KYUUBI #6554] Delete redundant code related to zorder
Authored-by: huangxiaoping <1754789345@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6551
## Describe Your Solution 🔧
Update `canInsertZorder` to allow insert zorder when global sort is `false` and the plan is `Repartition` or `RepartitionByExpression`.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
/kyuubi-extension-spark-common/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6552 from huangxiaopingRD/6551.
Closes#6551
b597443c3 [huangxiaoping] Fix code style
618594667 [huangxiaoping] [KYUUBI #6551] Allow insert zorder when when the plan is Repartition or RepartitionByExpression
Authored-by: huangxiaoping <1754789345@qq.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
# 🔍 Description
This pull request aims to remove building support for Spark 3.2, while still keeping the engine support for Spark 3.2.
Mailing list discussion: https://lists.apache.org/thread/l74n5zl1w7s0bmr5ovxmxq58yqy8hqzc
- Remove Maven profile `spark-3.2`, and references on docs, release scripts, etc.
- Keep the cross-version verification to ensure that the Spark SQL engine built on the default Spark version (3.5) still works well on Spark 3.2 runtime.
- Merge `kyuubi-extension-spark-common` into `kyuubi-extension-spark-3-3`
- Remove `log4j.properties` as Spark moves to Log4j2 since 3.3 (SPARK-37814)
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6545 from pan3793/deprecate-spark-3.2.
Closes#6545
54c172528 [Cheng Pan] fix
f4602e805 [Cheng Pan] Deprecate and remove building support for Spark 3.2
2e083f89f [Cheng Pan] fix style
458a92c53 [Cheng Pan] nit
929e1df36 [Cheng Pan] Deprecate and remove building support for Spark 3.2
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
This PR makes KSHC support Spark 4.0, and also makes sure that the KSHC jar compiled against Spark 3.5 is binary compatible with Spark 4.0.
We are ready to enable CI for Spark 4.0, except for authZ module.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6453 from pan3793/spark4-ci.
Closes#6453
695e3d7f7 [Cheng Pan] Update pom.xml
2eaa0f88a [Cheng Pan] Update .github/workflows/master.yml
b1f540a34 [Cheng Pan] cross test
562839982 [Cheng Pan] fix
9f0c2e1be [Cheng Pan] fix
45f182462 [Cheng Pan] kshc
227ef5bae [Cheng Pan] fix
690a3b8b2 [Cheng Pan] Revert "fix"
87fe7678b [Cheng Pan] fix
60f55dbed [Cheng Pan] CI for Spark 4.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
Improve data lake dependency management by extracting the following Maven properties:
- `delta.artifact`
- `hudi.artifact`
- `iceberg.artifact`
- `paimon.artifact`
It often takes a while for the downstream data lakes to support the new Spark versions, extracting those properties makes it easy to override in the new profile on the Kyuubi project's `pom.xml` to workaround before data lakes jars are available.
One use case is a19bb7c18e
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6427 from pan3793/datalake-dep.
Closes#6427
74a9300e0 [Cheng Pan] Improve datalake dependency management
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes#6447
## Describe Your Solution 🔧
Use static regex Pattern instances in JavaUtils.timeStringAs and JavaUtils.byteStringAs
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6448 from lsm1/branch-kyuubi-6447.
Closes#6447
467066ce5 [senmiaoliu] Use static regex Pattern instances in JavaUtils
Authored-by: senmiaoliu <senmiaoliu@trip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
Kyuubi uses the Hudi Spark bundle jar in authZ module for testing, Hudi 0.15 brings Spark 3.5 and Scala 2.13 support, it also removes hacky for profile `spark-3.5`.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6451 from pan3793/hudi-0.15.
Closes#6451
98d6e97c5 [Cheng Pan] fix
2d31307da [Cheng Pan] remove spark-authz-hudi-test
8896f8c3f [Cheng Pan] Enable hudi test
7e9a7c7ae [Cheng Pan] Bump Hudi 0.15.0
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
The `kyuubi-util-scala_2.12-<version>-tests.jar` accidentally leaked to the compile scope but should be in the test scope.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Run `build/dist` and check `dist/jars`
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6439 from pan3793/util-scala-test.
Closes#6439
0576248f5 [Cheng Pan] fix
2bf2408f5 [Cheng Pan] fix
f7151dfc6 [Cheng Pan] kyuubi-util-scala test jar leaked to compile scope
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request adds cross-version tests for Kyuubi Spark TPC-DS Connector and TPC-H Connector.
## Describe Your Solution 🔧
Add TPC-DS Connector and TPC-H Connector into GitHub Actions job `spark-connector-cross-version-test`.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6441 from zhouyifan279/tcp-ds/h-cross-version.
Closes#6441
c2abc468a [zhouyifan279] Kyuubi Spark TPC-DS/H Connector cross version test
Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request closes#6247
This also closes#6431
## Describe Your Solution 🔧
Add a job `spark-connector-cross-version-test` in GitHub Actions to:
1. Build KSHC package with maven opt `-Pspark-3.5`
2. Run KSHC tests with maven opt `-Pspark-3.3` and `-Pspark-3.4` and KSHC package built in step 1
3. Fix the binary-compatible issue via reflection.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6436 from zhouyifan279/kshc-cross-version-test.
Closes#6247
d3ac2ef47 [zhouyifan279] Tune the KSHC code to fix binary-compatible issues
4e14edcb5 [zhouyifan279] Fix invalid unit-tests-log name
56ca45d18 [zhouyifan279] Fix invalid unit-tests-log name
4c5ab7b9e [zhouyifan279] Update test log name
8a84e8812 [zhouyifan279] Add matrix scala
17cb67155 [zhouyifan279] [KYUUBI #6247] KSHC cross-version test
Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
```
build/mvn clean test -Pscala-2.13 -Pspark-master -pl :kyuubi-spark-lineage_2.13
```
```
- test group by *** FAILED ***
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.BINARY_OP_WRONG_TYPE] Cannot resolve "(b + c)" due to data type mismatch: the binary operator requires the input type ("NUMERIC" or "INTERVAL DAY TO SECOND" or "INTERVAL YEAR TO MONTH" or "INTERVAL"), not "STRING". SQLSTATE: 42K09; line 1 pos 59;
'InsertIntoStatement RelationV2[a#546, b#547, c#548] v2_catalog.db.t1 v2_catalog.db.t1, false, false, false
+- 'Aggregate [a#543], [a#543, unresolvedalias('count(distinct (b#544 + c#545))), (count(distinct b#544) * count(distinct c#545)) AS (count(DISTINCT b) * count(DISTINCT c))#551L]
+- SubqueryAlias v2_catalog.db.t2
+- RelationV2[a#543, b#544, c#545] v2_catalog.db.t2 v2_catalog.db.t2
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.dataTypeMismatch(package.scala:73)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7(CheckAnalysis.scala:315)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7$adapted(CheckAnalysis.scala:302)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
at scala.collection.immutable.Vector.foreach(Vector.scala:1856)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:243)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
...
```
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass UT.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6398 from pan3793/lineage-fix.
Closes#6398
afce6b880 [Cheng Pan] Fix lineage plugin UT for Spark 4.0
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
This PR makes `javax.servlet` and `jakarta.servlet` co-exist, by introducing `javax.servlet-api-4.0.1` and upgrade `jakarta.servlet-api` to 5.0.0. (6.0.0 requires JDK 11)
Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118 while Kyuubi still uses `javax.servlet` in other modules, we should allow them to co-exist for a while.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GHA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6392 from pan3793/servlet.
Closes#6392
27d412599 [Cheng Pan] fix
9f1e72272 [Cheng Pan] other spark modules
f4545dc76 [Cheng Pan] fix
313826fa7 [Cheng Pan] exclude
7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
Now, MaxScanStrategy can be adopted to limit max scan file size in some datasources, such as Hive. Hopefully we can enhance MaxScanStrategy to include support for the datasourcev2.
## Describe Your Solution 🔧
get the statistics about files scanned through datasourcev2 API
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested
**Be nice. Be informative.**
Closes#5852 from zhaohehuhu/dev-1213.
Closes#6315
3c5b0c276 [hezhao2] reformat
fb113d625 [hezhao2] disable the rule that checks the maxPartitions for dsv2
acc358732 [hezhao2] disable the rule that checks the maxPartitions for dsv2
c8399a021 [hezhao2] fix header
70c845bee [hezhao2] add UTs
3a0739686 [hezhao2] add ut
4d26ce131 [hezhao2] reformat
f87cb072c [hezhao2] reformat
b307022b8 [hezhao2] move code to Spark 3.5
73258c2ae [hezhao2] fix unused import
cf893a0e1 [hezhao2] drop reflection for loading iceberg class
dc128bc8e [hezhao2] refactor code
661834cce [hezhao2] revert code
6061f42ab [hezhao2] delete IcebergSparkPlanHelper
5f1c3c082 [hezhao2] fix
b15652f05 [hezhao2] remove iceberg dependency
fe620ca92 [hezhao2] enable MaxScanStrategy when accessing iceberg datasource
Authored-by: hezhao2 <hezhao2@cisco.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
This pull request fixes#6212
When Kyuubi cleans up Ranger related threads like PolicyRefresher, it should also shutdown the audit threads that include SolrZkClient. Otherwise Spark Driver keeps on running since SolrZkClient is a non-daemon thread. Added the cleanup as part of the shutdown hook that Kyuubi registers.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6233 from amanraj2520/auditShutdown.
Closes#6212
e663d466c [amanraj2520] Refactored code
ed293a9a4 [amanraj2520] Removed unused import
95a6814ad [amanraj2520] Added audit handler shutdown to the shutdown hook
Authored-by: amanraj2520 <rajaman@microsoft.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
This pull request removes unused dependency management in POM
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Pass GA.
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6267 from pan3793/clean-pom.
Closes#6267
d19f719bf [Cheng Pan] Remove usued dependency management in POM
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
This pull request
- improves comments for SPARK-33832
- removes unused `spark.sql.analyzer.classification.enabled` (I didn't update the migration rules because this configuration seems never to work properly)
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
Review
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6260 from pan3793/nit.
Closes#6260
d762d30e9 [Cheng Pan] update comment
4ebaa04ea [Cheng Pan] nit
b303f05bb [Cheng Pan] remove spark.sql.analyzer.classification.enabled
b021cbc0a [Cheng Pan] Improve docs for SPARK-33832
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
# 🔍 Description
## Issue References 🔗
This pull request fixes #
## Describe Your Solution 🔧
Improve DropIgnoreNonexistent rule for spark 3.5
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
DropIgnoreNonexistentSuite
---
# Checklist 📝
- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6215 from wForget/hotfix2.
Closes#6215
cb1d34de1 [wforget] Improve DropIgnoreNonexistent rule for spark 3.5
Authored-by: wforget <643348094@qq.com>
Signed-off-by: wforget <643348094@qq.com>
# 🔍 Description
## Issue References 🔗
This pull request fixes #
## Describe Your Solution 🔧
We should check `spark.memory.offHeap.enabled` when applying for `executorOffHeapMemory`.
## Types of changes 🔖
- [X] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6211 from wForget/hotfix.
Closes#6211
1c7c8cd75 [wforget] Check memory offHeap enabled for CustomResourceProfileExec
Authored-by: wforget <643348094@qq.com>
Signed-off-by: wforget <643348094@qq.com>
# 🔍 Description
## Issue References 🔗
The POM of `kyuubi-spark-authz-shaded` is redundant, just pull `kyuubi-spark-authz` is necessary.
The current dependency management does not work on Ranger 2.1.0, this patch cleans up the POM definition and fixes the compatibility with Ranger 2.1.0
## Describe Your Solution 🔧
Carefully revise the dependency list and exclusion.
## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
perform packing kyuubi-spark-authz-shaded module.
```
build/mvn clean install -pl extensions/spark/kyuubi-spark-authz-shaded -am -DskipTests
```
before
```
[INFO] --- maven-shade-plugin:3.5.2:shade (default) kyuubi-spark-authz-shaded_2.12 ---
[INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar.
[INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar.
```
after
```
[INFO] --- maven-shade-plugin:3.5.2:shade (default) kyuubi-spark-authz-shaded_2.12 ---
[INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar.
[INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugin-classloader:jar:2.4.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar.
```
---
# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes#6197 from pan3793/authz-dep.
Closes#6197
d0becabce [Cheng Pan] 2.4
47e38502a [Cheng Pan] ranger 2.4
af01f7ed5 [Cheng Pan] test ranger 2.1
203aff3b3 [Cheng Pan] ranger-plugins-cred
974d76b03 [Cheng Pan] Resive dependency management of authz
e5154f30f [Cheng Pan] improve authz deps
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>