Commit Graph

434 Commits

Author SHA1 Message Date
davidyuan
1b3de28b2c
[KYUUBI #6958] Test INSERT TABLE
### Why are the changes needed?

Currently , ranger check missing paimon insert table command, add test cases
#6958

### How was this patch tested?

1. Test INSERT INTO:
 1.1 table1OnlyUserForNs could select table1, try to insert table1
 1.2 someone has no any permission, try to insert table1
2. Test INSERT OVERWRITE:
 2.1 table1OnlyUserForNs could select table1, try to insert table2
 2.2 someone has no any permiession, try select table1 then insert table2

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6959 from davidyuan1223/test_insert.

Closes #6958

d1f41ba81 [davidyuan] Merge branch 'master' into test_insert
b56e701d4 [davidyuan] Test Insert Table
8306210ee [davidyuan] update

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-06 22:35:48 +08:00
davidyuan
61b69771be
[KYUUBI #6936] Test RenameTable command
### Why are the changes needed?

Test Authz Support paimon rename table name command privilege check
#6936

### How was this patch tested?

Test Authz Support paimon rename table name command privilege check

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6937 from davidyuan1223/check_authz_paimon_rename_table.

Closes #6936

797d1c489 [davidyuan] Merge branch 'master' into check_authz_paimon_rename_table
bc3c823a3 [davidyuan] Merge remote-tracking branch 'origin/master' into check_authz_paimon_rename_table
6205670d2 [davidyuan] add renameTable to command_spec.json
e4b241ef5 [davidyuan] Merge branch 'master' into check_authz_paimon_rename_table
5fec3bcb7 [davidyuan] test paimon rename table name command
30d09418c [davidyuan] test paimon rename table name command

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-03-05 14:32:41 +08:00
davidyuan
37eaf75ae3
[KYUUBI #6949] Test adding column position
### Why are the changes needed?

Ranger check test case missing paimon adding column position command, add the test case
#6949

### How was this patch tested?

Test ranger check with paimon adding column position command

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6954 from davidyuan1223/test_adding_column_position.

Closes #6949

262ecaaca [davidyuan] Merge remote-tracking branch 'origin/master' into test_adding_column_position
154765fc3 [davidyuan] Merge branch 'master' into test_adding_column_position
4ebf985a9 [davidyuan] test adding column position

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-03-05 14:17:51 +08:00
davidyuan
851178ce9a
[KYUUBI #6940] Test Unset Table Properties Command
### Why are the changes needed?

Currently range check missing check UnsetTableProperties command, we need add it to the range check.
#6940

### How was this patch tested?

Use paimon removing table properties to test this command

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6944 from davidyuan1223/test_remove_table_properties.

Closes #6940

4f24d7d6a [davidyuan] Merge branch 'master' into test_remove_table_properties
11d3773ed [davidyuan] test unset table properties command

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-03-05 13:37:39 +08:00
davidyuan
4cab817913
[KYUUBI #6950] Test changing column position
### Why are the changes needed?

Ranger check test case missing paimon changing column position command, add the test case
#6950

### How was this patch tested?

Test ranger check with paimon changing column position command

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6955 from davidyuan1223/test_changing_column_position.

Closes #6950

520b5377f [davidyuan] Merge branch 'master' into test_changing_column_position
1eed87346 [davidyuan] test changing column position

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-03-04 16:52:25 +08:00
Cheng Pan
d5b01fa3e2
[KYUUBI #6939] Bump Spark 3.5.5
### Why are the changes needed?

Test Spark 3.5.5 Release Notes

https://spark.apache.org/releases/spark-release-3-5-5.html

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6939 from pan3793/spark-3.5.5.

Closes #6939

8c0288ae5 [Cheng Pan] ga
78b0e72db [Cheng Pan] nit
686a7b0a9 [Cheng Pan] fix
d40cc5bba [Cheng Pan] Bump Spark 3.5.5

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-03 13:42:09 +08:00
davidyuan
bfcf2e708f
[KYUUBI #6942] Test Rename Column Name for paimon
### Why are the changes needed?

Currently, ranger check for paimon missing rename column name command, add the test case
#6942

### How was this patch tested?

Test Paimon Rename column name with ranger

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6946 from davidyuan1223/test_rename_column_name.

Closes #6942

8e49eb0ab [davidyuan] test rename column name

Authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-03-03 09:56:42 +08:00
davidyuan
525aec04a1
[KYUUBI #6923] Test Create Partitioned Table for Paimon
### Why are the changes needed?

AUTHZ Test Create Partitioned Table for PAIMON, check that has support the command
#6923

### How was this patch tested?

est Authz for paimon with create partitioned table command. Check the permission

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6931 from davidyuan1223/support_create_with_parition_for_paimon.

Closes #6923

61f7560d3 [Cheng Pan] Merge branch 'master' into support_create_with_parition_for_paimon
ffb79376f [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
b0829795a [Bowen Liang] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
4b160d720 [davidyuan] support create partition table as for paimon

Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-02-24 14:43:40 +08:00
davidyuan
ff3da59f63
[KYUUBI #6932] Test ALTER TBLPROPERTIES for Paimon
### Why are the changes needed?

AUTHZ Test Add/Change Table properties for PAIMON, check that has support the command
https://github.com/apache/kyuubi/issues/6932

### How was this patch tested?

Test Add/Change properties SQL

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6933 from davidyuan1223/test_alter_tableproperties_for_paimin.

Closes #6932

4d64fbf23 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
c861a778b [davidyuan] support add/change table properties for paimon

Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-02-24 14:31:49 +08:00
davidyuan
ed96ac167d
[KYUUBI #6921][AUTHZ] Test CTAS for Paimon
### Why are the changes needed?

AUTHZ Test CTAS for Paimon to check it support this command, the related issue is https://github.com/apache/kyuubi/issues/6921

### How was this patch tested?

Test Authz for paimon with create table as command. Check the permission.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6922 from davidyuan1223/support_create_table_as_for_paimon_check.

Closes #6921

7bfd6ad49 [david yuan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
a9ce20cc4 [davidyuan] support create table as for paimon

Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: david yuan <davidyuan1223@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-02-19 14:21:06 +08:00
Cheng Pan
93ac1ee269
[KYUUBI #6925] Only run Paimon authz tests with Scala 2.12
### Why are the changes needed?

Paimon does not seem to support Scala 2.13

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6925 from pan3793/authz-paimon-scala212.

Closes #6925

865a7dd72 [Cheng Pan] fix
971d23273 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala
499f10ab0 [Cheng Pan] Only run Paimon authz tests with Scala 2.12

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-02-19 14:19:22 +08:00
xglv1985
7c110b68f8
[KYUUBI #6912][LINEAGE] Properly handle empty attribute set on mergeRelationColumnLineage
# Why are the changes needed?
## Issue reference:
https://github.com/apache/kyuubi/issues/6912

## How to reproduce the issue?
The changes in this PR will avoid a wrong result when generating the instance of org.apache.kyuubi.plugin.lineage.Lineage, in the certain case as follows:
step 1: create a temporary view from a file
step 2: insert into a table by selecting from the temporary view in step 1
step 3: generate the lineage when executing the insert statement in step 2
In detail, please see the UT code submission in this patch.

## The issue analysis
Let's see the current code when getting the Lineage object by resolving a LogicalPlan object:
<img width="694" alt="image" src="https://github.com/user-attachments/assets/65256a0d-320d-4271-968f-59eafb74de9f" />

According to the above logic, a None org.apache.kyuubi.plugin.lineage.Lineage object will be generated due to "try-catch" self-protection, in this certain case. This None object will lead to problems in the following 2 scenes:
### Unit Test Environment
In Unit Test, when the code runs here a "None.get" exception will be raised:
<img width="682" alt="image" src="https://github.com/user-attachments/assets/102dc9bd-294f-4b1e-b1c6-01b6fee50fed" />

Here's the runtime exception stack:
```
None.get
java.util.NoSuchElementException: None.get
	at scala.None$.get(Option.scala:529)
	at scala.None$.get(Option.scala:527)
	at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.extractLineageWithoutExecuting(SparkSQLLineageParserHelperSuite.scala:1485)
	at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.$anonfun$new$83(SparkSQLLineageParserHelperSuite.scala:1465)
```
### Production Environment
This Lineage object cannot be used in the production environment because it has a None value which lacks some necessary lineage information. The right content of the Lineage instance in the above case should be:
```
inputTables(List())
outputTables(List(spark_catalog.test_db.test_table_from_dir))
columnLineage(List(ColumnLineage(spark_catalog.test_db.test_table_from_dir.a0,Set()), ColumnLineage(spark_catalog.test_db.test_table_from_dir.b0,Set())))
```

a newly added test case(test directory to table) passed after this issue is fixed.

# How to fix the issue?
Add a "Empty judgment" logic. In detail, please see the code submission in this patch.

# How was this patch tested?
1. by adding a new test case in UT code and make sure it passes
2. by submitting a Spark application including the SQL of this case in the production environment, and make sure a right Lineage instance is generated, instead of a None object

# Was this patch authored or co-authored using generative AI tooling?
No

Closes #6911 from xglv1985/fix_spark_lineage_runtime_exception.

Closes #6912

13a71075d [Cheng Pan] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParserHelperSuite.scala
4e89b95cd [Cheng Pan] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParserHelperSuite.scala
59b350bfb [xglv1985] fix a runtime exception when generate column lineage tuple--more readable code
52bc0288d [xglv1985] fix a runtime exception when generate column lineage tuple--spotless sytle
fea6bbc0d [xglv1985] fix a runtime exception when generate column lineage tuple--remove tab from UT code
901879095 [xglv1985] fix a runtime exception when generate column lineage tuple--unit test
fbb4df879 [xglv1985] fix a runtime exception when generate column lineage tuple

Lead-authored-by: xglv1985 <xglv1985@gmail.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-02-14 10:27:51 +08:00
Octavian Ciubotaru
1bd9e10987
[KYUUBI #6901] Default policy for spark
### Why are the changes needed?
Added a service definition for spark which in turn enables the creation of a default policy for the spark service.
Default policy will block access until another policy is downloaded from Apache Ranger.

### How was this patch tested?
Tested manually.
Configure Kyuubi Authz plugin. Do not start Apache Ranger, it must not be reachable.
Make sure that policy cache is empty.
Start Kyuubi engine and try to query any tables. The default policy should not allow any access.
Previously the access was not restricted because there wasn't a default policy defined.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #6902 from developster/master.

Closes #6901

feb6ebf61 [Octavian Ciubotaru] Default policy for spark

Authored-by: Octavian Ciubotaru <ociubotaru@developmentgateway.org>
Signed-off-by: Kent Yao <yao@apache.org>
2025-02-11 13:52:08 +08:00
zhaohehuhu
117e56c7cb
[KYUUBI #6862] Spark 3.3: MaxScanStrategy supports DSv2
### Why are the changes needed?

Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.3, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.3

### How was this patch tested?

Add some UTs

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6862 from zhaohehuhu/dev-1225.

Closes #6862

c745eda14 [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.3

Authored-by: zhaohehuhu <luoyedeyi459@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-12-25 17:21:23 +08:00
zhaohehuhu
3c72fef476
[KYUUBI #6857] Spark 3.4: MaxScanStrategy supports DSv2
### Why are the changes needed?

Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.4, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.4

### How was this patch tested?

Add some UTs

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6857 from zhaohehuhu/dev-1224.

Closes #6857

c72c62984 [zhaohehuhu] remove the import
dfbf2bc2d [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.4

Authored-by: zhaohehuhu <luoyedeyi459@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-12-24 20:24:03 +08:00
Cheng Pan
4b506f4cb9
[KYUUBI #6820] Explicitly disable attach-scaladocs for pure Java modules
# 🔍 Description

```
export JAVA_HOME=/path/of/openjdk-17
build/mvn clean install -DskipTests -Dmaven.scaladoc.skip=false
```

```
[INFO] --- scala-maven-plugin:4.9.2:doc-jar (attach-scaladocs)  kyuubi-server-plugin ---
[INFO] compiler plugin: BasicArtifact(com.github.ghik,silencer-plugin_2.12.20,1.7.19,null)
error: fatal error: object scala in compiler mirror not found.
```

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Successfully run the build command
```
export JAVA_HOME=/path/of/openjdk-17
build/mvn clean install -DskipTests -Dmaven.scaladoc.skip=false
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6820 from pan3793/scaladoc.

Closes #6820

f5cee3429 [Cheng Pan] Explicitly disable attach-scaladocs for pure Java modules

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-11-22 17:12:00 +08:00
Bowen Liang
d3520ddbce [KYUUBI #6769] [RELEASE] Bump 1.11.0-SNAPSHOT
# 🔍 Description
## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

Preparing v1.11.0-SNAPSHOT after branch-1.10 cut

```shell
build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT"
(cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT")
```

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6769 from bowenliang123/bump-1.11.

Closes #6769

6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name
465276204 [Bowen Liang] update package.json
81f2865e5 [Bowen Liang] bump

Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2024-10-23 17:10:56 +08:00
wankunde
04f443792b [KYUUBI #6754][AUTHZ] Improve the performance of Ranger access requests deduplication
# 🔍 Description
## Issue References 🔗

This pull request fixes #6754

## Describe Your Solution 🔧

Right now in RuleAuthorization we use an ArrayBuffer to collect access requests, which is very slow because each new PrivilegeObject needs to be compared with all access requests.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

Add benchmark
Before
```sh
Java HotSpot(TM) 64-Bit Server VM 17.0.12+8-LTS-286 on Mac OS X 14.6
Apple M3
Collecting files ranger access request:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
50000 files benchmark                            181863         189434         NaN         -0.0 -181863368958.0       1.0X
````

#### Behavior With This Pull Request 🎉

After
```sh
Java HotSpot(TM) 64-Bit Server VM 17.0.12+8-LTS-286 on Mac OS X 14.6
Apple M3
Collecting files ranger access request:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
50000 files benchmark                              1281           1310          33         -0.0 -1280563000.0       1.0X
```

#### Related Unit Tests

Exists UT

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6758 from wankunde/ranger2.

Closes #6754

9d7d1964b [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
88b9c049b [wankun] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/spark/sql/RuleAuthorizationBenchmark.scala
20c55fbeb [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml
f5a3b6ca5 [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
9793249de [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
d86b01f9c [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
b904b491b [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
aad08a6bb [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
1374604bc [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests
01e15c149 [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml
805e8a9c0 [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml
e19817943 [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests

Lead-authored-by: wankunde <wankunde@163.com>
Co-authored-by: wankun <wankun@apache.org>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2024-10-21 21:17:51 +08:00
wforget
1e9d68b000 [KYUUBI #6368] Flink engine supports user impersonation
# 🔍 Description
## Issue References 🔗

This pull request fixes #6368

## Describe Your Solution 🔧

Support impersonation mode for flink sql engine.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

Test in hadoop-testing env.

Connection:

```
beeline -u "jdbc:hive2://hadoop-master1.orb.local:10009/default;hive.server2.proxy.user=spark;principal=kyuubi/_HOSTTEST.ORG?kyuubi.engine.type=FLINK_SQL;flink.execution.target=yarn-application;kyuubi.engine.share.level=CONNECTION;kyuubi.engine.flink.doAs.enabled=true;"
```

sql:

```
select 1;
```

result:

![image](https://github.com/apache/kyuubi/assets/17894939/4bde3e4e-0dac-4e09-ac7c-a2c3a3607a13)

launch engine command:

```
2024-06-12 03:22:10.242 INFO KyuubiSessionManager-exec-pool: Thread-62 org.apache.kyuubi.engine.EngineRef: Launching engine:
/opt/flink-1.18.1/bin/flink run-application \
	-t yarn-application \
	-Dyarn.ship-files=/opt/flink/opt/flink-sql-client-1.18.1.jar;/opt/flink/opt/flink-sql-gateway-1.18.1.jar;/etc/hive/conf/hive-site.xml \
	-Dyarn.application.name=kyuubi_CONNECTION_FLINK_SQL_spark_6170b9aa-c690-4b50-938f-d59cca9aa2d6 \
	-Dyarn.tags=KYUUBI,6170b9aa-c690-4b50-938f-d59cca9aa2d6 \
	-Dcontainerized.master.env.FLINK_CONF_DIR=. \
	-Dcontainerized.master.env.HIVE_CONF_DIR=. \
	-Dyarn.security.appmaster.delegation.token.services=kyuubi \
	-Dsecurity.delegation.token.provider.HiveServer2.enabled=false \
	-Dsecurity.delegation.token.provider.hbase.enabled=false \
	-Dexecution.target=yarn-application \
	-Dsecurity.module.factory.classes=org.apache.flink.runtime.security.modules.JaasModuleFactory;org.apache.flink.runtime.security.modules.ZookeeperModuleFa
ctory \
	-Dsecurity.delegation.token.provider.hadoopfs.enabled=false \
	-c org.apache.kyuubi.engine.flink.FlinkSQLEngine /opt/apache-kyuubi-1.10.0-SNAPSHOT-bin/externals/engines/flink/kyuubi-flink-sql-engine_2.12-1.10.0-SNAPS
HOT.jar \
	--conf kyuubi.session.user=spark \
	--conf kyuubi.client.ipAddress=172.20.0.5 \
	--conf kyuubi.engine.credentials=SERUUwACJnRocmlmdDovL2hhZG9vcC1tYXN0ZXIxLm9yYi5sb2NhbDo5MDgzRQAFc3BhcmsEaGl2ZShreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2Fs
QFRFU1QuT1JHigGQCneevIoBkC6EIrwWDxSg03pnAB8dA295wh+Dim7Fx4FNxhVISVZFX0RFTEVHQVRJT05fVE9LRU4ADzE3Mi4yMC4wLjU6ODAyMEEABXNwYXJrAChreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiL
mxvY2FsQFRFU1QuT1JHigGQCneekIoBkC6EIpBHHBSket0SQnlXT5EIMN0U2fUKFRIVvBVIREZTX0RFTEVHQVRJT05fVE9LRU4PMTcyLjIwLjAuNTo4MDIwAA== \
	--conf kyuubi.engine.flink.doAs.enabled=true \
	--conf kyuubi.engine.hive.extra.classpath=/opt/hadoop/share/hadoop/client/*:/opt/hadoop/share/hadoop/mapreduce/* \
	--conf kyuubi.engine.share.level=CONNECTION \
	--conf kyuubi.engine.submit.time=1718162530017 \
	--conf kyuubi.engine.type=FLINK_SQL \
	--conf kyuubi.frontend.protocols=THRIFT_BINARY,REST \
	--conf kyuubi.ha.addresses=hadoop-master1.orb.local:2181 \
	--conf kyuubi.ha.engine.ref.id=6170b9aa-c690-4b50-938f-d59cca9aa2d6 \
	--conf kyuubi.ha.namespace=/kyuubi_1.10.0-SNAPSHOT_CONNECTION_FLINK_SQL/spark/6170b9aa-c690-4b50-938f-d59cca9aa2d6 \
	--conf kyuubi.server.ipAddress=172.20.0.5 \
	--conf kyuubi.session.connection.url=hadoop-master1.orb.local:10009 \
	--conf kyuubi.session.engine.startup.waitCompletion=false \
	--conf kyuubi.session.real.user=spark
```

launch engine log:

![image](https://github.com/apache/kyuubi/assets/17894939/590463a8-2858-47a2-8897-0ddfbe3ffdf6)

jobmanager job:

```
2024-06-12 03:22:26,400 INFO  org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - Loading delegation token providers
2024-06-12 03:22:26,992 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenProvider [] - Renew delegation token with engine credentials: SERUUwACJnRocmlmdDovL2hhZG9vcC1tYXN0ZXIxLm9yYi5sb2NhbDo5MDgzRQAFc3BhcmsEaGl2ZShreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2FsQFRFU1QuT1JHigGQCneevIoBkC6EIrwWDxSg03pnAB8dA295wh+Dim7Fx4FNxhVISVZFX0RFTEVHQVRJT05fVE9LRU4ADzE3Mi4yMC4wLjU6ODAyMEEABXNwYXJrAChreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2FsQFRFU1QuT1JHigGQCneekIoBkC6EIpBHHBSket0SQnlXT5EIMN0U2fUKFRIVvBVIREZTX0RFTEVHQVRJT05fVE9LRU4PMTcyLjIwLjAuNTo4MDIwAA==
2024-06-12 03:22:27,100 INFO  org.apache.kyuubi.engine.flink.FlinkEngineUtils              [] - Add new unknown token Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 05 73 70 61 72 6b 04 68 69 76 65 28 6b 79 75 75 62 69 2f 68 61 64 6f 6f 70 2d 6d 61 73 74 65 72 31 2e 6f 72 62 2e 6c 6f 63 61 6c 40 54 45 53 54 2e 4f 52 47 8a 01 90 0a 77 9e bc 8a 01 90 2e 84 22 bc 16 0f
2024-06-12 03:22:27,104 WARN  org.apache.kyuubi.engine.flink.FlinkEngineUtils              [] - Ignore token with earlier issue date: Kind: HDFS_DELEGATION_TOKEN, Service: 172.20.0.5:8020, Ident: (token for spark: HDFS_DELEGATION_TOKEN owner=spark, renewer=, realUser=kyuubi/hadoop-master1.orb.localTEST.ORG, issueDate=1718162529936, maxDate=1718767329936, sequenceNumber=71, masterKeyId=28)
2024-06-12 03:22:27,104 INFO  org.apache.kyuubi.engine.flink.FlinkEngineUtils              [] - Update delegation tokens. The number of tokens sent by the server is 2. The actual number of updated tokens is 1.
......
4-06-12 03:22:29,414 INFO  org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - Starting tokens update task
2024-06-12 03:22:29,415 INFO  org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository [] - New delegation tokens arrived, sending them to receivers
2024-06-12 03:22:29,422 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updating delegation tokens for current user
2024-06-12 03:22:29,422 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[10, 13, 10, 9, 8, 10, 16, -78, -36, -49, -17, -5, 49, 16, 1, 16, -100, -112, -60, -127, -8, -1, -1, -1, -1, 1]
2024-06-12 03:22:29,422 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[0, 5, 115, 112, 97, 114, 107, 4, 104, 105, 118, 101, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -68, -118, 1, -112, 46, -124, 34, -68, 22, 15]
2024-06-12 03:22:29,422 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service:172.20.0.5:8020 Identifier:[0, 5, 115, 112, 97, 114, 107, 0, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -112, -118, 1, -112, 46, -124, 34, -112, 71, 28]
2024-06-12 03:22:29,422 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updated delegation tokens for current user successfully

```

taskmanager log:

```
2024-06-12 03:45:06,622 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Receive initial delegation tokens from resource manager
2024-06-12 03:45:06,627 INFO  org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository [] - New delegation tokens arrived, sending them to receivers
2024-06-12 03:45:06,628 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updating delegation tokens for current user
2024-06-12 03:45:06,629 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[10, 13, 10, 9, 8, 10, 16, -78, -36, -49, -17, -5, 49, 16, 1, 16, -100, -112, -60, -127, -8, -1, -1, -1, -1, 1]
2024-06-12 03:45:06,630 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[0, 5, 115, 112, 97, 114, 107, 4, 104, 105, 118, 101, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -68, -118, 1, -112, 46, -124, 34, -68, 22, 15]
2024-06-12 03:45:06,630 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service:172.20.0.5:8020 Identifier:[0, 5, 115, 112, 97, 114, 107, 0, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -112, -118, 1, -112, 46, -124, 34, -112, 71, 28]
2024-06-12 03:45:06,636 INFO  org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updated delegation tokens for current user successfully
2024-06-12 03:45:06,636 INFO  org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository [] - Delegation tokens sent to receivers
```

#### Related Unit Tests

---

# Checklist 📝

- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6383 from wForget/KYUUBI-6368.

Closes #6368

47df43ef0 [wforget] remove doAsEnabled
984b96c74 [wforget] update settings.md
c7f8d474e [wforget] make generateTokenFile conf to internal
8632176b1 [wforget] address comments
2ec270e8a [wforget] licenses
ed0e22f4e [wforget] separate kyuubi-flink-token-provider module
b66b855b6 [wforget] address comment
d4fc2bd1d [wforget] fix
1a3dc4643 [wforget] fix style
825e2a7a0 [wforget] address comments
a679ba1c2 [wforget] revert remove renewer
cdd499b95 [wforget] fix and comment
19caec6c0 [wforget] pass token to submit process
b2991d419 [wforget] fix
7c3bdde1b [wforget] remove security.delegation.tokens.enabled check
8987c9176 [wforget] fix
5bd8cfe7c [wforget] fix
08992642d [wforget] Implement KyuubiDelegationToken Provider/Receiver
fa16d7def [wforget] enable delegation token manager
e50db7497 [wforget] [KYUUBI #6368] Support impersonation mode for flink sql engine

Authored-by: wforget <643348094@qq.com>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2024-10-21 17:32:39 +08:00
Cheng Pan
9c105da117 [KYUUBI #6638][FOLLOWUP] Authz shaded should include jsr311-api
# 🔍 Description
## Issue References 🔗

Fix a ClassNotFound issue.
```
java.lang.NoClassDefFoundError: org/apache/kyuubi/shade/javax/ws/rs/core/Cookie
```

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Verified manually.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6723 from pan3793/6638-followup.

Closes #6638

56e9842e0 [Cheng Pan] [KYUUBI #6638] authz shaded should include jsr311-api

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2024-10-15 17:24:07 +08:00
madlnu
ebe7e922ee
[KYUUBI #6666][AUTHZ]Upgrade Ranger plugin to 2.5.0
# 🔍 Description
## Issue References 🔗

This pull request fixes #6666

## Describe Your Solution 🔧

Bump ranger version to 2.5.0
Release notes: https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+2.5.0+-+Release+Notes

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6692 from Madhukar525722/ranger_upgrade.

Closes #6666

88e1e12c5 [madlnu] [KYUUBI #6666] Upgrade spark ranger plugin to 2.5.0

Authored-by: madlnu <madlnu@visa.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-09-23 17:51:17 +08:00
Cheng Pan
1bfc8c5840
[KYUUBI #6699] Bump Spark 4.0.0-preview2
# 🔍 Description

Spark 4.0.0-preview2 RC1 passed the vote
https://lists.apache.org/thread/4ctj2mlgs4q2yb4hdw2jy4z34p5yw2b1

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6699 from pan3793/spark-4.0.0-preview2.

Closes #6699

2db1f645d [Cheng Pan] 4.0.0-preview2
42055bb1e [Cheng Pan] fix
d29c0ef83 [Cheng Pan] disable delta test
98d323b95 [Cheng Pan] fix
2e782c00b [Cheng Pan] log4j-slf4j2-impl
fde4bb6ba [Cheng Pan] spark-4.0.0-preview2

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-09-23 17:42:48 +08:00
Bowen Liang
57ab60a495 [KYUUBI #6672] Cleanup unused Commons Lang 2 dependency
# 🔍 Description
## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

- Apache Commons Lang2 is no longer actively maintained and not used by Kyuubi modules

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6672 from bowenliang123/remove-commonlang2.

Closes #6672

34cda170a [liangbowen] remove common lang2

Lead-authored-by: Bowen Liang <liangbowen@gf.com.cn>
Co-authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2024-09-05 17:44:20 +08:00
王龙
e1e7772a9f
[KYUUBI #5402] Introduce Spark JVM quake plugin
# 🔍 Description
## Issue References 🔗

This pull request fixes #5402

## Describe Your Solution 🔧

When facing out-of-control memory management in Spark engine, we typically use JVMkill as a remedy by killing the process and generating a heap dump for post-analysis. However, even with jvmkill protection, we may still encounter issues caused by JVM running out of memory, such as repeated execution of Full GC without performing any useful work during the pause time. Since the JVM does not exhaust 100% of resources, JVMkill will not be triggered.

So introducing JVMQuake provides more granular monitoring of GC behavior, enabling early detection of memory management issues and facilitating fast failure.
You can use the following configuration to enable jvmQuake plugins:
```
spark.plugins=org.apache.spark.kyuubi.jvm.quake.KyuubiJVMQuakePlugin
```
|  configuration   | default  | comment  |
|  ----  | ----  | ----  |
| spark.driver.jvmQuake.enabled  | false | when true, enable driver jvmQuake   |
| spark.executor.jvmQuake.enabled  | false | when true, enable executor jvmQuake   |
| spark.driver.jvmQuake.heapDump.enabled  | false | when true, enable jvm heap dump when jvmQuake rearch the threshold   |
| spark.executor.jvmQuake.heapDump.enabled  | false | when true, enable jvm heap dump when jvmQuake rearch the threshold   |
| spark.jvmQuake.dumpThreshold  | 100 | The number of seconds to dump memory  |
| spark.jvmQuake.killThreshold  | 200 | The number of seconds to kill process  |
| spark.jvmQuake.exitCode  | 502 | The exit code of kill process  |
| spark.jvmQuake.heapDumpPath  | /tmp/kyuubi_jvm_quake/apps | The path of heap dump  |
| spark.jvmQuake.checkInterval  | 3 | The number of seconds to check jvmQuake  |
| spark.jvmQuake.runTimeWeight  | 1.0 | The weight of rum time  |

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6572 from yoock/features/kyuubi-jvm-quake.

Closes #5402

84361ce8f [王龙] add jvm quake

Authored-by: 王龙 <wanglong16@xiaomi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-09-02 12:29:41 +08:00
Cheng Pan
d5c31a85a4
[KYUUBI #6640] [AUTHZ] Adapt Derby 10.16 new JDBC driver package name
# 🔍 Description

SPARK-46257 (Spark 4.0.0) moves to Derby 10.16, `org.apache.derby.jdbc.AutoloadedDriver` has been moved to `org.apache.derby.iapi.jdbc.AutoloadedDriver`

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Manually tested with Spark 4.0.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6640 from pan3793/authz-derby.

Closes #6640

46edb32be [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/util/AuthZUtils.scala
7eee47f0d [Cheng Pan] Adapt Derby 10.16 new JDBC driver package name

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-08-23 12:27:48 +08:00
Cheng Pan
96ec1323ac
[KYUUBI #6638] Shade jsr311-api in Authz
# 🔍 Description

I faced the following error when trying to run authz with Spark 4.0
```
  Cause: java.lang.NoClassDefFoundError: javax/ws/rs/core/Cookie
  at java.base/java.lang.Class.forName0(Native Method)
  at java.base/java.lang.Class.forName(Class.java:375)
  at org.apache.ranger.plugin.policyengine.RangerPluginContext.createAdminClient(RangerPluginContext.java:96)
  at org.apache.ranger.plugin.util.PolicyRefresher.<init>(PolicyRefresher.java:90)
  at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:251)
  at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.initialize(SparkRangerAdminPlugin.scala:68)
```

The `javax.ws.rs:jsr311-api` is the transitive dep of `jersey-client`, we should shade and relocate it correctly.

Why does it work with Spark 3? Spark 3 provides `jakarta.ws.rs:jakarta.ws.rs-api:2.1.6` which provides `java.ws.rs.*` classes, but Spark 4 upgrades to `jakarta.ws.rs:jakarta.ws.rs-api:3.0.0` which changed package name to`jakarta.ws.rs.*`.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA and manually tested with Spark 4

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6638 from pan3793/jsr311.

Closes #6638

5699200cf [Cheng Pan] Shade jsr311-api in Authz

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-08-23 00:40:35 +08:00
xorsum
d414535cb6
[KYUUBI #6582] [KYUUBI-6581] Zorder clause syntax does not support special characters
# 🔍 Description
## Issue References 🔗

This pull request fixes #6581

## Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

I modified `KyuubiSparkSQLAstBuilder#visitMultipartIdentifier` and implemented `KyuubiSparkSQLAstBuilder#visitQuotedIdentifier` to process the quoted identifiers.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

```
extensions/spark/kyuubi-extension-spark-3-3/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala

test("optimize sort by backquoted column name")
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6582 from XorSum/features/zorder-backquote.

Closes #6582

16ffa1238 [xorsum] zorder by support quote

Authored-by: xorsum <xorsum@outlook.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-08-06 13:39:25 +08:00
joey.ljy
80c8e38066
[KYUUBI #6564] Insert into table check the privilege of table
# 🔍 Description
## Issue References 🔗

This pull request fixes #6564

## Describe Your Solution 🔧

Remove the `columnDesc` for `InsertIntoHadoopFsRelationCommand ` and `InsertIntoHiveTable ` in `table_command_spec.json`

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
Insert into table will check the privilege of columns.

#### Behavior With This Pull Request 🎉
Insert into table will check the privilege of table.

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6570 from liujiayi771/insert-permission.

Closes #6564

d956aa916 [joey.ljy] Fix ut
d282f8ec5 [joey.ljy] insert into table check the privilege of table

Authored-by: joey.ljy <joey.ljy@alibaba-inc.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-08-05 16:58:24 +08:00
caoyu
d9d2109070 [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor can't get the 'database' attribute if it's a Paimon plan.
# 🔍 Description
## Issue References 🔗

This pull request fixes #6541

## Describe Your Solution 🔧
Fix an issue where DataSourceV2RelationTableExtractor#table could not fetch the ‘database’ attribute causing the Ranger checks to fail when using the Paimon Catalog.
If the database attribute is not resolved, use DataSourceV2RelationTableExtractor#identifier to complete it.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6544 from promising-forever/issues/6541.

Closes #6541

6549f8528 [caoyu] Fix test failure, paimon-spark run on Scala 2.12.
c1a09214a [caoyu] Optimising the 'database' capture logic
69fb0bc7e [caoyu] PolicyJsonFileGenerator#genPolicies add paimonNamespace
c89c70bad [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan.
77f121b0d [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan.
9cfb5847b [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan.

Authored-by: caoyu <caoy.5@jifenn.com>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2024-07-28 23:25:04 +08:00
huangxiaoping
0f6d7643ae
[KYUUBI #6554] Delete redundant code related to zorder
# 🔍 Description
## Issue References 🔗

This pull request fixes #6554

## Describe Your Solution 🔧

- Delete `/kyuubi/extensions/spark/kyuubi-extension-spark-3-x/src/main/scala/org/apache/kyuubi/sql/zorder/InsertZorderBeforeWritingBase.scala` file
- Rename `InsertZorderBeforeWriting33.scala` to `InsertZorderBeforeWriting.scala`
- Rename `InsertZorderHelper33,  InsertZorderBeforeWritingDatasource33,  InsertZorderBeforeWritingHive33, ZorderSuiteSpark33` to `InsertZorderHelper,  InsertZorderBeforeWritingDatasource,  InsertZorderBeforeWritingHive, ZorderSuiteSpark`

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6555 from huangxiaopingRD/6554.

Closes #6554

26de4fa09 [huangxiaoping] [KYUUBI #6554] Delete redundant code related to zorder

Authored-by: huangxiaoping <1754789345@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-07-23 12:14:55 +08:00
huangxiaoping
ec232c18b5
[KYUUBI #6551] Allow insert zorder when global sort is false and the plan is Repartition or RepartitionByExpression.
# 🔍 Description
## Issue References 🔗

This pull request fixes #6551

## Describe Your Solution 🔧

Update `canInsertZorder` to allow insert zorder when global sort is `false` and the plan is `Repartition` or `RepartitionByExpression`.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests
/kyuubi-extension-spark-common/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6552 from huangxiaopingRD/6551.

Closes #6551

b597443c3 [huangxiaoping] Fix code style
618594667 [huangxiaoping] [KYUUBI #6551] Allow insert zorder when when the plan is Repartition or RepartitionByExpression

Authored-by: huangxiaoping <1754789345@qq.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2024-07-23 09:36:21 +08:00
Cheng Pan
063a192c7a
[KYUUBI #6545] Deprecate and remove building support for Spark 3.2
# 🔍 Description

This pull request aims to remove building support for Spark 3.2, while still keeping the engine support for Spark 3.2.

Mailing list discussion: https://lists.apache.org/thread/l74n5zl1w7s0bmr5ovxmxq58yqy8hqzc

- Remove Maven profile `spark-3.2`, and references on docs, release scripts, etc.
- Keep the cross-version verification to ensure that the Spark SQL engine built on the default Spark version (3.5) still works well on Spark 3.2 runtime.
- Merge `kyuubi-extension-spark-common` into `kyuubi-extension-spark-3-3`
- Remove `log4j.properties` as Spark moves to Log4j2 since 3.3 (SPARK-37814)

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6545 from pan3793/deprecate-spark-3.2.

Closes #6545

54c172528 [Cheng Pan] fix
f4602e805 [Cheng Pan] Deprecate and remove building support for Spark 3.2
2e083f89f [Cheng Pan] fix style
458a92c53 [Cheng Pan] nit
929e1df36 [Cheng Pan] Deprecate and remove building support for Spark 3.2

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-07-22 11:59:34 +08:00
huangxiaoping
4040acb321
[KYUUBI #6546] Update incorrect descriptions in Zorder related configurations
# 🔍 Description
## Issue References 🔗

This pull request fixes #6546

## Describe Your Solution 🔧

Fix incorrect documentation description.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6547 from huangxiaopingRD/6546.

Closes #6546

fab3b93c2 [huangxiaoping] Merge remote-tracking branch 'origin/6546'
17bd5ea0d [huangxiaoping] [KYUUBI #6546] Fix incorrect documentation description
8f53a8911 [huangxiaoping] [KYUUBI #6546] Fix incorrect documentation description
449d3f1ea [huangxiaoping] [KYUUBI #6546] Fix incorrect documentation description

Authored-by: huangxiaoping <1754789345@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-07-19 10:27:39 +08:00
Cheng Pan
03dcedd89e
[KYUUBI #6453] Make KSHC support Spark 4.0 and enable CI for Spark 4.0
# 🔍 Description

This PR makes KSHC support Spark 4.0, and also makes sure that the KSHC jar compiled against Spark 3.5 is binary compatible with Spark 4.0.

We are ready to enable CI for Spark 4.0, except for authZ module.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6453 from pan3793/spark4-ci.

Closes #6453

695e3d7f7 [Cheng Pan] Update pom.xml
2eaa0f88a [Cheng Pan] Update .github/workflows/master.yml
b1f540a34 [Cheng Pan] cross test
562839982 [Cheng Pan] fix
9f0c2e1be [Cheng Pan] fix
45f182462 [Cheng Pan] kshc
227ef5bae [Cheng Pan] fix
690a3b8b2 [Cheng Pan] Revert "fix"
87fe7678b [Cheng Pan] fix
60f55dbed [Cheng Pan] CI for Spark 4.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-07 11:01:24 +08:00
Cheng Pan
a07c57f064
[KYUUBI #6427] Extract data lake artifact names as maven properties
# 🔍 Description

Improve data lake dependency management by extracting the following Maven properties:

- `delta.artifact`
- `hudi.artifact`
- `iceberg.artifact`
- `paimon.artifact`

It often takes a while for the downstream data lakes to support the new Spark versions, extracting those properties makes it easy to override in the new profile on the Kyuubi project's `pom.xml` to workaround before data lakes jars are available.

One use case is a19bb7c18e

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6427 from pan3793/datalake-dep.

Closes #6427

74a9300e0 [Cheng Pan] Improve datalake dependency management

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-05 15:23:45 +08:00
senmiaoliu
bb92128131
[KYUUBI #6447] Use static regex Pattern instances in JavaUtils.timeStringAs and JavaUtils.byteStringAs
# 🔍 Description
## Issue References 🔗

This pull request fixes #6447

## Describe Your Solution 🔧
Use static regex Pattern instances in JavaUtils.timeStringAs and JavaUtils.byteStringAs

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6448 from lsm1/branch-kyuubi-6447.

Closes #6447

467066ce5 [senmiaoliu] Use static regex Pattern instances in JavaUtils

Authored-by: senmiaoliu <senmiaoliu@trip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-05 13:29:26 +08:00
Cheng Pan
6933a91588
[KYUUBI #6451] Bump Hudi 0.15.0 and enable Hudi authZ test for Spark 3.5
# 🔍 Description

Kyuubi uses the Hudi Spark bundle jar in authZ module for testing, Hudi 0.15 brings Spark 3.5 and Scala 2.13 support, it also removes hacky for profile `spark-3.5`.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6451 from pan3793/hudi-0.15.

Closes #6451

98d6e97c5 [Cheng Pan] fix
2d31307da [Cheng Pan] remove spark-authz-hudi-test
8896f8c3f [Cheng Pan] Enable hudi test
7e9a7c7ae [Cheng Pan] Bump Hudi 0.15.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-05 12:33:29 +08:00
Cheng Pan
1fb1f854eb
[KYUUBI #6439] kyuubi-util-scala test jar leaked to compile scope
# 🔍 Description

The `kyuubi-util-scala_2.12-<version>-tests.jar` accidentally leaked to the compile scope but should be in the test scope.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Run `build/dist` and check `dist/jars`

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6439 from pan3793/util-scala-test.

Closes #6439

0576248f5 [Cheng Pan] fix
2bf2408f5 [Cheng Pan] fix
f7151dfc6 [Cheng Pan] kyuubi-util-scala test jar leaked to compile scope

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-04 11:31:58 +08:00
zhouyifan279
7bf0f57239
[KYUUBI #6441] Kyuubi Spark TPC-DS/H Connector cross version test
# 🔍 Description
## Issue References 🔗

This pull request adds cross-version tests for Kyuubi Spark TPC-DS Connector and TPC-H Connector.

## Describe Your Solution 🔧
Add TPC-DS Connector and TPC-H Connector into GitHub Actions job `spark-connector-cross-version-test`.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6441 from zhouyifan279/tcp-ds/h-cross-version.

Closes #6441

c2abc468a [zhouyifan279] Kyuubi Spark TPC-DS/H Connector cross version test

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-03 11:08:47 +08:00
zhouyifan279
3ed912f5de
[KYUUBI #6247] Make KSHC binary compatible with multiple Spark versions
# 🔍 Description
## Issue References 🔗

This pull request closes #6247

This also closes #6431

## Describe Your Solution 🔧
Add a job `spark-connector-cross-version-test` in GitHub Actions to:
1. Build KSHC package with maven opt `-Pspark-3.5`
2. Run KSHC tests with maven opt `-Pspark-3.3` and `-Pspark-3.4` and KSHC package built in step 1
3. Fix the binary-compatible issue via reflection.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6436 from zhouyifan279/kshc-cross-version-test.

Closes #6247

d3ac2ef47 [zhouyifan279] Tune the KSHC code to fix binary-compatible issues
4e14edcb5 [zhouyifan279] Fix invalid unit-tests-log name
56ca45d18 [zhouyifan279] Fix invalid unit-tests-log name
4c5ab7b9e [zhouyifan279] Update test log name
8a84e8812 [zhouyifan279] Add matrix scala
17cb67155 [zhouyifan279] [KYUUBI #6247] KSHC cross-version test

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-01 20:13:41 +08:00
Cheng Pan
82441671a5 [KYUUBI #6424] TPC-H/DS connector support Spark 4.0
# 🔍 Description

Adapt changes in SPARK-45857

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

```
build/mvn -pl ':kyuubi-spark-connector-tpch_2.13,:kyuubi-spark-connector-tpcds_2.13' \
    -Pscala-2.13 -Pspark-master -am clean install -DskipTests
build/mvn -pl ':kyuubi-spark-connector-tpch_2.13,:kyuubi-spark-connector-tpcds_2.13' \
    -Pscala-2.13 -Pspark-master test
```

```
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Spark TPC-DS Connector 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Spark TPC-DS Connector ...................... SUCCESS [ 53.699 s]
[INFO] Kyuubi Spark TPC-H Connector ....................... SUCCESS [ 30.511 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:24 min
[INFO] Finished at: 2024-05-27T06:01:58Z
[INFO] ------------------------------------------------------------------------
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6424 from pan3793/tpc-conn-4.

Closes #6424

9012a177f [Cheng Pan] TPC-H/DS connector support Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-05-27 07:02:52 +00:00
Cheng Pan
522a28e1d5
[KYUUBI #6398] Fix lineage plugin UT for Spark 4.0
# 🔍 Description

```
build/mvn clean test -Pscala-2.13 -Pspark-master -pl :kyuubi-spark-lineage_2.13
```

```
- test group by *** FAILED ***
  org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.BINARY_OP_WRONG_TYPE] Cannot resolve "(b + c)" due to data type mismatch: the binary operator requires the input type ("NUMERIC" or "INTERVAL DAY TO SECOND" or "INTERVAL YEAR TO MONTH" or "INTERVAL"), not "STRING". SQLSTATE: 42K09; line 1 pos 59;
'InsertIntoStatement RelationV2[a#546, b#547, c#548] v2_catalog.db.t1 v2_catalog.db.t1, false, false, false
+- 'Aggregate [a#543], [a#543, unresolvedalias('count(distinct (b#544 + c#545))), (count(distinct b#544) * count(distinct c#545)) AS (count(DISTINCT b) * count(DISTINCT c))#551L]
   +- SubqueryAlias v2_catalog.db.t2
      +- RelationV2[a#543, b#544, c#545] v2_catalog.db.t2 v2_catalog.db.t2
  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.dataTypeMismatch(package.scala:73)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7(CheckAnalysis.scala:315)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7$adapted(CheckAnalysis.scala:302)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
  at scala.collection.immutable.Vector.foreach(Vector.scala:1856)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:243)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
  ...
```

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass UT.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6398 from pan3793/lineage-fix.

Closes #6398

afce6b880 [Cheng Pan] Fix lineage plugin UT for Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-05-20 22:03:48 +08:00
Cheng Pan
6bdf2bdaf8
[KYUUBI #6392] Support javax.servlet and jakarta.servlet co-exist
# 🔍 Description

This PR makes `javax.servlet` and `jakarta.servlet` co-exist, by introducing `javax.servlet-api-4.0.1` and upgrade `jakarta.servlet-api` to 5.0.0. (6.0.0 requires JDK 11)

Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118 while Kyuubi still uses `javax.servlet` in other modules, we should allow them to co-exist for a while.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6392 from pan3793/servlet.

Closes #6392

27d412599 [Cheng Pan] fix
9f1e72272 [Cheng Pan] other spark modules
f4545dc76 [Cheng Pan] fix
313826fa7 [Cheng Pan] exclude
7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-05-20 21:09:30 +08:00
hezhao2
8edcb005ee
[KYUUBI #6315] Spark 3.5: MaxScanStrategy supports DSv2
# 🔍 Description
## Issue References 🔗

Now, MaxScanStrategy can be adopted to limit max scan file size in some datasources, such as Hive. Hopefully we can enhance MaxScanStrategy to include support for the datasourcev2.
## Describe Your Solution 🔧

get the statistics about files scanned through datasourcev2 API

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes #5852 from zhaohehuhu/dev-1213.

Closes #6315

3c5b0c276 [hezhao2] reformat
fb113d625 [hezhao2] disable the rule that checks the maxPartitions for dsv2
acc358732 [hezhao2] disable the rule that checks the maxPartitions for dsv2
c8399a021 [hezhao2] fix header
70c845bee [hezhao2] add UTs
3a0739686 [hezhao2] add ut
4d26ce131 [hezhao2] reformat
f87cb072c [hezhao2] reformat
b307022b8 [hezhao2] move code to Spark 3.5
73258c2ae [hezhao2] fix unused import
cf893a0e1 [hezhao2] drop reflection for loading iceberg class
dc128bc8e [hezhao2] refactor code
661834cce [hezhao2] revert code
6061f42ab [hezhao2] delete IcebergSparkPlanHelper
5f1c3c082 [hezhao2] fix
b15652f05 [hezhao2] remove iceberg dependency
fe620ca92 [hezhao2] enable MaxScanStrategy when accessing iceberg datasource

Authored-by: hezhao2 <hezhao2@cisco.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-04-17 16:29:50 +08:00
amanraj2520
35d4b5f0c7
[KYUUBI #6212] Added audit handler shutdown to the shutdown hook
# 🔍 Description

This pull request fixes #6212

When Kyuubi cleans up Ranger related threads like PolicyRefresher, it should also shutdown the audit threads that include SolrZkClient. Otherwise Spark Driver keeps on running since SolrZkClient is a non-daemon thread. Added the cleanup as part of the shutdown hook that Kyuubi registers.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6233 from amanraj2520/auditShutdown.

Closes #6212

e663d466c [amanraj2520] Refactored code
ed293a9a4 [amanraj2520] Removed unused import
95a6814ad [amanraj2520] Added audit handler shutdown to the shutdown hook

Authored-by: amanraj2520 <rajaman@microsoft.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-04-08 10:40:04 +08:00
Cheng Pan
b4f35d2c44
[KYUUBI #6267] Remove unused dependency management in POM
# 🔍 Description

This pull request removes unused dependency management in POM

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6267 from pan3793/clean-pom.

Closes #6267

d19f719bf [Cheng Pan] Remove usued dependency management in POM

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-04-07 23:53:46 +08:00
Cheng Pan
4fcc5c72a2
[KYUUBI #6260] Clean up and improve comments for spark extensions
# 🔍 Description

This pull request

- improves comments for SPARK-33832
- removes unused `spark.sql.analyzer.classification.enabled` (I didn't update the migration rules because this configuration seems never to work properly)

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Review

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6260 from pan3793/nit.

Closes #6260

d762d30e9 [Cheng Pan] update comment
4ebaa04ea [Cheng Pan] nit
b303f05bb [Cheng Pan] remove spark.sql.analyzer.classification.enabled
b021cbc0a [Cheng Pan] Improve docs for SPARK-33832

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-04-07 18:20:14 +08:00
wforget
ad612349fb [KYUUBI #6215] Improve DropIgnoreNonexistent rule for Spark 3.5
# 🔍 Description
## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

Improve DropIgnoreNonexistent rule for spark 3.5

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

DropIgnoreNonexistentSuite

---

# Checklist 📝

- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6215 from wForget/hotfix2.

Closes #6215

cb1d34de1 [wforget] Improve DropIgnoreNonexistent rule for spark 3.5

Authored-by: wforget <643348094@qq.com>
Signed-off-by: wforget <643348094@qq.com>
2024-03-29 10:51:46 +08:00
wforget
9114e507c4 [KYUUBI #6211] Check memory offHeap enabled for CustomResourceProfileExec
# 🔍 Description
## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

We should check `spark.memory.offHeap.enabled` when applying for `executorOffHeapMemory`.

## Types of changes 🔖

- [X] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6211 from wForget/hotfix.

Closes #6211

1c7c8cd75 [wforget] Check memory offHeap enabled for CustomResourceProfileExec

Authored-by: wforget <643348094@qq.com>
Signed-off-by: wforget <643348094@qq.com>
2024-03-28 13:17:59 +08:00
Cheng Pan
3b9f25b62d
[KYUUBI #6197] Revise dependency management of Spark authZ plugin
# 🔍 Description
## Issue References 🔗

The POM of `kyuubi-spark-authz-shaded` is redundant, just pull `kyuubi-spark-authz` is necessary.

The current dependency management does not work on Ranger 2.1.0, this patch cleans up the POM definition and fixes the compatibility with Ranger 2.1.0

## Describe Your Solution 🔧

Carefully revise the dependency list and exclusion.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

perform packing kyuubi-spark-authz-shaded module.
```
build/mvn clean install -pl extensions/spark/kyuubi-spark-authz-shaded -am -DskipTests
```

before
```
[INFO] --- maven-shade-plugin:3.5.2:shade (default)  kyuubi-spark-authz-shaded_2.12 ---
[INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar.
[INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar.
```

after

```
[INFO] --- maven-shade-plugin:3.5.2:shade (default)  kyuubi-spark-authz-shaded_2.12 ---
[INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12🫙1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar.
[INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugin-classloader:jar:2.4.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar.
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6197 from pan3793/authz-dep.

Closes #6197

d0becabce [Cheng Pan] 2.4
47e38502a [Cheng Pan] ranger 2.4
af01f7ed5 [Cheng Pan] test ranger 2.1
203aff3b3 [Cheng Pan] ranger-plugins-cred
974d76b03 [Cheng Pan] Resive dependency management of authz
e5154f30f [Cheng Pan] improve authz deps

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-03-22 10:30:30 +08:00