kyuubi

Author	SHA1	Message	Date
davidyuan	1b3de28b2c	[KYUUBI #6958 ] Test INSERT TABLE ### Why are the changes needed? Currently, the Ranger check tests do not cover the Paimon INSERT TABLE command; this patch adds test cases for it. #6958 ### How was this patch tested? 1. Test INSERT INTO: 1.1 table1OnlyUserForNs can select table1 and tries to insert into table1 1.2 someone without any permission tries to insert into table1 2. Test INSERT OVERWRITE: 2.1 table1OnlyUserForNs can select table1 and tries to insert into table2 2.2 someone without any permission tries to select table1 and then insert into table2 ### Was this patch authored or co-authored using generative AI tooling? No Closes #6959 from davidyuan1223/test_insert. Closes #6958 d1f41ba81 [davidyuan] Merge branch 'master' into test_insert b56e701d4 [davidyuan] Test Insert Table 8306210ee [davidyuan] update Authored-by: davidyuan <yuanfuyuan@mafengwo.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-03-06 22:35:48 +08:00
davidyuan	61b69771be	[KYUUBI #6936 ] Test RenameTable command ### Why are the changes needed? Test that Authz supports the privilege check for the Paimon rename table command. #6936 ### How was this patch tested? Tested the Authz privilege check with the Paimon rename table command. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6937 from davidyuan1223/check_authz_paimon_rename_table. Closes #6936 797d1c489 [davidyuan] Merge branch 'master' into check_authz_paimon_rename_table bc3c823a3 [davidyuan] Merge remote-tracking branch 'origin/master' into check_authz_paimon_rename_table 6205670d2 [davidyuan] add renameTable to command_spec.json e4b241ef5 [davidyuan] Merge branch 'master' into check_authz_paimon_rename_table 5fec3bcb7 [davidyuan] test paimon rename table name command 30d09418c [davidyuan] test paimon rename table name command Authored-by: davidyuan <yuanfuyuan@mafengwo.com> Signed-off-by: Kent Yao <yao@apache.org>	2025-03-05 14:32:41 +08:00
davidyuan	37eaf75ae3	[KYUUBI #6949 ] Test adding column position ### Why are the changes needed? The Ranger check test cases do not cover the Paimon command for adding a column at a given position; this patch adds the test case. #6949 ### How was this patch tested? Tested the Ranger check with the Paimon command for adding a column at a given position. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6954 from davidyuan1223/test_adding_column_position. Closes #6949 262ecaaca [davidyuan] Merge remote-tracking branch 'origin/master' into test_adding_column_position 154765fc3 [davidyuan] Merge branch 'master' into test_adding_column_position 4ebf985a9 [davidyuan] test adding column position Authored-by: davidyuan <yuanfuyuan@mafengwo.com> Signed-off-by: Kent Yao <yao@apache.org>	2025-03-05 14:17:51 +08:00
davidyuan	851178ce9a	[KYUUBI #6940 ] Test Unset Table Properties Command ### Why are the changes needed? Currently, the Ranger check does not cover the UnsetTableProperties command, so it needs to be added to the Ranger check. #6940 ### How was this patch tested? Used the Paimon command for removing table properties to test this command. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6944 from davidyuan1223/test_remove_table_properties. Closes #6940 4f24d7d6a [davidyuan] Merge branch 'master' into test_remove_table_properties 11d3773ed [davidyuan] test unset table properties command Authored-by: davidyuan <yuanfuyuan@mafengwo.com> Signed-off-by: Kent Yao <yao@apache.org>	2025-03-05 13:37:39 +08:00
davidyuan	4cab817913	[KYUUBI #6950 ] Test changing column position ### Why are the changes needed? The Ranger check test cases do not cover the Paimon command for changing a column's position; this patch adds the test case. #6950 ### How was this patch tested? Tested the Ranger check with the Paimon command for changing a column's position. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6955 from davidyuan1223/test_changing_column_position. Closes #6950 520b5377f [davidyuan] Merge branch 'master' into test_changing_column_position 1eed87346 [davidyuan] test changing column position Authored-by: davidyuan <yuanfuyuan@mafengwo.com> Signed-off-by: Kent Yao <yao@apache.org>	2025-03-04 16:52:25 +08:00
Cheng Pan	d5b01fa3e2	[KYUUBI #6939 ] Bump Spark 3.5.5 ### Why are the changes needed? Test Spark 3.5.5 Release Notes https://spark.apache.org/releases/spark-release-3-5-5.html ### How was this patch tested? Pass GHA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6939 from pan3793/spark-3.5.5. Closes #6939 8c0288ae5 [Cheng Pan] ga 78b0e72db [Cheng Pan] nit 686a7b0a9 [Cheng Pan] fix d40cc5bba [Cheng Pan] Bump Spark 3.5.5 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-03-03 13:42:09 +08:00
davidyuan	bfcf2e708f	[KYUUBI #6942 ] Test Rename Column Name for paimon ### Why are the changes needed? Currently, the Ranger check tests for Paimon do not cover the rename column command; this patch adds the test case. #6942 ### How was this patch tested? Tested the Paimon rename column command with Ranger. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6946 from davidyuan1223/test_rename_column_name. Closes #6942 8e49eb0ab [davidyuan] test rename column name Authored-by: davidyuan <yuanfuyuan@mafengwo.com> Signed-off-by: Kent Yao <yao@apache.org>	2025-03-03 09:56:42 +08:00
davidyuan	525aec04a1	[KYUUBI #6923 ] Test Create Partitioned Table for Paimon ### Why are the changes needed? Test Authz with the create partitioned table command for Paimon to check that the command is supported. #6923 ### How was this patch tested? Tested Authz for Paimon with the create partitioned table command and checked the permissions. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6931 from davidyuan1223/support_create_with_parition_for_paimon. Closes #6923 61f7560d3 [Cheng Pan] Merge branch 'master' into support_create_with_parition_for_paimon ffb79376f [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala b0829795a [Bowen Liang] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala 4b160d720 [davidyuan] support create partition table as for paimon Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com> Co-authored-by: Bowen Liang <bowenliang@apache.org> Co-authored-by: Cheng Pan <pan3793@gmail.com> Co-authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-02-24 14:43:40 +08:00
davidyuan	ff3da59f63	[KYUUBI #6932 ] Test ALTER TBLPROPERTIES for Paimon ### Why are the changes needed? Test Authz with the add/change table properties commands for Paimon to check that they are supported. https://github.com/apache/kyuubi/issues/6932 ### How was this patch tested? Tested the add/change table properties SQL. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6933 from davidyuan1223/test_alter_tableproperties_for_paimin. Closes #6932 4d64fbf23 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala c861a778b [davidyuan] support add/change table properties for paimon Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com> Co-authored-by: Cheng Pan <pan3793@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-02-24 14:31:49 +08:00
davidyuan	ed96ac167d	[KYUUBI #6921 ][AUTHZ] Test CTAS for Paimon ### Why are the changes needed? Test Authz with CTAS for Paimon to check that the command is supported. The related issue is https://github.com/apache/kyuubi/issues/6921 ### How was this patch tested? Tested Authz for Paimon with the create table as command and checked the permissions. ### Was this patch authored or co-authored using generative AI tooling? No Closes #6922 from davidyuan1223/support_create_table_as_for_paimon_check. Closes #6921 7bfd6ad49 [david yuan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala a9ce20cc4 [davidyuan] support create table as for paimon Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com> Co-authored-by: david yuan <davidyuan1223@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-02-19 14:21:06 +08:00
Cheng Pan	93ac1ee269	[KYUUBI #6925 ] Only run Paimon authz tests with Scala 2.12 ### Why are the changes needed? Paimon does not seem to support Scala 2.13 ### How was this patch tested? Pass GHA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6925 from pan3793/authz-paimon-scala212. Closes #6925 865a7dd72 [Cheng Pan] fix 971d23273 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/PaimonCatalogRangerSparkExtensionSuite.scala 499f10ab0 [Cheng Pan] Only run Paimon authz tests with Scala 2.12 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-02-19 14:19:22 +08:00
xglv1985	7c110b68f8	[KYUUBI #6912 ][LINEAGE] Properly handle empty attribute set on mergeRelationColumnLineage # Why are the changes needed? ## Issue reference: https://github.com/apache/kyuubi/issues/6912 ## How to reproduce the issue? The changes in this PR avoid a wrong result when generating an instance of org.apache.kyuubi.plugin.lineage.Lineage in the following case: step 1: create a temporary view from a file step 2: insert into a table by selecting from the temporary view in step 1 step 3: generate the lineage when executing the insert statement in step 2 For details, see the UT code in this patch. ## The issue analysis Consider the current code that obtains the Lineage object by resolving a LogicalPlan object: <img width="694" alt="image" src="https://github.com/user-attachments/assets/65256a0d-320d-4271-968f-59eafb74de9f" /> According to the above logic, in this case the "try-catch" self-protection produces None instead of an org.apache.kyuubi.plugin.lineage.Lineage object. This None leads to problems in the following two scenarios: ### Unit Test Environment In unit tests, when the code runs here, a "None.get" exception is raised: <img width="682" alt="image" src="https://github.com/user-attachments/assets/102dc9bd-294f-4b1e-b1c6-01b6fee50fed" /> Here's the runtime exception stack: ``` None.get java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:529) at scala.None$.get(Option.scala:527) at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.extractLineageWithoutExecuting(SparkSQLLineageParserHelperSuite.scala:1485) at org.apache.kyuubi.plugin.lineage.helper.SparkSQLLineageParserHelperSuite.$anonfun$new$83(SparkSQLLineageParserHelperSuite.scala:1465) ``` ### Production Environment The result cannot be used in the production environment because it is None and therefore lacks the necessary lineage information. 
The correct content of the Lineage instance in the above case should be: ``` inputTables(List()) outputTables(List(spark_catalog.test_db.test_table_from_dir)) columnLineage(List(ColumnLineage(spark_catalog.test_db.test_table_from_dir.a0,Set()), ColumnLineage(spark_catalog.test_db.test_table_from_dir.b0,Set()))) ``` A newly added test case (test directory to table) passes after this issue is fixed. # How to fix the issue? Add an emptiness check. For details, see the code in this patch. # How was this patch tested? 1. By adding a new test case to the UT code and making sure it passes 2. By submitting a Spark application that includes the SQL of this case in the production environment and making sure a correct Lineage instance is generated instead of None # Was this patch authored or co-authored using generative AI tooling? No Closes #6911 from xglv1985/fix_spark_lineage_runtime_exception. Closes #6912 13a71075d [Cheng Pan] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParserHelperSuite.scala 4e89b95cd [Cheng Pan] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/helper/SparkSQLLineageParserHelperSuite.scala 59b350bfb [xglv1985] fix a runtime exception when generate column lineage tuple--more readable code 52bc0288d [xglv1985] fix a runtime exception when generate column lineage tuple--spotless sytle fea6bbc0d [xglv1985] fix a runtime exception when generate column lineage tuple--remove tab from UT code 901879095 [xglv1985] fix a runtime exception when generate column lineage tuple--unit test fbb4df879 [xglv1985] fix a runtime exception when generate column lineage tuple Lead-authored-by: xglv1985 <xglv1985@gmail.com> Co-authored-by: Cheng Pan <pan3793@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-02-14 10:27:51 +08:00
Octavian Ciubotaru	1bd9e10987	[KYUUBI #6901 ] Default policy for spark ### Why are the changes needed? Added a service definition for spark which in turn enables the creation of a default policy for the spark service. Default policy will block access until another policy is downloaded from Apache Ranger. ### How was this patch tested? Tested manually. Configure Kyuubi Authz plugin. Do not start Apache Ranger, it must not be reachable. Make sure that policy cache is empty. Start Kyuubi engine and try to query any tables. The default policy should not allow any access. Previously the access was not restricted because there wasn't a default policy defined. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6902 from developster/master. Closes #6901 feb6ebf61 [Octavian Ciubotaru] Default policy for spark Authored-by: Octavian Ciubotaru <ociubotaru@developmentgateway.org> Signed-off-by: Kent Yao <yao@apache.org>	2025-02-11 13:52:08 +08:00
zhaohehuhu	117e56c7cb	[KYUUBI #6862 ] Spark 3.3: MaxScanStrategy supports DSv2 ### Why are the changes needed? Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.3, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.3 ### How was this patch tested? Add some UTs ### Was this patch authored or co-authored using generative AI tooling? No Closes #6862 from zhaohehuhu/dev-1225. Closes #6862 c745eda14 [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.3 Authored-by: zhaohehuhu <luoyedeyi459@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-12-25 17:21:23 +08:00
zhaohehuhu	3c72fef476	[KYUUBI #6857 ] Spark 3.4: MaxScanStrategy supports DSv2 ### Why are the changes needed? Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.4, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.4 ### How was this patch tested? Add some UTs ### Was this patch authored or co-authored using generative AI tooling? No Closes #6857 from zhaohehuhu/dev-1224. Closes #6857 c72c62984 [zhaohehuhu] remove the import dfbf2bc2d [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.4 Authored-by: zhaohehuhu <luoyedeyi459@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-12-24 20:24:03 +08:00
Cheng Pan	4b506f4cb9	[KYUUBI #6820 ] Explicitly disable attach-scaladocs for pure Java modules # 🔍 Description ``` export JAVA_HOME=/path/of/openjdk-17 build/mvn clean install -DskipTests -Dmaven.scaladoc.skip=false ``` ``` [INFO] --- scala-maven-plugin:4.9.2:doc-jar (attach-scaladocs) kyuubi-server-plugin --- [INFO] compiler plugin: BasicArtifact(com.github.ghik,silencer-plugin_2.12.20,1.7.19,null) error: fatal error: object scala in compiler mirror not found. ``` ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Successfully run the build command ``` export JAVA_HOME=/path/of/openjdk-17 build/mvn clean install -DskipTests -Dmaven.scaladoc.skip=false ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6820 from pan3793/scaladoc. Closes #6820 f5cee3429 [Cheng Pan] Explicitly disable attach-scaladocs for pure Java modules Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-11-22 17:12:00 +08:00
Bowen Liang	d3520ddbce	[KYUUBI #6769 ] [RELEASE] Bump 1.11.0-SNAPSHOT # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 Preparing v1.11.0-SNAPSHOT after branch-1.10 cut ```shell build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT" (cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT") ``` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6769 from bowenliang123/bump-1.11. Closes #6769 6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name 465276204 [Bowen Liang] update package.json 81f2865e5 [Bowen Liang] bump Authored-by: Bowen Liang <liangbowen@gf.com.cn> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-10-23 17:10:56 +08:00
wankunde	04f443792b	[KYUUBI #6754 ][AUTHZ] Improve the performance of Ranger access requests deduplication # 🔍 Description ## Issue References 🔗 This pull request fixes #6754 ## Describe Your Solution 🔧 Right now in RuleAuthorization we use an ArrayBuffer to collect access requests, which is very slow because each new PrivilegeObject needs to be compared with all access requests. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ Add benchmark Before ```sh Java HotSpot(TM) 64-Bit Server VM 17.0.12+8-LTS-286 on Mac OS X 14.6 Apple M3 Collecting files ranger access request: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ 50000 files benchmark 181863 189434 NaN -0.0 -181863368958.0 1.0X ```` #### Behavior With This Pull Request 🎉 After ```sh Java HotSpot(TM) 64-Bit Server VM 17.0.12+8-LTS-286 on Mac OS X 14.6 Apple M3 Collecting files ranger access request: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ 50000 files benchmark 1281 1310 33 -0.0 -1280563000.0 1.0X ``` #### Related Unit Tests Exists UT --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6758 from wankunde/ranger2. 
Closes #6754 9d7d1964b [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests 88b9c049b [wankun] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/spark/sql/RuleAuthorizationBenchmark.scala 20c55fbeb [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml f5a3b6ca5 [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests 9793249de [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests d86b01f9c [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests b904b491b [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests aad08a6bb [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests 1374604bc [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests 01e15c149 [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml 805e8a9c0 [wankun] Update extensions/spark/kyuubi-spark-authz/pom.xml e19817943 [wankunde] [KYUUBI #6754] Improve the performance of ranger access requests Lead-authored-by: wankunde <wankunde@163.com> Co-authored-by: wankun <wankun@apache.org> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-10-21 21:17:51 +08:00
wforget	1e9d68b000	[KYUUBI #6368 ] Flink engine supports user impersonation # 🔍 Description ## Issue References 🔗 This pull request fixes #6368 ## Describe Your Solution 🔧 Support impersonation mode for flink sql engine. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [X] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 Test in hadoop-testing env. Connection: ``` beeline -u "jdbc:hive2://hadoop-master1.orb.local:10009/default;hive.server2.proxy.user=spark;principal=kyuubi/_HOST@TEST.ORG?kyuubi.engine.type=FLINK_SQL;flink.execution.target=yarn-application;kyuubi.engine.share.level=CONNECTION;kyuubi.engine.flink.doAs.enabled=true;" ``` sql: ``` select 1; ``` result: ![image](https://github.com/apache/kyuubi/assets/17894939/4bde3e4e-0dac-4e09-ac7c-a2c3a3607a13) launch engine command: ``` 2024-06-12 03:22:10.242 INFO KyuubiSessionManager-exec-pool: Thread-62 org.apache.kyuubi.engine.EngineRef: Launching engine: /opt/flink-1.18.1/bin/flink run-application \ -t yarn-application \ -Dyarn.ship-files=/opt/flink/opt/flink-sql-client-1.18.1.jar;/opt/flink/opt/flink-sql-gateway-1.18.1.jar;/etc/hive/conf/hive-site.xml \ -Dyarn.application.name=kyuubi_CONNECTION_FLINK_SQL_spark_6170b9aa-c690-4b50-938f-d59cca9aa2d6 \ -Dyarn.tags=KYUUBI,6170b9aa-c690-4b50-938f-d59cca9aa2d6 \ -Dcontainerized.master.env.FLINK_CONF_DIR=. \ -Dcontainerized.master.env.HIVE_CONF_DIR=. 
\ -Dyarn.security.appmaster.delegation.token.services=kyuubi \ -Dsecurity.delegation.token.provider.HiveServer2.enabled=false \ -Dsecurity.delegation.token.provider.hbase.enabled=false \ -Dexecution.target=yarn-application \ -Dsecurity.module.factory.classes=org.apache.flink.runtime.security.modules.JaasModuleFactory;org.apache.flink.runtime.security.modules.ZookeeperModuleFactory \ -Dsecurity.delegation.token.provider.hadoopfs.enabled=false \ -c org.apache.kyuubi.engine.flink.FlinkSQLEngine /opt/apache-kyuubi-1.10.0-SNAPSHOT-bin/externals/engines/flink/kyuubi-flink-sql-engine_2.12-1.10.0-SNAPSHOT.jar \ --conf kyuubi.session.user=spark \ --conf kyuubi.client.ipAddress=172.20.0.5 \ --conf kyuubi.engine.credentials=SERUUwACJnRocmlmdDovL2hhZG9vcC1tYXN0ZXIxLm9yYi5sb2NhbDo5MDgzRQAFc3BhcmsEaGl2ZShreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2FsQFRFU1QuT1JHigGQCneevIoBkC6EIrwWDxSg03pnAB8dA295wh+Dim7Fx4FNxhVISVZFX0RFTEVHQVRJT05fVE9LRU4ADzE3Mi4yMC4wLjU6ODAyMEEABXNwYXJrAChreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2FsQFRFU1QuT1JHigGQCneekIoBkC6EIpBHHBSket0SQnlXT5EIMN0U2fUKFRIVvBVIREZTX0RFTEVHQVRJT05fVE9LRU4PMTcyLjIwLjAuNTo4MDIwAA== \ --conf kyuubi.engine.flink.doAs.enabled=true \ --conf kyuubi.engine.hive.extra.classpath=/opt/hadoop/share/hadoop/client/:/opt/hadoop/share/hadoop/mapreduce/ \ --conf kyuubi.engine.share.level=CONNECTION \ --conf kyuubi.engine.submit.time=1718162530017 \ --conf kyuubi.engine.type=FLINK_SQL \ --conf kyuubi.frontend.protocols=THRIFT_BINARY,REST \ --conf kyuubi.ha.addresses=hadoop-master1.orb.local:2181 \ --conf kyuubi.ha.engine.ref.id=6170b9aa-c690-4b50-938f-d59cca9aa2d6 \ --conf kyuubi.ha.namespace=/kyuubi_1.10.0-SNAPSHOT_CONNECTION_FLINK_SQL/spark/6170b9aa-c690-4b50-938f-d59cca9aa2d6 \ --conf kyuubi.server.ipAddress=172.20.0.5 \ --conf kyuubi.session.connection.url=hadoop-master1.orb.local:10009 \ --conf kyuubi.session.engine.startup.waitCompletion=false \ --conf kyuubi.session.real.user=spark ``` launch engine log: 
![image](https://github.com/apache/kyuubi/assets/17894939/590463a8-2858-47a2-8897-0ddfbe3ffdf6) jobmanager job: ``` 2024-06-12 03:22:26,400 INFO org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - Loading delegation token providers 2024-06-12 03:22:26,992 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenProvider [] - Renew delegation token with engine credentials: SERUUwACJnRocmlmdDovL2hhZG9vcC1tYXN0ZXIxLm9yYi5sb2NhbDo5MDgzRQAFc3BhcmsEaGl2ZShreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2FsQFRFU1QuT1JHigGQCneevIoBkC6EIrwWDxSg03pnAB8dA295wh+Dim7Fx4FNxhVISVZFX0RFTEVHQVRJT05fVE9LRU4ADzE3Mi4yMC4wLjU6ODAyMEEABXNwYXJrAChreXV1YmkvaGFkb29wLW1hc3RlcjEub3JiLmxvY2FsQFRFU1QuT1JHigGQCneekIoBkC6EIpBHHBSket0SQnlXT5EIMN0U2fUKFRIVvBVIREZTX0RFTEVHQVRJT05fVE9LRU4PMTcyLjIwLjAuNTo4MDIwAA== 2024-06-12 03:22:27,100 INFO org.apache.kyuubi.engine.flink.FlinkEngineUtils [] - Add new unknown token Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 05 73 70 61 72 6b 04 68 69 76 65 28 6b 79 75 75 62 69 2f 68 61 64 6f 6f 70 2d 6d 61 73 74 65 72 31 2e 6f 72 62 2e 6c 6f 63 61 6c 40 54 45 53 54 2e 4f 52 47 8a 01 90 0a 77 9e bc 8a 01 90 2e 84 22 bc 16 0f 2024-06-12 03:22:27,104 WARN org.apache.kyuubi.engine.flink.FlinkEngineUtils [] - Ignore token with earlier issue date: Kind: HDFS_DELEGATION_TOKEN, Service: 172.20.0.5:8020, Ident: (token for spark: HDFS_DELEGATION_TOKEN owner=spark, renewer=, realUser=kyuubi/hadoop-master1.orb.local@TEST.ORG, issueDate=1718162529936, maxDate=1718767329936, sequenceNumber=71, masterKeyId=28) 2024-06-12 03:22:27,104 INFO org.apache.kyuubi.engine.flink.FlinkEngineUtils [] - Update delegation tokens. The number of tokens sent by the server is 2. The actual number of updated tokens is 1. ...... 
4-06-12 03:22:29,414 INFO org.apache.flink.runtime.security.token.DefaultDelegationTokenManager [] - Starting tokens update task 2024-06-12 03:22:29,415 INFO org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository [] - New delegation tokens arrived, sending them to receivers 2024-06-12 03:22:29,422 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updating delegation tokens for current user 2024-06-12 03:22:29,422 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[10, 13, 10, 9, 8, 10, 16, -78, -36, -49, -17, -5, 49, 16, 1, 16, -100, -112, -60, -127, -8, -1, -1, -1, -1, 1] 2024-06-12 03:22:29,422 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[0, 5, 115, 112, 97, 114, 107, 4, 104, 105, 118, 101, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -68, -118, 1, -112, 46, -124, 34, -68, 22, 15] 2024-06-12 03:22:29,422 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service:172.20.0.5:8020 Identifier:[0, 5, 115, 112, 97, 114, 107, 0, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -112, -118, 1, -112, 46, -124, 34, -112, 71, 28] 2024-06-12 03:22:29,422 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updated delegation tokens for current user successfully ``` taskmanager log: ``` 2024-06-12 03:45:06,622 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Receive initial delegation tokens from resource manager 2024-06-12 03:45:06,627 INFO 
org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository [] - New delegation tokens arrived, sending them to receivers 2024-06-12 03:45:06,628 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updating delegation tokens for current user 2024-06-12 03:45:06,629 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[10, 13, 10, 9, 8, 10, 16, -78, -36, -49, -17, -5, 49, 16, 1, 16, -100, -112, -60, -127, -8, -1, -1, -1, -1, 1] 2024-06-12 03:45:06,630 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service: Identifier:[0, 5, 115, 112, 97, 114, 107, 4, 104, 105, 118, 101, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -68, -118, 1, -112, 46, -124, 34, -68, 22, 15] 2024-06-12 03:45:06,630 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Token Service:172.20.0.5:8020 Identifier:[0, 5, 115, 112, 97, 114, 107, 0, 40, 107, 121, 117, 117, 98, 105, 47, 104, 97, 100, 111, 111, 112, 45, 109, 97, 115, 116, 101, 114, 49, 46, 111, 114, 98, 46, 108, 111, 99, 97, 108, 64, 84, 69, 83, 84, 46, 79, 82, 71, -118, 1, -112, 10, 119, -98, -112, -118, 1, -112, 46, -124, 34, -112, 71, 28] 2024-06-12 03:45:06,636 INFO org.apache.kyuubi.engine.flink.security.token.KyuubiDelegationTokenReceiver [] - Updated delegation tokens for current user successfully 2024-06-12 03:45:06,636 INFO org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository [] - Delegation tokens sent to receivers ``` #### Related Unit Tests --- # Checklist 📝 - [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6383 from wForget/KYUUBI-6368. 
Closes #6368 47df43ef0 [wforget] remove doAsEnabled 984b96c74 [wforget] update settings.md c7f8d474e [wforget] make generateTokenFile conf to internal 8632176b1 [wforget] address comments 2ec270e8a [wforget] licenses ed0e22f4e [wforget] separate kyuubi-flink-token-provider module b66b855b6 [wforget] address comment d4fc2bd1d [wforget] fix 1a3dc4643 [wforget] fix style 825e2a7a0 [wforget] address comments a679ba1c2 [wforget] revert remove renewer cdd499b95 [wforget] fix and comment 19caec6c0 [wforget] pass token to submit process b2991d419 [wforget] fix 7c3bdde1b [wforget] remove security.delegation.tokens.enabled check 8987c9176 [wforget] fix 5bd8cfe7c [wforget] fix 08992642d [wforget] Implement KyuubiDelegationToken Provider/Receiver fa16d7def [wforget] enable delegation token manager e50db7497 [wforget] [KYUUBI #6368] Support impersonation mode for flink sql engine Authored-by: wforget <643348094@qq.com> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-10-21 17:32:39 +08:00
Cheng Pan	9c105da117	[KYUUBI #6638 ][FOLLOWUP] Authz shaded should include jsr311-api # 🔍 Description ## Issue References 🔗 Fix a ClassNotFound issue. ``` java.lang.NoClassDefFoundError: org/apache/kyuubi/shade/javax/ws/rs/core/Cookie ``` ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Verified manually. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6723 from pan3793/6638-followup. Closes #6638 56e9842e0 [Cheng Pan] [KYUUBI #6638] authz shaded should include jsr311-api Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-10-15 17:24:07 +08:00
madlnu	ebe7e922ee	[KYUUBI #6666 ][AUTHZ]Upgrade Ranger plugin to 2.5.0 # 🔍 Description ## Issue References 🔗 This pull request fixes #6666 ## Describe Your Solution 🔧 Bump ranger version to 2.5.0 Release notes: https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+2.5.0+-+Release+Notes ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6692 from Madhukar525722/ranger_upgrade. Closes #6666 88e1e12c5 [madlnu] [KYUUBI #6666] Upgrade spark ranger plugin to 2.5.0 Authored-by: madlnu <madlnu@visa.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-09-23 17:51:17 +08:00
Cheng Pan	1bfc8c5840	[KYUUBI #6699 ] Bump Spark 4.0.0-preview2 # 🔍 Description Spark 4.0.0-preview2 RC1 passed the vote https://lists.apache.org/thread/4ctj2mlgs4q2yb4hdw2jy4z34p5yw2b1 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6699 from pan3793/spark-4.0.0-preview2. Closes #6699 2db1f645d [Cheng Pan] 4.0.0-preview2 42055bb1e [Cheng Pan] fix d29c0ef83 [Cheng Pan] disable delta test 98d323b95 [Cheng Pan] fix 2e782c00b [Cheng Pan] log4j-slf4j2-impl fde4bb6ba [Cheng Pan] spark-4.0.0-preview2 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-09-23 17:42:48 +08:00
Bowen Liang	57ab60a495	[KYUUBI #6672 ] Cleanup unused Commons Lang 2 dependency # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 - Apache Commons Lang2 is no longer actively maintained and not used by Kyuubi modules ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6672 from bowenliang123/remove-commonlang2. Closes #6672 34cda170a [liangbowen] remove common lang2 Lead-authored-by: Bowen Liang <liangbowen@gf.com.cn> Co-authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: liangbowen <liangbowen@gf.com.cn>	2024-09-05 17:44:20 +08:00
王龙	e1e7772a9f	[KYUUBI #5402 ] Introduce Spark JVM quake plugin # 🔍 Description ## Issue References 🔗 This pull request fixes #5402 ## Describe Your Solution 🔧 When facing out-of-control memory management in Spark engine, we typically use JVMkill as a remedy by killing the process and generating a heap dump for post-analysis. However, even with jvmkill protection, we may still encounter issues caused by JVM running out of memory, such as repeated execution of Full GC without performing any useful work during the pause time. Since the JVM does not exhaust 100% of resources, JVMkill will not be triggered. So introducing JVMQuake provides more granular monitoring of GC behavior, enabling early detection of memory management issues and facilitating fast failure. You can use the following configuration to enable the jvmQuake plugin: ``` spark.plugins=org.apache.spark.kyuubi.jvm.quake.KyuubiJVMQuakePlugin ``` \| configuration \| default \| comment \| \| ---- \| ---- \| ---- \| \| spark.driver.jvmQuake.enabled \| false \| when true, enable driver jvmQuake \| \| spark.executor.jvmQuake.enabled \| false \| when true, enable executor jvmQuake \| \| spark.driver.jvmQuake.heapDump.enabled \| false \| when true, enable jvm heap dump when jvmQuake reaches the threshold \| \| spark.executor.jvmQuake.heapDump.enabled \| false \| when true, enable jvm heap dump when jvmQuake reaches the threshold \| \| spark.jvmQuake.dumpThreshold \| 100 \| The number of seconds to dump memory \| \| spark.jvmQuake.killThreshold \| 200 \| The number of seconds to kill process \| \| spark.jvmQuake.exitCode \| 502 \| The exit code of kill process \| \| spark.jvmQuake.heapDumpPath \| /tmp/kyuubi_jvm_quake/apps \| The path of heap dump \| \| spark.jvmQuake.checkInterval \| 3 \| The number of seconds to check jvmQuake \| \| spark.jvmQuake.runTimeWeight \| 1.0 \| The weight of run time \| ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6572 from yoock/features/kyuubi-jvm-quake. Closes #5402 84361ce8f [王龙] add jvm quake Authored-by: 王龙 <wanglong16@xiaomi.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-09-02 12:29:41 +08:00
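The JVM quake commit message above lists thresholds and a run-time weight but does not describe how the plugin combines them. The following is a minimal sketch that assumes a leaky-bucket model in the style of the jvmquake tool: GC time adds to a "debt" bucket and useful run time, scaled by the weight, drains it. The class `QuakeMonitor` and its method are hypothetical and are not taken from the Kyuubi source; only the default values mirror the table in the commit message.

```python
from dataclasses import dataclass


@dataclass
class QuakeMonitor:
    """Hypothetical GC-debt tracker; NOT the plugin's actual implementation."""

    dump_threshold: float = 100.0  # seconds, cf. spark.jvmQuake.dumpThreshold
    kill_threshold: float = 200.0  # seconds, cf. spark.jvmQuake.killThreshold
    run_time_weight: float = 1.0   # cf. spark.jvmQuake.runTimeWeight
    bucket: float = 0.0            # accumulated GC "debt" in seconds (assumed model)

    def observe(self, gc_seconds: float, run_seconds: float) -> str:
        """Record one check interval and return "ok", "dump", or "kill"."""
        # GC time fills the bucket; weighted run time drains it, never below zero.
        drained = run_seconds * self.run_time_weight
        self.bucket = max(0.0, self.bucket + gc_seconds - drained)
        if self.bucket >= self.kill_threshold:
            return "kill"
        if self.bucket >= self.dump_threshold:
            return "dump"
        return "ok"
```

Under this assumed model, a JVM that spends every 3-second check interval entirely in GC would cross the dump threshold after 34 checks and the kill threshold after 67, while a healthy JVM keeps the bucket near zero.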
Cheng Pan	d5c31a85a4	[KYUUBI #6640 ] [AUTHZ] Adapt Derby 10.16 new JDBC driver package name # 🔍 Description SPARK-46257 (Spark 4.0.0) moves to Derby 10.16, `org.apache.derby.jdbc.AutoloadedDriver` has been moved to `org.apache.derby.iapi.jdbc.AutoloadedDriver` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Manually tested with Spark 4.0. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6640 from pan3793/authz-derby. Closes #6640 46edb32be [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/util/AuthZUtils.scala 7eee47f0d [Cheng Pan] Adapt Derby 10.16 new JDBC driver package name Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-08-23 12:27:48 +08:00
Cheng Pan	96ec1323ac	[KYUUBI #6638 ] Shade jsr311-api in Authz # 🔍 Description I faced the following error when trying to run authz with Spark 4.0 ``` Cause: java.lang.NoClassDefFoundError: javax/ws/rs/core/Cookie at java.base/java.lang.Class.forName0(Native Method) at java.base/java.lang.Class.forName(Class.java:375) at org.apache.ranger.plugin.policyengine.RangerPluginContext.createAdminClient(RangerPluginContext.java:96) at org.apache.ranger.plugin.util.PolicyRefresher.<init>(PolicyRefresher.java:90) at org.apache.ranger.plugin.service.RangerBasePlugin.init(RangerBasePlugin.java:251) at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.initialize(SparkRangerAdminPlugin.scala:68) ``` The `javax.ws.rs:jsr311-api` is the transitive dep of `jersey-client`, we should shade and relocate it correctly. Why does it work with Spark 3? Spark 3 provides `jakarta.ws.rs:jakarta.ws.rs-api:2.1.6` which provides `javax.ws.rs.` classes, but Spark 4 upgrades to `jakarta.ws.rs:jakarta.ws.rs-api:3.0.0` which changed the package name to `jakarta.ws.rs.`. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA and manually tested with Spark 4 --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6638 from pan3793/jsr311. Closes #6638 5699200cf [Cheng Pan] Shade jsr311-api in Authz Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-08-23 00:40:35 +08:00
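To illustrate what "shade and relocate" means for the class in the stack trace, here is a small sketch of the name rewriting a relocation performs. The prefix `org/apache/kyuubi/shade/` is the one visible in the follow-up commit's error message; the function `relocate` and its `patterns` parameter are hypothetical helpers for illustration and do not reproduce the actual Maven shade configuration.

```python
# Prefix seen in the follow-up error: org/apache/kyuubi/shade/javax/ws/rs/core/Cookie
SHADE_PREFIX = "org/apache/kyuubi/shade/"


def relocate(class_path: str, patterns=("javax/ws/rs/",)) -> str:
    """Return the relocated internal class name if it matches a pattern.

    Hypothetical helper: a real shade plugin rewrites bytecode references,
    this only shows the resulting name mapping.
    """
    if any(class_path.startswith(p) for p in patterns):
        return SHADE_PREFIX + class_path
    return class_path


print(relocate("javax/ws/rs/core/Cookie"))
print(relocate("jakarta/ws/rs/core/Cookie"))
```

Once references are rewritten to the relocated name, the relocated classes themselves must also be bundled in the shaded jar; otherwise the lookup fails with the relocated name instead of the original one, which is the situation the follow-up commit addresses.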
xorsum	d414535cb6	[KYUUBI #6582 ] [KYUUBI-6581] Zorder clause syntax does not support special characters # 🔍 Description ## Issue References 🔗 This pull request fixes #6581 ## Describe Your Solution 🔧 Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. I modified `KyuubiSparkSQLAstBuilder#visitMultipartIdentifier` and implemented `KyuubiSparkSQLAstBuilder#visitQuotedIdentifier` to process the quoted identifiers. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests ``` extensions/spark/kyuubi-extension-spark-3-3/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala test("optimize sort by backquoted column name") ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6582 from XorSum/features/zorder-backquote. Closes #6582 16ffa1238 [xorsum] zorder by support quote Authored-by: xorsum <xorsum@outlook.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-08-06 13:39:25 +08:00
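The quoted-identifier handling described in the Zorder commit can be sketched as follows. This is an illustrative stand-alone parser, not the `KyuubiSparkSQLAstBuilder` code: it assumes Spark SQL's convention that identifiers may be wrapped in backticks and that a doubled backtick inside a quoted identifier stands for a literal backtick. The function name is hypothetical.

```python
from typing import List


def parse_multipart_identifier(text: str) -> List[str]:
    """Split a dotted identifier into parts, honoring backtick quoting."""
    parts: List[str] = []
    buf: List[str] = []
    quoted = False
    i = 0
    while i < len(text):
        ch = text[i]
        if quoted:
            if ch == "`":
                # A doubled backtick inside quotes is an escaped backtick.
                if i + 1 < len(text) and text[i + 1] == "`":
                    buf.append("`")
                    i += 2
                    continue
                quoted = False
            else:
                buf.append(ch)
        elif ch == "`":
            quoted = True
        elif ch == ".":
            # An unquoted dot separates identifier parts.
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    if quoted:
        raise ValueError("unterminated quoted identifier: " + text)
    parts.append("".join(buf))
    return parts
```

With this sketch, a column such as `` `my-col` `` keeps its special character, and a dot inside backticks is not treated as a separator.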
joey.ljy	80c8e38066	[KYUUBI #6564 ] Insert into table check the privilege of table # 🔍 Description ## Issue References 🔗 This pull request fixes #6564 ## Describe Your Solution 🔧 Remove the `columnDesc` for `InsertIntoHadoopFsRelationCommand ` and `InsertIntoHiveTable ` in `table_command_spec.json` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [x] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ Insert into table will check the privilege of columns. #### Behavior With This Pull Request 🎉 Insert into table will check the privilege of table. #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6570 from liujiayi771/insert-permission. Closes #6564 d956aa916 [joey.ljy] Fix ut d282f8ec5 [joey.ljy] insert into table check the privilege of table Authored-by: joey.ljy <joey.ljy@alibaba-inc.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-08-05 16:58:24 +08:00
caoyu	d9d2109070	[KYUUBI #6541 ] [AUTHZ] Fix DataSourceV2RelationTableExtractor can't get the 'database' attribute if it's a Paimon plan. # 🔍 Description ## Issue References 🔗 This pull request fixes #6541 ## Describe Your Solution 🔧 Fix an issue where DataSourceV2RelationTableExtractor#table could not fetch the ‘database’ attribute causing the Ranger checks to fail when using the Paimon Catalog. If the database attribute is not resolved, use DataSourceV2RelationTableExtractor#identifier to complete it. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6544 from promising-forever/issues/6541. Closes #6541 6549f8528 [caoyu] Fix test failure, paimon-spark run on Scala 2.12. c1a09214a [caoyu] Optimising the 'database' capture logic 69fb0bc7e [caoyu] PolicyJsonFileGenerator#genPolicies add paimonNamespace c89c70bad [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan. 77f121b0d [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan. 9cfb5847b [caoyu] [KYUUBI #6541] [AUTHZ] Fix DataSourceV2RelationTableExtractor#table can't get the 'database' attribute if it's a Paimon plan. Authored-by: caoyu <caoy.5@jifenn.com> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-07-28 23:25:04 +08:00
huangxiaoping	0f6d7643ae	[KYUUBI #6554 ] Delete redundant code related to zorder # 🔍 Description ## Issue References 🔗 This pull request fixes #6554 ## Describe Your Solution 🔧 - Delete `/kyuubi/extensions/spark/kyuubi-extension-spark-3-x/src/main/scala/org/apache/kyuubi/sql/zorder/InsertZorderBeforeWritingBase.scala` file - Rename `InsertZorderBeforeWriting33.scala` to `InsertZorderBeforeWriting.scala` - Rename `InsertZorderHelper33, InsertZorderBeforeWritingDatasource33, InsertZorderBeforeWritingHive33, ZorderSuiteSpark33` to `InsertZorderHelper, InsertZorderBeforeWritingDatasource, InsertZorderBeforeWritingHive, ZorderSuiteSpark` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6555 from huangxiaopingRD/6554. Closes #6554 26de4fa09 [huangxiaoping] [KYUUBI #6554] Delete redundant code related to zorder Authored-by: huangxiaoping <1754789345@qq.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-07-23 12:14:55 +08:00
huangxiaoping	ec232c18b5	[KYUUBI #6551 ] Allow insert zorder when global sort is false and the plan is Repartition or RepartitionByExpression. # 🔍 Description ## Issue References 🔗 This pull request fixes #6551 ## Describe Your Solution 🔧 Update `canInsertZorder` to allow insert zorder when global sort is `false` and the plan is `Repartition` or `RepartitionByExpression`. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests /kyuubi-extension-spark-common/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6552 from huangxiaopingRD/6551. Closes #6551 b597443c3 [huangxiaoping] Fix code style 618594667 [huangxiaoping] [KYUUBI #6551] Allow insert zorder when when the plan is Repartition or RepartitionByExpression Authored-by: huangxiaoping <1754789345@qq.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2024-07-23 09:36:21 +08:00
Cheng Pan	063a192c7a	[KYUUBI #6545 ] Deprecate and remove building support for Spark 3.2 # 🔍 Description This pull request aims to remove building support for Spark 3.2, while still keeping the engine support for Spark 3.2. Mailing list discussion: https://lists.apache.org/thread/l74n5zl1w7s0bmr5ovxmxq58yqy8hqzc - Remove Maven profile `spark-3.2`, and references on docs, release scripts, etc. - Keep the cross-version verification to ensure that the Spark SQL engine built on the default Spark version (3.5) still works well on Spark 3.2 runtime. - Merge `kyuubi-extension-spark-common` into `kyuubi-extension-spark-3-3` - Remove `log4j.properties` as Spark moves to Log4j2 since 3.3 (SPARK-37814) ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [x] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6545 from pan3793/deprecate-spark-3.2. Closes #6545 54c172528 [Cheng Pan] fix f4602e805 [Cheng Pan] Deprecate and remove building support for Spark 3.2 2e083f89f [Cheng Pan] fix style 458a92c53 [Cheng Pan] nit 929e1df36 [Cheng Pan] Deprecate and remove building support for Spark 3.2 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-07-22 11:59:34 +08:00
huangxiaoping	4040acb321	[KYUUBI #6546 ] Update incorrect descriptions in Zorder related configurations # 🔍 Description ## Issue References 🔗 This pull request fixes #6546 ## Describe Your Solution 🔧 Fix incorrect documentation description. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6547 from huangxiaopingRD/6546. Closes #6546 fab3b93c2 [huangxiaoping] Merge remote-tracking branch 'origin/6546' 17bd5ea0d [huangxiaoping] [KYUUBI #6546] Fix incorrect documentation description 8f53a8911 [huangxiaoping] [KYUUBI #6546] Fix incorrect documentation description 449d3f1ea [huangxiaoping] [KYUUBI #6546] Fix incorrect documentation description Authored-by: huangxiaoping <1754789345@qq.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-07-19 10:27:39 +08:00
Cheng Pan	03dcedd89e	[KYUUBI #6453 ] Make KSHC support Spark 4.0 and enable CI for Spark 4.0 # 🔍 Description This PR makes KSHC support Spark 4.0, and also makes sure that the KSHC jar compiled against Spark 3.5 is binary compatible with Spark 4.0. We are ready to enable CI for Spark 4.0, except for authZ module. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6453 from pan3793/spark4-ci. Closes #6453 695e3d7f7 [Cheng Pan] Update pom.xml 2eaa0f88a [Cheng Pan] Update .github/workflows/master.yml b1f540a34 [Cheng Pan] cross test 562839982 [Cheng Pan] fix 9f0c2e1be [Cheng Pan] fix 45f182462 [Cheng Pan] kshc 227ef5bae [Cheng Pan] fix 690a3b8b2 [Cheng Pan] Revert "fix" 87fe7678b [Cheng Pan] fix 60f55dbed [Cheng Pan] CI for Spark 4. Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-07 11:01:24 +08:00
Cheng Pan	a07c57f064	[KYUUBI #6427 ] Extract data lake artifact names as maven properties # 🔍 Description Improve data lake dependency management by extracting the following Maven properties: - `delta.artifact` - `hudi.artifact` - `iceberg.artifact` - `paimon.artifact` It often takes a while for the downstream data lakes to support the new Spark versions, extracting those properties makes it easy to override in the new profile on the Kyuubi project's `pom.xml` to workaround before data lakes jars are available. One use case is `a19bb7c18e` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6427 from pan3793/datalake-dep. Closes #6427 74a9300e0 [Cheng Pan] Improve datalake dependency management Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-05 15:23:45 +08:00
senmiaoliu	bb92128131	[KYUUBI #6447 ] Use static regex Pattern instances in JavaUtils.timeStringAs and JavaUtils.byteStringAs # 🔍 Description ## Issue References 🔗 This pull request fixes #6447 ## Describe Your Solution 🔧 Use static regex Pattern instances in JavaUtils.timeStringAs and JavaUtils.byteStringAs ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6448 from lsm1/branch-kyuubi-6447. Closes #6447 467066ce5 [senmiaoliu] Use static regex Pattern instances in JavaUtils Authored-by: senmiaoliu <senmiaoliu@trip.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-05 13:29:26 +08:00
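The idea behind the static Pattern change is to compile a regular expression once and reuse it on every call instead of recompiling it per invocation. A minimal Python analogue is sketched below, with a module-level compiled pattern playing the role of a Java `static final Pattern`. The function name, the regex, and the suffix table are assumptions chosen for illustration (a simplified subset), not a copy of `JavaUtils.timeStringAs`.

```python
import re

# Compiled once at import time and shared by all calls (the "static" part).
_TIME_PATTERN = re.compile(r"(-?[0-9]+)([a-z]+)?")

# Hypothetical, simplified suffix table mapping units to milliseconds.
_SUFFIX_TO_MILLIS = {
    "ms": 1,
    "s": 1000,
    "m": 60 * 1000,
    "min": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
}


def time_string_as_millis(text: str, default_suffix: str = "ms") -> int:
    """Parse strings like "10s" or "2min" into milliseconds."""
    match = _TIME_PATTERN.fullmatch(text.strip().lower())
    if match is None:
        raise ValueError("failed to parse time string: " + text)
    suffix = match.group(2) or default_suffix
    if suffix not in _SUFFIX_TO_MILLIS:
        raise ValueError("invalid time suffix: " + suffix)
    return int(match.group(1)) * _SUFFIX_TO_MILLIS[suffix]
```

The behavior is the same as compiling the pattern inside the function; the only difference is that the compilation cost is paid once rather than on each call.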
Cheng Pan	6933a91588	[KYUUBI #6451 ] Bump Hudi 0.15.0 and enable Hudi authZ test for Spark 3.5 # 🔍 Description Kyuubi uses the Hudi Spark bundle jar in the authZ module for testing. Hudi 0.15 brings Spark 3.5 and Scala 2.13 support, and this change also removes the hacky workaround for profile `spark-3.5`. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6451 from pan3793/hudi-0.15. Closes #6451 98d6e97c5 [Cheng Pan] fix 2d31307da [Cheng Pan] remove spark-authz-hudi-test 8896f8c3f [Cheng Pan] Enable hudi test 7e9a7c7ae [Cheng Pan] Bump Hudi 0.15.0 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-05 12:33:29 +08:00
Cheng Pan	1fb1f854eb	[KYUUBI #6439 ] kyuubi-util-scala test jar leaked to compile scope # 🔍 Description The `kyuubi-util-scala_2.12-<version>-tests.jar` accidentally leaked to the compile scope but should be in the test scope. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Run `build/dist` and check `dist/jars` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6439 from pan3793/util-scala-test. Closes #6439 0576248f5 [Cheng Pan] fix 2bf2408f5 [Cheng Pan] fix f7151dfc6 [Cheng Pan] kyuubi-util-scala test jar leaked to compile scope Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-04 11:31:58 +08:00
zhouyifan279	7bf0f57239	[KYUUBI #6441 ] Kyuubi Spark TPC-DS/H Connector cross version test # 🔍 Description ## Issue References 🔗 This pull request adds cross-version tests for Kyuubi Spark TPC-DS Connector and TPC-H Connector. ## Describe Your Solution 🔧 Add TPC-DS Connector and TPC-H Connector into GitHub Actions job `spark-connector-cross-version-test`. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6441 from zhouyifan279/tcp-ds/h-cross-version. Closes #6441 c2abc468a [zhouyifan279] Kyuubi Spark TPC-DS/H Connector cross version test Authored-by: zhouyifan279 <zhouyifan279@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-03 11:08:47 +08:00
zhouyifan279	3ed912f5de	[KYUUBI #6247 ] Make KSHC binary compatible with multiple Spark versions # 🔍 Description ## Issue References 🔗 This pull request closes #6247 This also closes #6431 ## Describe Your Solution 🔧 Add a job `spark-connector-cross-version-test` in GitHub Actions to: 1. Build KSHC package with maven opt `-Pspark-3.5` 2. Run KSHC tests with maven opt `-Pspark-3.3` and `-Pspark-3.4` and KSHC package built in step 1 3. Fix the binary-compatible issue via reflection. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6436 from zhouyifan279/kshc-cross-version-test. Closes #6247 d3ac2ef47 [zhouyifan279] Tune the KSHC code to fix binary-compatible issues 4e14edcb5 [zhouyifan279] Fix invalid unit-tests-log name 56ca45d18 [zhouyifan279] Fix invalid unit-tests-log name 4c5ab7b9e [zhouyifan279] Update test log name 8a84e8812 [zhouyifan279] Add matrix scala 17cb67155 [zhouyifan279] [KYUUBI #6247] KSHC cross-version test Authored-by: zhouyifan279 <zhouyifan279@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-01 20:13:41 +08:00
Cheng Pan	82441671a5	[KYUUBI #6424 ] TPC-H/DS connector support Spark 4.0 # 🔍 Description Adapt changes in SPARK-45857 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 ``` build/mvn -pl ':kyuubi-spark-connector-tpch_2.13,:kyuubi-spark-connector-tpcds_2.13' \ -Pscala-2.13 -Pspark-master -am clean install -DskipTests build/mvn -pl ':kyuubi-spark-connector-tpch_2.13,:kyuubi-spark-connector-tpcds_2.13' \ -Pscala-2.13 -Pspark-master test ``` ``` [INFO] ------------------------------------------------------------------------ [INFO] Reactor Summary for Kyuubi Spark TPC-DS Connector 1.10.0-SNAPSHOT: [INFO] [INFO] Kyuubi Spark TPC-DS Connector ...................... SUCCESS [ 53.699 s] [INFO] Kyuubi Spark TPC-H Connector ....................... SUCCESS [ 30.511 s] [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 01:24 min [INFO] Finished at: 2024-05-27T06:01:58Z [INFO] ------------------------------------------------------------------------ ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6424 from pan3793/tpc-conn-4. Closes #6424 9012a177f [Cheng Pan] TPC-H/DS connector support Spark 4.0 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-05-27 07:02:52 +00:00
Cheng Pan	522a28e1d5	[KYUUBI #6398 ] Fix lineage plugin UT for Spark 4.0 # 🔍 Description ``` build/mvn clean test -Pscala-2.13 -Pspark-master -pl :kyuubi-spark-lineage_2.13 ``` ``` - test group by * FAILED * org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.BINARY_OP_WRONG_TYPE] Cannot resolve "(b + c)" due to data type mismatch: the binary operator requires the input type ("NUMERIC" or "INTERVAL DAY TO SECOND" or "INTERVAL YEAR TO MONTH" or "INTERVAL"), not "STRING". SQLSTATE: 42K09; line 1 pos 59; 'InsertIntoStatement RelationV2[a#546, b#547, c#548] v2_catalog.db.t1 v2_catalog.db.t1, false, false, false +- 'Aggregate [a#543], [a#543, unresolvedalias('count(distinct (b#544 + c#545))), (count(distinct b#544) * count(distinct c#545)) AS (count(DISTINCT b) * count(DISTINCT c))#551L] +- SubqueryAlias v2_catalog.db.t2 +- RelationV2[a#543, b#544, c#545] v2_catalog.db.t2 v2_catalog.db.t2 at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.dataTypeMismatch(package.scala:73) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7(CheckAnalysis.scala:315) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7$adapted(CheckAnalysis.scala:302) at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243) at scala.collection.immutable.Vector.foreach(Vector.scala:1856) at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:243) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243) ... 
``` ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass UT. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6398 from pan3793/lineage-fix. Closes #6398 afce6b880 [Cheng Pan] Fix lineage plugin UT for Spark 4.0 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-05-20 22:03:48 +08:00
Cheng Pan	6bdf2bdaf8	[KYUUBI #6392 ] Support javax.servlet and jakarta.servlet co-exist # 🔍 Description This PR makes `javax.servlet` and `jakarta.servlet` co-exist, by introducing `javax.servlet-api-4.0.1` and upgrade `jakarta.servlet-api` to 5.0.0. (6.0.0 requires JDK 11) Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118 while Kyuubi still uses `javax.servlet` in other modules, we should allow them to co-exist for a while. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6392 from pan3793/servlet. Closes #6392 27d412599 [Cheng Pan] fix 9f1e72272 [Cheng Pan] other spark modules f4545dc76 [Cheng Pan] fix 313826fa7 [Cheng Pan] exclude 7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-05-20 21:09:30 +08:00
hezhao2	8edcb005ee	[KYUUBI #6315 ] Spark 3.5: MaxScanStrategy supports DSv2 # 🔍 Description ## Issue References 🔗 Now, MaxScanStrategy can be adopted to limit max scan file size in some datasources, such as Hive. Hopefully we can enhance MaxScanStrategy to include support for the datasourcev2. ## Describe Your Solution 🔧 get the statistics about files scanned through datasourcev2 API ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [x] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [x] Test coverage is ok - [x] Assignees are selected. - [ ] Minimum number of approvals - [ ] No changes are requested Be nice. Be informative. Closes #5852 from zhaohehuhu/dev-1213. 
Closes #6315 3c5b0c276 [hezhao2] reformat fb113d625 [hezhao2] disable the rule that checks the maxPartitions for dsv2 acc358732 [hezhao2] disable the rule that checks the maxPartitions for dsv2 c8399a021 [hezhao2] fix header 70c845bee [hezhao2] add UTs 3a0739686 [hezhao2] add ut 4d26ce131 [hezhao2] reformat f87cb072c [hezhao2] reformat b307022b8 [hezhao2] move code to Spark 3.5 73258c2ae [hezhao2] fix unused import cf893a0e1 [hezhao2] drop reflection for loading iceberg class dc128bc8e [hezhao2] refactor code 661834cce [hezhao2] revert code 6061f42ab [hezhao2] delete IcebergSparkPlanHelper 5f1c3c082 [hezhao2] fix b15652f05 [hezhao2] remove iceberg dependency fe620ca92 [hezhao2] enable MaxScanStrategy when accessing iceberg datasource Authored-by: hezhao2 <hezhao2@cisco.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-04-17 16:29:50 +08:00
amanraj2520	35d4b5f0c7	[KYUUBI #6212 ] Added audit handler shutdown to the shutdown hook # 🔍 Description This pull request fixes #6212 When Kyuubi cleans up Ranger related threads like PolicyRefresher, it should also shutdown the audit threads that include SolrZkClient. Otherwise Spark Driver keeps on running since SolrZkClient is a non-daemon thread. Added the cleanup as part of the shutdown hook that Kyuubi registers. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6233 from amanraj2520/auditShutdown. Closes #6212 e663d466c [amanraj2520] Refactored code ed293a9a4 [amanraj2520] Removed unused import 95a6814ad [amanraj2520] Added audit handler shutdown to the shutdown hook Authored-by: amanraj2520 <rajaman@microsoft.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-04-08 10:40:04 +08:00
Cheng Pan	b4f35d2c44	[KYUUBI #6267 ] Remove unused dependency management in POM # 🔍 Description This pull request removes unused dependency management in POM ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6267 from pan3793/clean-pom. Closes #6267 d19f719bf [Cheng Pan] Remove usued dependency management in POM Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-04-07 23:53:46 +08:00
Cheng Pan	4fcc5c72a2	[KYUUBI #6260 ] Clean up and improve comments for spark extensions # 🔍 Description This pull request - improves comments for SPARK-33832 - removes unused `spark.sql.analyzer.classification.enabled` (I didn't update the migration rules because this configuration seems never to work properly) ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Review --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6260 from pan3793/nit. Closes #6260 d762d30e9 [Cheng Pan] update comment 4ebaa04ea [Cheng Pan] nit b303f05bb [Cheng Pan] remove spark.sql.analyzer.classification.enabled b021cbc0a [Cheng Pan] Improve docs for SPARK-33832 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-04-07 18:20:14 +08:00
wforget	ad612349fb	[KYUUBI #6215 ] Improve DropIgnoreNonexistent rule for Spark 3.5 # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 Improve DropIgnoreNonexistent rule for spark 3.5 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [X] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests DropIgnoreNonexistentSuite --- # Checklist 📝 - [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6215 from wForget/hotfix2. Closes #6215 cb1d34de1 [wforget] Improve DropIgnoreNonexistent rule for spark 3.5 Authored-by: wforget <643348094@qq.com> Signed-off-by: wforget <643348094@qq.com>	2024-03-29 10:51:46 +08:00
wforget	9114e507c4	[KYUUBI #6211 ] Check memory offHeap enabled for CustomResourceProfileExec # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 We should check `spark.memory.offHeap.enabled` when applying for `executorOffHeapMemory`. ## Types of changes 🔖 - [X] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6211 from wForget/hotfix. Closes #6211 1c7c8cd75 [wforget] Check memory offHeap enabled for CustomResourceProfileExec Authored-by: wforget <643348094@qq.com> Signed-off-by: wforget <643348094@qq.com>	2024-03-28 13:17:59 +08:00
Cheng Pan	3b9f25b62d	[KYUUBI #6197 ] Revise dependency management of Spark authZ plugin # 🔍 Description ## Issue References 🔗 The POM of `kyuubi-spark-authz-shaded` is redundant, just pull `kyuubi-spark-authz` is necessary. The current dependency management does not work on Ranger 2.1.0, this patch cleans up the POM definition and fixes the compatibility with Ranger 2.1.0 ## Describe Your Solution 🔧 Carefully revise the dependency list and exclusion. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 perform packing kyuubi-spark-authz-shaded module. ``` build/mvn clean install -pl extensions/spark/kyuubi-spark-authz-shaded -am -DskipTests ``` before ``` [INFO] --- maven-shade-plugin:3.5.2:shade (default) kyuubi-spark-authz-shaded_2.12 --- [INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12:jar:1.10.0-SNAPSHOT in the shaded jar. [INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12:jar:1.10.0-SNAPSHOT in the shaded jar. [INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar. [INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar. [INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar. [INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar. [INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar. 
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar. ``` after ``` [INFO] --- maven-shade-plugin:3.5.2:shade (default) kyuubi-spark-authz-shaded_2.12 --- [INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12:jar:1.10.0-SNAPSHOT in the shaded jar. [INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12:jar:1.10.0-SNAPSHOT in the shaded jar. [INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar. [INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar. [INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar. [INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar. [INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar. [INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugin-classloader:jar:2.4.0 in the shaded jar. [INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar. ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6197 from pan3793/authz-dep. 
Closes #6197 d0becabce [Cheng Pan] 2.4 47e38502a [Cheng Pan] ranger 2.4 af01f7ed5 [Cheng Pan] test ranger 2.1 203aff3b3 [Cheng Pan] ranger-plugins-cred 974d76b03 [Cheng Pan] Resive dependency management of authz e5154f30f [Cheng Pan] improve authz deps Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-22 10:30:30 +08:00

434 Commits