471237be92
394 Commits
**82441671a5** [KYUUBI #6424] TPC-H/DS connector support Spark 4.0

# 🔍 Description

Adapt changes in SPARK-45857.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

```
build/mvn -pl ':kyuubi-spark-connector-tpch_2.13,:kyuubi-spark-connector-tpcds_2.13' \
  -Pscala-2.13 -Pspark-master -am clean install -DskipTests
build/mvn -pl ':kyuubi-spark-connector-tpch_2.13,:kyuubi-spark-connector-tpcds_2.13' \
  -Pscala-2.13 -Pspark-master test
```

```
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Kyuubi Spark TPC-DS Connector 1.10.0-SNAPSHOT:
[INFO]
[INFO] Kyuubi Spark TPC-DS Connector ...................... SUCCESS [ 53.699 s]
[INFO] Kyuubi Spark TPC-H Connector ....................... SUCCESS [ 30.511 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:24 min
[INFO] Finished at: 2024-05-27T06:01:58Z
[INFO] ------------------------------------------------------------------------
```

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6424 from pan3793/tpc-conn-4.

Closes #6424

9012a177f [Cheng Pan] TPC-H/DS connector support Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**522a28e1d5** [KYUUBI #6398] Fix lineage plugin UT for Spark 4.0

# 🔍 Description

```
build/mvn clean test -Pscala-2.13 -Pspark-master -pl :kyuubi-spark-lineage_2.13
```

```
- test group by *** FAILED ***
  org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.BINARY_OP_WRONG_TYPE] Cannot resolve "(b + c)" due to data type mismatch: the binary operator requires the input type ("NUMERIC" or "INTERVAL DAY TO SECOND" or "INTERVAL YEAR TO MONTH" or "INTERVAL"), not "STRING". SQLSTATE: 42K09; line 1 pos 59;
  'InsertIntoStatement RelationV2[a#546, b#547, c#548] v2_catalog.db.t1 v2_catalog.db.t1, false, false, false
  +- 'Aggregate [a#543], [a#543, unresolvedalias('count(distinct (b#544 + c#545))), (count(distinct b#544) * count(distinct c#545)) AS (count(DISTINCT b) * count(DISTINCT c))#551L]
     +- SubqueryAlias v2_catalog.db.t2
        +- RelationV2[a#543, b#544, c#545] v2_catalog.db.t2 v2_catalog.db.t2
  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.dataTypeMismatch(package.scala:73)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7(CheckAnalysis.scala:315)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7$adapted(CheckAnalysis.scala:302)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
  at scala.collection.immutable.Vector.foreach(Vector.scala:1856)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:243)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
  ...
```

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass UT.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6398 from pan3793/lineage-fix.

Closes #6398

afce6b880 [Cheng Pan] Fix lineage plugin UT for Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**6bdf2bdaf8** [KYUUBI #6392] Support javax.servlet and jakarta.servlet co-exist

# 🔍 Description

This PR makes `javax.servlet` and `jakarta.servlet` co-exist by introducing `javax.servlet-api-4.0.1` and upgrading `jakarta.servlet-api` to 5.0.0 (6.0.0 requires JDK 11).

Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118, while Kyuubi still uses `javax.servlet` in other modules, so we should allow them to co-exist for a while.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6392 from pan3793/servlet.

Closes #6392

27d412599 [Cheng Pan] fix 9f1e72272 [Cheng Pan] other spark modules f4545dc76 [Cheng Pan] fix 313826fa7 [Cheng Pan] exclude 7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**8edcb005ee** [KYUUBI #6315] Spark 3.5: MaxScanStrategy supports DSv2

# 🔍 Description

## Issue References 🔗

Now, MaxScanStrategy can be adopted to limit the max scan file size for some data sources, such as Hive. Hopefully we can enhance MaxScanStrategy to include support for DataSourceV2.

## Describe Your Solution 🔧

Get the statistics about the files scanned through the DataSourceV2 API.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes #5852 from zhaohehuhu/dev-1213.

Closes #6315

3c5b0c276 [hezhao2] reformat fb113d625 [hezhao2] disable the rule that checks the maxPartitions for dsv2 acc358732 [hezhao2] disable the rule that checks the maxPartitions for dsv2 c8399a021 [hezhao2] fix header 70c845bee [hezhao2] add UTs 3a0739686 [hezhao2] add ut 4d26ce131 [hezhao2] reformat f87cb072c [hezhao2] reformat b307022b8 [hezhao2] move code to Spark 3.5 73258c2ae [hezhao2] fix unused import cf893a0e1 [hezhao2] drop reflection for loading iceberg class dc128bc8e [hezhao2] refactor code 661834cce [hezhao2] revert code 6061f42ab [hezhao2] delete IcebergSparkPlanHelper 5f1c3c082 [hezhao2] fix b15652f05 [hezhao2] remove iceberg dependency fe620ca92 [hezhao2] enable MaxScanStrategy when accessing iceberg datasource

Authored-by: hezhao2 <hezhao2@cisco.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
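The idea behind this change (limiting a query by the scan size that a DSv2 source reports) can be sketched as follows. The types below are illustrative stand-ins, not Kyuubi's actual code, which works against Spark's DSv2 statistics-reporting interfaces:

```java
// Hypothetical model of a DSv2-style max-scan check: ask the scan for its
// estimated size and reject the query when it exceeds a configured limit.
import java.util.OptionalLong;

interface ReportsStatistics {
    OptionalLong sizeInBytes(); // estimated bytes the scan will read, if known
}

final class MaxScanChecker {
    private final long maxScanBytes;

    MaxScanChecker(long maxScanBytes) {
        this.maxScanBytes = maxScanBytes;
    }

    /** Throws when the scan reports more bytes than the configured limit. */
    void check(ReportsStatistics scan) {
        OptionalLong size = scan.sizeInBytes();
        if (size.isPresent() && size.getAsLong() > maxScanBytes) {
            throw new IllegalStateException(
                "Scan size " + size.getAsLong() + " exceeds limit " + maxScanBytes);
        }
        // Unknown statistics: let the query proceed rather than guess.
    }
}
```

A design point worth noting: when the source reports no statistics, the sketch lets the query run instead of failing, since a missing estimate is not evidence of a large scan.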
---

**35d4b5f0c7** [KYUUBI #6212] Added audit handler shutdown to the shutdown hook

# 🔍 Description

This pull request fixes #6212. When Kyuubi cleans up Ranger-related threads like the PolicyRefresher, it should also shut down the audit threads, which include SolrZkClient; otherwise the Spark driver keeps running, since SolrZkClient is a non-daemon thread. The cleanup is added as part of the shutdown hook that Kyuubi registers.

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6233 from amanraj2520/auditShutdown.

Closes #6212

e663d466c [amanraj2520] Refactored code ed293a9a4 [amanraj2520] Removed unused import 95a6814ad [amanraj2520] Added audit handler shutdown to the shutdown hook

Authored-by: amanraj2520 <rajaman@microsoft.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
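The shape of this fix can be sketched with plain JVM primitives. The class and thread names below are made up for illustration (Kyuubi's real code calls the Ranger audit provider's shutdown); the point is that a non-daemon thread blocks JVM exit until something stops it from a shutdown hook:

```java
// Illustrative sketch: register audit-handler cleanup in a JVM shutdown hook so
// a non-daemon audit thread (standing in for the one behind SolrZkClient)
// cannot keep the driver JVM alive after the engine stops.
final class AuditCleanupDemo {
    // Stand-in for the non-daemon thread a real audit client would start.
    static final Thread auditThread = new Thread(() -> {
        try {
            Thread.sleep(Long.MAX_VALUE);
        } catch (InterruptedException ignored) {
            // interrupted during shutdown: fall through and let the thread die
        }
    }, "audit-solr-zk-client");

    static void startAudit() {
        auditThread.setDaemon(false); // non-daemon: blocks JVM exit if never stopped
        auditThread.start();
    }

    // In the real fix this would invoke the Ranger audit handler's shutdown.
    static void stopAudit() {
        auditThread.interrupt();
    }

    static void registerShutdownHook() {
        // Without this hook, the JVM (e.g. the Spark driver) would hang on exit.
        Runtime.getRuntime().addShutdownHook(
            new Thread(AuditCleanupDemo::stopAudit, "audit-cleanup"));
    }
}
```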
---

**b4f35d2c44** [KYUUBI #6267] Remove unused dependency management in POM

# 🔍 Description

This pull request removes unused dependency management in the POM.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6267 from pan3793/clean-pom.

Closes #6267

d19f719bf [Cheng Pan] Remove usued dependency management in POM

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**4fcc5c72a2** [KYUUBI #6260] Clean up and improve comments for spark extensions

# 🔍 Description

This pull request
- improves comments for SPARK-33832
- removes unused `spark.sql.analyzer.classification.enabled` (I didn't update the migration rules because this configuration seems never to work properly)

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Review

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6260 from pan3793/nit.

Closes #6260

d762d30e9 [Cheng Pan] update comment 4ebaa04ea [Cheng Pan] nit b303f05bb [Cheng Pan] remove spark.sql.analyzer.classification.enabled b021cbc0a [Cheng Pan] Improve docs for SPARK-33832

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**ad612349fb** [KYUUBI #6215] Improve DropIgnoreNonexistent rule for Spark 3.5

# 🔍 Description

## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

Improve the DropIgnoreNonexistent rule for Spark 3.5.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

DropIgnoreNonexistentSuite

---

# Checklist 📝
- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6215 from wForget/hotfix2.

Closes #6215

cb1d34de1 [wforget] Improve DropIgnoreNonexistent rule for spark 3.5

Authored-by: wforget <643348094@qq.com>
Signed-off-by: wforget <643348094@qq.com>
---

**9114e507c4** [KYUUBI #6211] Check memory offHeap enabled for CustomResourceProfileExec

# 🔍 Description

## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

We should check `spark.memory.offHeap.enabled` when applying for `executorOffHeapMemory`.

## Types of changes 🔖
- [X] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝
- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6211 from wForget/hotfix.

Closes #6211

1c7c8cd75 [wforget] Check memory offHeap enabled for CustomResourceProfileExec

Authored-by: wforget <643348094@qq.com>
Signed-off-by: wforget <643348094@qq.com>
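The guard this entry describes can be sketched as below. The configuration keys are Spark's real keys; the helper class itself is illustrative, not Kyuubi's actual code:

```java
// Hedged sketch: only request executor off-heap memory when
// spark.memory.offHeap.enabled is set to true in the Spark conf.
import java.util.Map;

final class OffHeapMemoryGuard {
    /** Off-heap bytes to request for the executor, or 0 when off-heap is disabled. */
    static long executorOffHeapBytes(Map<String, String> sparkConf) {
        boolean enabled = Boolean.parseBoolean(
                sparkConf.getOrDefault("spark.memory.offHeap.enabled", "false"));
        if (!enabled) {
            return 0L; // the fix: don't request off-heap memory Spark won't use
        }
        return Long.parseLong(sparkConf.getOrDefault("spark.memory.offHeap.size", "0"));
    }
}
```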
---

**3b9f25b62d** [KYUUBI #6197] Revise dependency management of Spark authZ plugin

# 🔍 Description

## Issue References 🔗

The POM of `kyuubi-spark-authz-shaded` is redundant; pulling in `kyuubi-spark-authz` is all that is necessary. The current dependency management does not work with Ranger 2.1.0; this patch cleans up the POM definition and fixes compatibility with Ranger 2.1.0.

## Describe Your Solution 🔧

Carefully revise the dependency list and exclusions.

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Perform packaging of the `kyuubi-spark-authz-shaded` module.

```
build/mvn clean install -pl extensions/spark/kyuubi-spark-authz-shaded -am -DskipTests
```

before

```
[INFO] --- maven-shade-plugin:3.5.2:shade (default) kyuubi-spark-authz-shaded_2.12 ---
[INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar.
[INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar.
```

after

```
[INFO] --- maven-shade-plugin:3.5.2:shade (default) kyuubi-spark-authz-shaded_2.12 ---
[INFO] Including org.apache.kyuubi:kyuubi-spark-authz_2.12:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util-scala_2.12:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.kyuubi:kyuubi-util:jar:1.10.0-SNAPSHOT in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-common:jar:2.4.0 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-jaxrs:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-cred:jar:2.4.0 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-client:jar:1.19.4 in the shaded jar.
[INFO] Including com.sun.jersey:jersey-core:jar:1.19.4 in the shaded jar.
[INFO] Including com.kstruct:gethostname4j:jar:1.0.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna:jar:5.7.0 in the shaded jar.
[INFO] Including net.java.dev.jna:jna-platform:jar:5.7.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugin-classloader:jar:2.4.0 in the shaded jar.
[INFO] Including org.apache.ranger:ranger-plugins-audit:jar:2.4.0 in the shaded jar.
```

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6197 from pan3793/authz-dep.

Closes #6197

d0becabce [Cheng Pan] 2.4 47e38502a [Cheng Pan] ranger 2.4 af01f7ed5 [Cheng Pan] test ranger 2.1 203aff3b3 [Cheng Pan] ranger-plugins-cred 974d76b03 [Cheng Pan] Resive dependency management of authz e5154f30f [Cheng Pan] improve authz deps

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**74351c7e6b** [KYUUBI #6194] AuthZ shaded should include ranger-plugins-cred

# 🔍 Description

## Issue References 🔗

This pull request fixes a class not found issue.

```
Caused by: java.lang.ClassNotFoundException: org.apache.ranger.authorization.hadoop.utils.RangerCredentialProvider
...
```

## Describe Your Solution 🔧

`org.apache.ranger:ranger-plugins-cred` was missing from the include list.

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Manual test.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6194 from pan3793/auth-shaded.

Closes #6194

4eae524bd [Cheng Pan] Authz shaded should include ranger-plugins-cred

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**eb278c562d** [RELEASE] Bump 1.10.0-SNAPSHOT
---

**6297651d83** [KYUUBI #6163] Set default Spark version to 3.5

# 🔍 Description

## Issue References 🔗

Kyuubi fully supports Spark 3.5 now; this pull request aims to set the default Spark to 3.5 in Kyuubi 1.9.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6163 from pan3793/spark-3.5-default.

Closes #6163

f386aeb7a [Cheng Pan] Set default Spark version to 3.5

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**6a4f6f0c88** [KYUUBI #6168] Check if forcedMaxOutputRows is negative

# 🔍 Description

## Issue References 🔗

This pull request fixes #6168

## Describe Your Solution 🔧

Check if forcedMaxOutputRows is negative.

## Types of changes 🔖
- [X] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝
- [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6169 from wForget/KYUUBI-6168.

Closes #6168

b18d8e5f5 [wforget] fix style 057c5388b [wforget] Check if forcedMaxOutputRows is negative

Authored-by: wforget <643348094@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
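A minimal sketch of the guard, under an assumption hedged here: the PR title suggests a negative `forcedMaxOutputRows` should mean the feature is disabled, so no limit is injected. The class and method names are illustrative, not Kyuubi's:

```java
// Hypothetical guard: only force an output-row limit when the configured
// value is non-negative; negative values are treated as "feature disabled".
final class ForcedMaxOutputRowsGuard {
    /** True when a LIMIT should be forced onto the query's output rows. */
    static boolean shouldForceLimit(int forcedMaxOutputRows) {
        return forcedMaxOutputRows >= 0; // negative values disable the rule
    }
}
```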
---

**8cc9b98e25** [KYUUBI #5384][KSCH] Hive connector supports Spark 3.5

# 🔍 Description

## Issue References 🔗

This pull request fixes #5384

## Describe Your Solution 🔧

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6133 from Kwafoor/kyuubi_6073.

Closes #5384

9234e35ad [Cheng Pan] fix 7766dfda5 [Cheng Pan] nit e9da162f8 [Cheng Pan] nit 676bfb26e [Cheng Pan] pretty c241859af [Cheng Pan] pretty 0eedcf82c [wangjunbo] compat with spark 3.3 3d866546c [wangjunbo] format code a0898f50f [wangjunbo] delete Unused import 9577f7fe8 [wangjunbo] [KYUUBI #5384] kyuubi-spark-connector-hive supports Spark 3.5

Lead-authored-by: Cheng Pan <chengpan@apache.org>
Co-authored-by: wangjunbo <wangjunbo@qiyi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**5bee05e45f** [KYUUBI #6078] KSHC should handle the commit of the partitioned table as dynamic partition at write path

# 🔍 Description

## Issue References 🔗

This pull request fixes https://github.com/apache/kyuubi/issues/6078: KSHC should handle the commit of a partitioned table as a dynamic partition at the write path, because Apache Spark's DataSourceV2 write path uses dynamic partitioning to handle static partitions.

## Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6082 from Yikf/KYUUBI-6078.

Closes #6078

2ae183672 [yikaifei] KSHC should handle the commit of the partitioned table as dynamic partition at write path

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**f1cf1e42de** [KYUUBI #6131] Simplify Maven dependency management after dropping building support for Spark 3.1

# 🔍 Description

## Issue References 🔗

SPARK-33212 (fixed in 3.2.0) moved from `hadoop-client` to the shaded Hadoop client. To simplify the dependency management, we previously added some workarounds to handle Spark 3.1 dependency issues. As we have now removed building support for Spark 3.1, we can drop those workarounds to simplify `pom.xml`.

## Describe Your Solution 🔧

As above.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6131 from pan3793/3-1-cleanup.

Closes #6131

1341065a7 [Cheng Pan] nit 1d7323f6e [Cheng Pan] fix 9e2e3b747 [Cheng Pan] nit 271166b58 [Cheng Pan] test

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**0a0af165e3** [KYUUBI #6125] Drop Kyuubi extension for Spark 3.1

# 🔍 Description

## Issue References 🔗

This pull request is the next step of deprecating and removing support of Spark 3.1.

- VOTE: https://lists.apache.org/thread/670fx1qx7rm0vpvk8k8094q2d0fthw5b
- VOTE RESULT: https://lists.apache.org/thread/0zdxg5zjnc1wpxmw9mgtsxp1ywqt6qvb

## Describe Your Solution 🔧

Drop module `kyuubi-extension-spark-3-1` and delete Spark 3.1 specific codes.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6125 from pan3793/drop-spark-ext-3-1.

Closes #6125

212012f18 [Cheng Pan] fix style 021532ccd [Cheng Pan] doc 329f69ab9 [Cheng Pan] address comments 43fac4201 [Cheng Pan] fix a12c8062c [Cheng Pan] fix dcf51c1a1 [Cheng Pan] minor 814a187a6 [Cheng Pan] Drop Kyuubi extension for Spark 3.1

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**e0d706e696** [KYUUBI #6091] Deprecate and remove building support for Spark 3.1

# 🔍 Description

## Issue References 🔗

This pull request aims to remove building support for Spark 3.1, while still keeping the engine support for Spark 3.1.

- VOTE: https://lists.apache.org/thread/670fx1qx7rm0vpvk8k8094q2d0fthw5b
- VOTE RESULT: https://lists.apache.org/thread/0zdxg5zjnc1wpxmw9mgtsxp1ywqt6qvb

The next step is to clean up code in Spark extensions to drop 3.1-related code.

## Describe Your Solution 🔧

- Remove Maven profile `spark-3.1`, and references on docs, release scripts, etc.
- Keep the cross-version verification to ensure that the Spark SQL engine built on the default Spark version (3.4) still works well on Spark 3.1 runtime.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6091 from pan3793/remove-spark-3.1-profile.

Closes #6091

ce2983284 [Cheng Pan] nit 5887c808b [Cheng Pan] migration guide cf28096d3 [Cheng Pan] Log deprecation message on Spark SQL engine with 3.1 a467e618d [Cheng Pan] nit e11c0fb31 [Cheng Pan] Remove building support for Spark 3.1

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**07068a8416** [KYUUBI #6095] Enable PaimonTest for Spark 3.5

# 🔍 Description

## Issue References 🔗

This pull request enables PaimonTest for Spark 3.5.

## Describe Your Solution 🔧

As Paimon 0.7.0 already brings support for Spark 3.5, we should enable PaimonTest for Spark 3.5.

## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6095 from pan3793/paimon-spark-3.5.

Closes #6095

f55801b7f [Cheng Pan] Enable PaimonTest for Spark 3.5

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**97f7987689** [KYUUBI #5991] Error on reading Atlas properties composed of multi values

# 🔍 Description

## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #5993 from SwordyZhao/fix-issues-5991.

Closes #5991

827007d06 [swordy_zhao] run dev/reformat fix code style. 600363dd9 [swordy_zhao] delete scala.List,Convert a java.List to scala.List 7b000e94a [swordy_zhao] fix 5991--kyuubi failed to read atlas.rest.address 5de05764e [swordy_zhao] fix 5991--kyuubi failed to read atlas.rest.address

Authored-by: swordy_zhao <swordy_work@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
---

**f67140e650** [KYUUBI #5594][AUTHZ] BuildQuery should respect normal node's input

# 🔍 Description

## Issue References 🔗

This pull request fixes #5594

## Describe Your Solution 🔧

For the case

```
def filter_func(iterator):
    for pdf in iterator:
        yield pdf[pdf.id == 1]

df = spark.read.table("test_mapinpandas")
execute_result = df.mapInPandas(filter_func, df.schema).show()
```

the logical plan is

```
GlobalLimit 21
+- LocalLimit 21
   +- Project [cast(id#5 as string) AS id#11, name#6]
      +- MapInPandas filter_func(id#0, name#1), [id#5, name#6]
         +- HiveTableRelation [`default`.`test_mapinpandas`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#0, name#1], Partition Cols: []]
```

When handling `MapInPandas`, we didn't match its input with `HiveTableRelation`, so we missed the input table's columns. This PR fixes that: we remove the dedicated branch for each operator such as `Project`, `Aggregate`, etc., and handle them together.

## Types of changes 🔖
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

For the case above, we miss the column info of table `test_mapinpandas`.

#### Behavior With This Pull Request 🎉

We get the privilege object of table `test_mapinpandas` with its column info.

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5787 from AngersZhuuuu/KYUUBI-5594-approach2.

Closes #5594

e08545599 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 49f09fb0a [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 4781f75b9 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala 9e9208d38 [Angerszhuuuu] Update V2JdbcTableCatalogRangerSparkExtensionSuite.scala 626d3dd88 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 3d69997de [Angerszhuuuu] Update PrivilegesBuilderSuite.scala 6eb4b8e1a [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 61efb8ae3 [Angerszhuuuu] update 794ebb7be [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2 a236da86b [Angerszhuuuu] Update PrivilegesBuilderSuite.scala 74bd3f4d5 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 4acbc4276 [Angerszhuuuu] Merge branch 'KYUUBI-5594-approach2' of https://github.com/AngersZhuuuu/incubator-kyuubi into KYUUBI-5594-approach2 266f7e877 [Angerszhuuuu] update a6c784546 [Angerszhuuuu] Update PrivilegesBuilder.scala d785d5fdf [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2 014ef3b84 [Angerszhuuuu] Update PrivilegesBuilder.scala 7e1cd37a1 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2 71d266162 [Angerszhuuuu] update db9594170 [Angerszhuuuu] update 490eb95c2 [Angerszhuuuu] update 70d110e89 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594-approach2 e6a587718 [Angerszhuuuu] Update PrivilegesBuilder.scala 5ff22b103 [Angerszhuuuu] Update PrivilegesBuilder.scala e6843014b [Angerszhuuuu] Update PrivilegesBuilder.scala 594b202f7 [Angerszhuuuu] Update PrivilegesBuilder.scala 2f87c61e1 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 1de8c1c68 [Angerszhuuuu] Update PrivilegesBuilder.scala ad17255d7 [Angerszhuuuu] Update PrivilegesBuilderSuite.scala 4f5e8505f [Angerszhuuuu] update 64349ed97 [Angerszhuuuu] Update PrivilegesBuilder.scala 11b7a4c13 [Angerszhuuuu] Update PrivilegesBuilder.scala 9a58fb0c4 [Angerszhuuuu] update d0b022ec9 [Angerszhuuuu] Update RuleApplyPermanentViewMarker.scala e0f28a640 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594 0ebdd5de5 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594 8e53236ac [Angerszhuuuu] update 3bafa7ca5 [Angerszhuuuu] update d6e984e07 [Angerszhuuuu] update b00bf5e20 [Angerszhuuuu] Update PrivilegesBuilder.scala 821422852 [Angerszhuuuu] update 93fc6892b [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594 04184e39d [Angerszhuuuu] update 0bb762467 [Angerszhuuuu] Revert "Revert "Update PrivilegesBuilder.scala"" f481283ae [Angerszhuuuu] Revert "Update PrivilegesBuilder.scala" 9f871822f [Angerszhuuuu] Revert "Update PrivilegesBuilder.scala" 29b67c457 [Angerszhuuuu] Update PrivilegesBuilder.scala 8785ad1ab [Angerszhuuuu] Update PrivilegesBuilder.scala 270f21dcc [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 60872efcb [Angerszhuuuu] Update RangerSparkExtensionSuite.scala c34f32ea2 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594 86fc4756a [Angerszhuuuu] Update PrivilegesBuilder.scala 404f1ea4c [Angerszhuuuu] Update PrivilegesBuilder.scala dcca394e0 [Angerszhuuuu] Update PrivilegesBuilder.scala c2c6fa447 [Angerszhuuuu] Update PrivilegesBuilder.scala 6f6a36e5b [Angerszhuuuu] Merge branch 'master' into KYUUBI-5594]-AUTH]BuildQuery-should-respect-normal-node's-input 4dd47a124 [Angerszhuuuu] update c549b6a1a [Angerszhuuuu] update 80013b981 [Angerszhuuuu] Update PrivilegesBuilder.scala 3cbba422a [Angerszhuuuu] Update PrivilegesBuilder.scala

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
||
|
|
7806812cea |
[KYUUBI #6007] AuthZ should check hoodie procedures path resource privileges
# 🔍 Description ## Issue References 🔗 This pull request aims to make AuthZ check path resource privileges for Hoodie procedures. ## Describe Your Solution 🔧 Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ When a Hoodie procedure operates on a path, the check passes regardless of whether the user has permissions on the path resource. #### Behavior With This Pull Request 🎉 The path permissions are checked correctly. #### Related Unit Tests New tests added. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5972 from Yikf/hudi-call-path. Closes #6007 e7dd28be8 [yikaifei] AuthZ should check hoodie procedures path resource privileges Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: yikaifei <yikaifei@apache.org> |
||
|
|
b037325fcf
|
[KYUUBI #5964][BUG] Avoid check not fully optimized query for InsertIntoDataSourceDirCommand and InsertIntoDataSourceCommand
# 🔍 Description ## Issue References 🔗 This pull request fixes #5964 ## Describe Your Solution 🔧 The query of InsertIntoDataSourceDirCommand and InsertIntoDataSourceCommand is not fully optimized; checking that query directly would request privileges we never actually use. We can safely skip the check on the query itself, since we still check the plan generated from it, which requests the correct privileges for the SQL. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5983 from AngersZhuuuu/KYUUBI-5964. Closes #5964 1adcf8dd8 [Angerszhuuuu] update 7204c9fe5 [Angerszhuuuu] [KYUUBI-5964][BUG] Avoid check not fully optimized query for InsertIntoDataSourceDirCommand and InsertIntoDataSourceCommand Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Kent Yao <yao@apache.org> |
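The skip logic above can be sketched with a toy model. These case classes are simplified, hypothetical stand-ins, not the real Spark classes: the point is only that the command's raw, unoptimized query is excluded from privilege traversal, because the plan generated from it is checked separately.

```scala
// Toy stand-ins (NOT the real Spark classes) for illustration only.
sealed trait Plan { def children: Seq[Plan] }
case class Relation(table: String) extends Plan { def children: Seq[Plan] = Nil }
case class InsertIntoDataSourceDirCommand(query: Plan) extends Plan {
  // The raw, not-yet-optimized query is deliberately NOT exposed as a child,
  // so privilege collection does not descend into it.
  def children: Seq[Plan] = Nil
}

// Collect tables whose SELECT privilege would be requested.
def inputTables(plan: Plan): Seq[String] = plan match {
  case Relation(t) => Seq(t)
  case other       => other.children.flatMap(inputTables)
}
```

With this model, traversing an `InsertIntoDataSourceDirCommand` yields no input tables; only the fully optimized plan, checked later, contributes privilege requests.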
||
|
|
35d9b20969
|
[KYUUBI #5997][AUTHZ] Avoid unnecessary loop of RuleEliminateTypeOf
# 🔍 Description ## Issue References 🔗 This pull request fixes #5997 ## Describe Your Solution 🔧 Avoid unnecessary loop of RuleEliminateTypeOf, improve the catalyst performance ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5998 from AngersZhuuuu/KYUUBI-5997. Closes #5997 1db3b5f95 [Angerszhuuuu] [KYUUBI #5997][Improvement] Avoid unnecessary loop of RuleEliminateTypeOf Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
||
|
|
d3a38533e5
|
[KYUUBI #5937] PVM cause cache table not work
# 🔍 Description ## Issue References 🔗 This pull request fixes #5937 ## Describe Your Solution 🔧 If we cache a table whose query references a persistent view, Kyuubi AuthZ wraps the view with a PVM on the analyzed plan (cache registration uses the analyzed plan), but cache matching uses the canonicalized plan, so we need to implement the `doCanonicalize()` method to remove the impact of the PVM; otherwise the cached table can never be matched. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5982 from AngersZhuuuu/KYUUBI-5937. Closes #5937 e28275f32 [Angerszhuuuu] Update PermanentViewMarker.scala c504103d2 [Angerszhuuuu] Update PermanentViewMarker.scala 19102ff53 [Angerszhuuuu] [KYUUBI-5937][Bug] PVM cause cache table not work Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
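A minimal sketch of this fix, using hypothetical stand-in types rather than the real Spark `LogicalPlan` API: `doCanonicalize()` (modeled here as a `canonicalized` method) returns the child's canonical form, so the marker vanishes from the representation that cache matching compares.

```scala
// Toy stand-ins (NOT the real Spark LogicalPlan API) for illustration only.
sealed trait Plan {
  // Default: a plan is its own canonical form.
  def canonicalized: Plan = this
}
case class View(name: String) extends Plan
case class PermanentViewMarker(child: Plan) extends Plan {
  // Canonicalize to the child's canonical form, so the marker never
  // appears in the representation used for cache matching.
  override def canonicalized: Plan = child.canonicalized
}
```

With this override, a plan cached without the marker still matches the same plan wrapped by AuthZ, because both canonicalize to the bare view.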
||
|
|
e9e2d189ba
|
[KYUUBI #5985] [AUTHZ][MINOR] Remove incorrect getUri method
# 🔍 Description ## Issue References 🔗 This pull request remove incorrect getUri method in authZ module, This method is currently not applicable in any context, and it is incorrect as it ought to return a List type rather than a String. ## Describe Your Solution 🔧 Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5985 from Yikf/remove-incorrect-getUrl. Closes #5985 93ee5498e [yikaifei] remove incorrect getUri Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org> |
||
|
|
3af755115a |
[KYUUBI #5965] [AUTHZ] Supports check hoodie procedures show_commits resource privileges
# 🔍 Description ## Issue References 🔗 This pull request aims to make AuthZ supports check [hoodie procedures show_commits](https://hudi.apache.org/docs/procedures#show_commits) resource privileges ## Describe Your Solution 🔧 Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ `CALL show_commits` passes permission checks whether they have permission or not #### Behavior With This Pull Request 🎉 `CALL show_commits` will not pass without permission #### Related Unit Tests New test added, extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/HudiCatalogRangerSparkExtensionSuite.scala#ShowCommitsProcedure --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5965 from Yikf/hudi-showcommits. Closes #5965 4e609b09a [yikaifei] Supports check hoodie procedures show_commits resource privileges Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: yikaifei <yikaifei@apache.org> |
||
|
|
e6b1bf76cb
|
[KYUUBI #5933] Happy New Year 2024
# 🔍 Description ## Issue References 🔗 Update the NOTICE files for coming 2024. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Review --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #5933 from pan3793/notice-2024. Closes #5933 25e85e5f5 [Cheng Pan] notice Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org> |
||
|
|
62fb9cdfc2
|
[KYUUBI #5913][Bug] After resolve PVM should mark all nodes as checked
# 🔍 Description ## Issue References 🔗 This pull request fixes #5913 ## Describe Your Solution 🔧 We met a case where, after `RuleEliminatePermanentViewMarker` applied and optimized the PVM's child plan, some nodes' tags were missing, so the PVM's source table was checked again during `OptimizeSubqueries`. We should mark all nodes as checked for both the PVM's child and the optimized child. 1. It's more stable 2. It's safe ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Manually tested in our prod, but we didn't reproduce it in a UT since the case involves overly complex SQL. #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [x] Test coverage is ok - [x] Assignees are selected. - [x] Minimum number of approvals - [x] No changes are requested **Be nice. Be informative.** Closes #5915 from AngersZhuuuu/KYUUBI-5913. 
Closes #5913 38c04bd70 [Angerszhuuuu] [KYUUBI #5913][Bug] After resolve PVM should mark all nodes as checke Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
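The "mark everything as checked" idea can be sketched as follows; the `Node` type and the string tag are hypothetical simplifications of Spark's `TreeNode` tag mechanism, not the real API.

```scala
import scala.collection.mutable

// Toy tree node with a tag set, standing in for Spark's TreeNode tags.
final class Node(val children: Seq[Node]) {
  val tags: mutable.Set[String] = mutable.Set.empty
}

val AuthChecked = "KYUUBI_AUTHZ_TAG" // hypothetical tag name

// Mark this node and every descendant as already authorized, so later
// rule invocations (e.g. while optimizing subqueries) skip them all.
def markAllNodesChecked(node: Node): Unit = {
  node.tags += AuthChecked
  node.children.foreach(markAllNodesChecked)
}
```

Applying this to both the PVM's original child and its optimized child guarantees no node in either tree is re-checked, regardless of which nodes survive optimization.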
||
|
|
789b21fdbd
|
[KYUUBI #5903] PVM should override computeStats method
# 🔍 Description ## Issue References 🔗 This pull request fixes #5903 ## Describe Your Solution 🔧 PVM inherits from LeafNode, so it also needs to override the `computeStats` method ``` java.lang.UnsupportedOperationException at org.apache.spark.sql.catalyst.plans.logical.LeafNode.computeStats(LogicalPlan.scala:169) at org.apache.spark.sql.catalyst.plans.logical.LeafNode.computeStats$(LogicalPlan.scala:169) at org.apache.kyuubi.plugin.spark.authz.ram.rule.RamPermanentViewMarker.computeStats(RamPermanentViewMarker.scala:26) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.default(SizeInBytesOnlyStatsPlanVisitor.scala:55) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.default(SizeInBytesOnlyStatsPlanVisitor.scala:27) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlanVisitor.visit(LogicalPlanVisitor.scala:47) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlanVisitor.visit$(LogicalPlanVisitor.scala:25) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.visit(SizeInBytesOnlyStatsPlanVisitor.scala:27) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats.$anonfun$stats$1(LogicalPlanStats.scala:37) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats.stats(LogicalPlanStats.scala:33) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats.stats$(LogicalPlanStats.scala:33) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.stats(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.visitUnaryNode(SizeInBytesOnlyStatsPlanVisitor.scala:39) at org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.visitFilter(SizeInBytesOnlyStatsPlanVisitor.scala:79) at 
org.apache.spark.sql.catalyst.plans.logical.statsEstimation.SizeInBytesOnlyStatsPlanVisitor$.visitFilter(SizeInBytesOnlyStatsPlanVisitor.scala:27) ``` ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [ ] Test coverage is ok - [x] Assignees are selected. - [x] Minimum number of approvals - [x] No changes are requested **Be nice. Be informative.** Closes #5904 from AngersZhuuuu/KYUUBI-5903. Closes #5903 03c7d642b [Angerszhuuuu] [KYUUBI #5903][Bug] PVM should override computeStats method Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
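The fix can be sketched with a toy model (hypothetical types, not the real Spark `Statistics`/`LeafNode` API): instead of inheriting the leaf-node default `computeStats`, which throws, the marker delegates to its wrapped child.

```scala
// Toy stand-ins (NOT the real Spark Statistics/LeafNode API).
case class Statistics(sizeInBytes: BigInt)

sealed trait Plan { def computeStats: Statistics }
case class Table(rows: Int) extends Plan {
  def computeStats: Statistics = Statistics(BigInt(rows) * 100)
}
case class PermanentViewMarker(child: Plan) extends Plan {
  // Delegate to the wrapped child instead of inheriting a default that
  // throws UnsupportedOperationException for leaf nodes.
  def computeStats: Statistics = child.computeStats
}
```

Delegation preserves whatever size estimate the wrapped plan would have reported, so stats-based optimizations see the marker as transparent.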
||
|
|
650ba5e323 |
[KYUUBI #5884] PVM should inherit MultiInstance and wrap with new exprId
# 🔍 Description ## Issue References 🔗 This pull request fixes #5884 ## Describe Your Solution 🔧 We met a case where we create a temp view from a PVM, with `spark.sql.legacy.storeAnalyzedPlanForView=true` set ``` CREATE OR REPLACE TEMPORARY VIEW tmp_view AS SELECT * FROM persist_view ``` Then we create two new views based on `tmp_view` and join them, which causes a column exprId conflict between the join's left and right sides. This is because, on the Spark side, to avoid a view causing an exprId conflict, Spark wraps the view's child in a Project with new exprIds <img width="977" alt="Screenshot 2023-12-20 20 45 59" src="https://github.com/apache/kyuubi/assets/46485123/00bab655-a10c-4e61-b3d9-51c9208dec73"> Before, PVM was a UnaryNode, so this behavior still worked on PVM's child; now it is a LeafNode, so it no longer works. Like HiveTableRelation, PVM also needs to inherit `MultiInstanceRelation` and wrap its child with new ExprIds to avoid the issue below. This change works fine in our prod. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Added UT #### Behavior Without This Pull Request ⚰️ Failed due to the view's exprId conflict. 
``` Caused by: org.apache.spark.sql.AnalysisException: cannot resolve '`a.id`' given input columns: [a.id, a.scope]; line 5 pos 3; 'Project [ArrayBuffer(a).*, 'b.scope AS new_scope#22] +- 'Join Inner, ('a.id = 'b.id) :- SubqueryAlias a : +- SubqueryAlias view2 : +- Project [id#18, scope#19] : +- Filter (scope#19 < 10) : +- SubqueryAlias view1 : +- Project [id#18, scope#19] : +- Filter (id#18 > 10) : +- SubqueryAlias spark_catalog.default.perm_view : +- PermanentViewMarker : +- View (`default`.`perm_view`, [id#18,scope#19]) : +- Project [id#20, scope#21] : +- SubqueryAlias spark_catalog.default.table1 : +- HiveTableRelation [`default`.`table1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#20, scope#21], Partition Cols: []] +- SubqueryAlias b +- SubqueryAlias view3 +- Project [id#18, scope#19] +- Filter isnotnull(scope#19) +- SubqueryAlias view1 +- Project [id#18, scope#19] +- Filter (id#18 > 10) +- SubqueryAlias spark_catalog.default.perm_view +- PermanentViewMarker +- View (`default`.`perm_view`, [id#18,scope#19]) +- Project [id#20, scope#21] +- SubqueryAlias spark_catalog.default.table1 +- HiveTableRelation [`default`.`table1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#20, scope#21], Partition Cols: []] ``` #### Behavior With This Pull Request 🎉 Can work well ``` Project [id#18, scope#19, scope#24 AS new_scope#22] +- Join Inner, (id#18 = id#23) :- Filter ((id#18 > 10) AND (scope#19 < 10)) : +- PermanentViewMarker : +- View (`default`.`perm_view`, [id#18,scope#19]) : +- Project [id#20, scope#21] : +- SubqueryAlias spark_catalog.default.table1 : +- HiveTableRelation [`default`.`table1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#20, scope#21], Partition Cols: []] +- Filter ((id#23 > 10) AND isnotnull(scope#24)) +- PermanentViewMarker +- Project [cast(id#18 as int) AS id#23, cast(scope#19 as int) AS scope#24] +- View (`default`.`perm_view`, [id#18,scope#19]) +- Project [id#20, scope#21] +- 
SubqueryAlias spark_catalog.default.table1 +- HiveTableRelation [`default`.`table1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#20, scope#21], Partition Cols: []] ``` #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [x] Test coverage is ok - [x] Assignees are selected. - [x] Minimum number of approvals - [x] No changes are requested **Be nice. Be informative.** Closes #5885 from AngersZhuuuu/KYUUBI-5884. Closes #5884 759afc140 [Cheng Pan] Update extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/RangerSparkExtensionSuite.scala e005b6745 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 826e7a4db [Angerszhuuuu] Update RangerSparkExtensionSuite.scala 6814a833c [Angerszhuuuu] [KYUUBI #5884][Bug] PVM should inherit MultiInstance and wrap with new exprId Lead-authored-by: Angerszhuuuu <angers.zhu@gmail.com> Co-authored-by: Cheng Pan <pan3793@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
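The `MultiInstanceRelation` idea above can be sketched with a simplified model (hypothetical `ExprId`/`Attribute` types, not the real Spark classes): every new instance of the same relation keeps the attribute names but allocates fresh expression IDs, so the two sides of a self-join no longer collide.

```scala
import java.util.concurrent.atomic.AtomicLong

// Toy model (NOT the real Spark MultiInstanceRelation API).
object ExprId {
  private val counter = new AtomicLong(0)
  def next(): Long = counter.getAndIncrement()
}

case class Attribute(name: String, exprId: Long)

case class ViewRelation(output: Seq[Attribute]) {
  // Like MultiInstanceRelation.newInstance: same attribute names,
  // freshly allocated expression IDs per instantiation.
  def newInstance(): ViewRelation =
    ViewRelation(output.map(a => a.copy(exprId = ExprId.next())))
}
```

Joining `v.newInstance()` with `v.newInstance()` yields two attribute sets with identical names but distinct exprIds, which is exactly what deduplication in a self-join requires.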
||
|
|
58887dcb79
|
[KYUUBI #5827][TEST] Fix wrong test code about directory lineage
# 🔍 Description ## Issue References 🔗 This pull request fixes #5827 ## Describe Your Solution 🔧 Fix wrong test code about directory lineage ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [x] Test coverage is ok - [x] Assignees are selected. - [x] Minimum number of approvals - [x] No changes are requested **Be nice. Be informative.** Closes #5829 from AngersZhuuuu/KYUUBI-5827. Closes #5827 007981c57 [Angerszhuuuu] [KYUUBI #5827][TEST]Fix wrong test code about directory lineage Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
||
|
|
e779b424df |
[KYUUBI #5816] Change spark rule class to object or case class
# 🔍 Description ## Issue References 🔗 This pull request fixes #5816 ## Describe Your Solution 🔧 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [ ] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [ ] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] New and existing unit tests pass locally with my changes - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [ ] Pull request title is okay. - [ ] No license issues. - [ ] Milestone correctly set? - [ ] Test coverage is ok - [ ] Assignees are selected. - [ ] Minimum number of approvals - [ ] No changes are requested **Be nice. Be informative.** Closes #5817 from zml1206/KYUUBI-5816. Closes #5816 437dd1f27 [zml1206] Change spark rule class to object or case class Authored-by: zml1206 <zhuml1206@gmail.com> Signed-off-by: wforget <643348094@qq.com> |
||
|
|
762ccd8295 |
[KYUUBI #5786] Disable spark script transformation
# 🔍 Description ## Issue References 🔗 This pull request fixes #5786. ## Describe Your Solution 🔧 Add spark check rule. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests org.apache.kyuubi.plugin.spark.authz.rule.AuthzUnsupportedOperationsCheckSuite.test("disable script transformation") --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [ ] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [ ] Pull request title is okay. - [ ] No license issues. - [ ] Milestone correctly set? - [ ] Test coverage is ok - [ ] Assignees are selected. - [ ] Minimum number of approvals - [ ] No changes are requested **Be nice. Be informative.** Closes #5788 from zml1206/KYUUBI-5786. Closes #5786 06c0098be [zml1206] fix e2c3fee22 [zml1206] fix 37744f4c3 [zml1206] move to spark extentions deb09fb30 [zml1206] add configuration cfea4845a [zml1206] Disable spark script transformation in Authz Authored-by: zml1206 <zhuml1206@gmail.com> Signed-off-by: wforget <643348094@qq.com> |
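A check rule of this kind can be sketched as follows, using simplified stand-in plan nodes; the real rule matches Spark's `ScriptTransformation` logical node and respects a configuration flag, details this toy omits.

```scala
// Toy stand-ins (NOT the real Spark nodes or Kyuubi rule API).
sealed trait Plan { def children: Seq[Plan] }
case class Relation(table: String) extends Plan { def children: Seq[Plan] = Nil }
case class ScriptTransformation(script: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
}

// A check rule: fail analysis if the plan contains a script transformation.
def checkScriptTransformation(plan: Plan): Unit = {
  def containsScript(p: Plan): Boolean =
    p.isInstanceOf[ScriptTransformation] || p.children.exists(containsScript)
  if (containsScript(plan)) {
    throw new UnsupportedOperationException("script transformation is not allowed")
  }
}
```

A check rule (as opposed to a rewrite rule) never changes the plan; it only inspects it and aborts analysis when a forbidden node is present.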
||
|
|
f4a739e855 |
[KYUUBI #5780][AUTHZ][FOLLOWUP] Format PermanentViewMarker tree string
# 🔍 Description ## Issue References 🔗 This pull request fixes #5780 ## Describe Your Solution 🔧 Format PermanentViewMarker tree string ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ ``` Project [new_id2#100] +- Project [new_id#102 AS new_id2#100] +- RamPermanentViewMarker View (`test_default`.`my_view`, [new_id#102]), `test_default`.`my_view`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe ``` #### Behavior With This Pull Request 🎉 ``` Project [new_id2#100] +- Project [new_id#102 AS new_id2#100] +- RamPermanentViewMarker +- View (`test_default`.`my_view`, [new_id#102]) +- Project [cast(new_id#101 as int) AS new_id#102] +- Project [id#103 AS new_id#101] +- SubqueryAlias spark_catalog.test_default.v1 +- HiveTableRelation [`test_default`.`v1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#103, name#104, grade#105, sex#106], Partition Cols: []] ``` #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [x] I have commented my code, particularly in hard-to-understand areas - [x] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [ ] Pull request title is okay. - [ ] No license issues. - [ ] Milestone correctly set? - [ ] Test coverage is ok - [ ] Assignees are selected. 
- [ ] Minimum number of approvals - [ ] No changes are requested **Be nice. Be informative.** Closes #5792 from AngersZhuuuu/KYUUBI-5780-FOLLOWUP. Closes #5780 d38b7d1fc [Angerszhuuuu] trigger 3073f6efd [Angerszhuuuu] Update PermanentViewMarker.scala a3f025bad [Angerszhuuuu] Update PermanentViewMarker.scala 432f1b5e1 [Angerszhuuuu] Update PermanentViewMarker.scala 6175e905c [Angerszhuuuu] [KYUUBI-5780][FOLLOWUP] Format PermanentViewMarker tree string Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org> |
||
|
|
44d194dc40
|
[KYUUBI #5793][AUTHZ][BUG] PVM with nested scalar-subquery should not check src table privilege
# 🔍 Description ## Issue References 🔗 This pull request fixes #5793 ## Describe Your Solution 🔧 For SQL with nested scalar-subqueries, a scalar-subquery inside another scalar-subquery was not wrapped by `PVM`; this PR fixes that. Note: this bug was not introduced by #5780 ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ ``` CREATE VIEW $db1.$view1 AS SELECT id, name, max(scope) as max_scope, sum(age) sum_age FROM $db1.$table2 WHERE scope in ( SELECT max(scope) max_scope FROM $db1.$table1 WHERE id IN (SELECT id FROM $db1.$table3) ) GROUP BY id, name ``` When we query `$db1.$view1`, even though we have `view1`'s privilege, it throws ``` Permission denied: user [user_perm_view_only] does not have [select] privilege on [default/table3/id] org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [user_perm_view_only] does not have [select] privilege on [default/table3/id] at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:167) ``` #### Behavior With This Pull Request 🎉 Won't request `table3`'s privilege #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [x] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [x] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative 
Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [x] Pull request title is okay. - [x] No license issues. - [x] Milestone correctly set? - [x] Test coverage is ok - [x] Assignees are selected. - [x] Minimum number of approvals - [x] No changes are requested **Be nice. Be informative.** Closes #5796 from AngersZhuuuu/KYUUBI-5793. Closes #5793 0f5ebc14a [Angerszhuuuu] Update RuleEliminatePermanentViewMarker.scala f364d892b [Angerszhuuuu] [KYUUBI #5793][BUG] PVM with nested scala-subquery should not src table privilege" Authored-by: Angerszhuuuu <angers.zhu@gmail.com> Signed-off-by: Kent Yao <yao@apache.org> |
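The recursion this fix needs can be sketched with a toy model (hypothetical types, not the real Kyuubi rule): marker wrapping must descend into subqueries of subqueries, otherwise the innermost one escapes the marker and its source tables are checked directly.

```scala
// Toy model (hypothetical types): a marker-wrapping pass that recurses
// into nested subqueries instead of stopping at the first level.
sealed trait Plan
case class Scan(table: String) extends Plan
case class Subquery(child: Plan) extends Plan
case class Marker(child: Plan) extends Plan

def wrapAll(plan: Plan): Plan = plan match {
  // Wrap this subquery AND keep descending, so an inner subquery
  // is wrapped too rather than escaping the marker.
  case Subquery(child) => Marker(Subquery(wrapAll(child)))
  case other           => other
}
```

A non-recursive version would wrap only the outer subquery, leaving the inner one (and its source tables, like `table3` above) exposed to direct privilege checks.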
**7fcbce3e9d** Revert "[KYUUBI #5793][BUG] PVM with nested scala-subquery should not src table privilege"

This reverts commit
**30c9c1657f** [KYUUBI #5793][BUG] PVM with nested scala-subquery should not src table privilege

# 🔍 Description

## Issue References 🔗

This pull request fixes #5793

## Describe Your Solution 🔧

For SQL with nested scalar subqueries, a scalar subquery nested inside another scalar subquery was not wrapped by `PVM`; this PR fixes that.

Note: this bug was not introduced by #5780

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

```
CREATE VIEW $db1.$view1 AS
SELECT id, name, max(scope) as max_scope, sum(age) sum_age
FROM $db1.$table2
WHERE scope in (
    SELECT max(scope) max_scope FROM $db1.$table1
    WHERE id IN (SELECT id FROM $db1.$table3))
GROUP BY id, name
```

When we query `$db1.$view1` holding only `view1`'s privilege, it throws

```
Permission denied: user [user_perm_view_only] does not have [select] privilege on [default/table3/id]
org.apache.kyuubi.plugin.spark.authz.AccessControlException: Permission denied: user [user_perm_view_only] does not have [select] privilege on [default/table3/id]
    at org.apache.kyuubi.plugin.spark.authz.ranger.SparkRangerAdminPlugin$.verify(SparkRangerAdminPlugin.scala:167)
```

#### Behavior With This Pull Request 🎉

Won't request `table3`'s privilege.

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5794 from AngersZhuuuu/KYUUBI-5793.

Closes #5793

f364d892b [Angerszhuuuu] [KYUUBI #5793][BUG] PVM with nested scala-subquery should not src table privilege"

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
**5761e83e7d** [KYUUBI#5760] Support alter path-based table for Delta Lake in Authz (#5760)

# 🔍 Description

## Issue References 🔗

This pull request fixes #5757

## Describe Your Solution 🔧

Add uriDescs.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**
**39de210161** Revert "[KYUUBI #5757][AUTHZ] Support alter path-based table for Delta Lake in Authz"

This reverts commit
**31299347e3** [KYUUBI #5757][AUTHZ] Support alter path-based table for Delta Lake in Authz

# 🔍 Description
## Issue References 🔗
This pull request fixes #5757
## Describe Your Solution 🔧
Add uriDescs.
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested
**Be nice. Be informative.**
Closes #5760 from zml1206/KYUUBI-5757.
Closes #5757
d5d0dc3cc [zml1206] Support alter path-based table for Delta Lake in Authz
Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit
**c1685c6cf2** [KYUUBI #5780][AUTHZ] Treating PermanentViewMarker as LeafNode make code simple and got correct privilege object

# 🔍 Description

## Issue References 🔗

This pull request fixes #5780

## Describe Your Solution 🔧

Currently we convert a persistent view to a PermanentViewMarker, but the optimizer then changes its child, making it hard to do column pruning and to get the right column privilege object for the persistent view. In this PR we make PVM a LeafNode, so we can treat it directly as a `HiveRelation`; since we no longer change its internal plan, the code gets simpler. But we need to re-optimize the child plan after the privilege check.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

For a table and view such as

```
CREATE TABLE IF NOT EXISTS $db1.$table1(id int, scope int);

CREATE TABLE IF NOT EXISTS $db1.$table2(
    id int,
    name string,
    age int,
    scope int);

CREATE VIEW $db1.$view1 AS
WITH temp AS (
    SELECT max(scope) max_scope FROM $db1.$table1)
SELECT id, name, max(scope) as max_scope, sum(age) sum_age
FROM $db1.$table2
WHERE scope in (SELECT max_scope FROM temp)
GROUP BY id, name
```

when we execute a query on `$db1.$view1`

```
SELECT id as new_id, name, max_scope FROM $db1.$view1
```

it first executes the subquery in the query and then gets an incorrect column privilege.

#### Behavior With This Pull Request 🎉

After this change, since PVM is a LeafNode, we won't execute the subquery under PVM, and we directly get the correct column privilege.

#### Related Unit Tests

Existing UT

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5781 from AngersZhuuuu/KYUUBI-5780.

Closes #5780

3c18bb76c [Angerszhuuuu] Merge branch 'master' into KYUUBI-5780
64f7947c1 [Angerszhuuuu] update
b4f6fc02a [Angerszhuuuu] Merge branch 'master' into KYUUBI-5780
fbc989a7e [Angerszhuuuu] Update Authorization.scala
2113cf51b [Angerszhuuuu] Update RuleApplyPermanentViewMarker.scala
5dbe232fc [Angerszhuuuu] Update WithInternalChildren.scala
04a40c316 [Angerszhuuuu] update
57bf5ba33 [Angerszhuuuu] update
738d5062d [Angerszhuuuu] Update RuleApplyPermanentViewMarker.scala
bade42791 [Angerszhuuuu] [KYUUBI #5780][AUTHZ] Kyuubi tread PVM ass LeafNode to make logic more simple

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
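Why treating the marker as a leaf yields the correct privilege object can be shown on a toy tree. This is a hedged Java sketch (the plugin itself is Scala, and these class names are illustrative, not the real `PermanentViewMarker`): because privilege collection never descends past a leaf marker, a query over a view requests the view's privilege only, never the privileges of the tables hidden inside the view's plan.

```java
import java.util.List;

// Toy plan nodes (illustrative only, not the real Kyuubi classes).
public class PvmLeafSketch {
    interface Node {}
    record Table(String name) implements Node {}
    record Project(Node child) implements Node {}
    // Marker for a permanent view; the view's inner plan is kept but
    // treated as hidden, because the marker behaves as a leaf.
    record ViewMarker(String viewName, Node inner) implements Node {}

    // Collect the objects a query needs privileges on. Traversal stops at
    // ViewMarker, so only the view itself is requested, never the view's
    // source tables or any subquery underneath it.
    static List<String> collect(Node node) {
        if (node instanceof Table t) return List.of(t.name());
        if (node instanceof Project p) return collect(p.child());
        if (node instanceof ViewMarker v) return List.of(v.viewName());
        return List.of();
    }
}
```

The trade-off the PR mentions follows directly from this: since the marker hides its child from every rule that walks the tree, the child plan must be re-optimized after the privilege check.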
**b920836e03** Revert "[KYUUBI #5757][AUTHZ] Support alter path-based table for Delta Lake in Authz"

This reverts commit
**e08750a4cb** [KYUUBI #5757][AUTHZ] Support alter path-based table for Delta Lake in Authz

# 🔍 Description

## Issue References 🔗

This pull request fixes #5757

## Describe Your Solution 🔧

Add uriDescs.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5760 from zml1206/KYUUBI-5757.

Closes #5757

d5d0dc3cc [zml1206] Support alter path-based table for Delta Lake in Authz

Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
**7f02809e54** [KYUUBI #5768][AUTHZ] Authz internal place holder should skip privilege check

# 🔍 Description

## Issue References 🔗

This pull request fixes #5768

## Describe Your Solution 🔧

Currently every UT has a `ShowNamespace` command wrapped by `ObjectFilterPlaceHolder`:

<img width="1196" alt="Screenshot 2023-11-24 3:29 PM" src="https://github.com/apache/kyuubi/assets/46485123/ab7a93ec-22aa-425f-bbbc-894d3d8f19c0">

Commands wrapped in `ObjectFilterPlaceHolder` still go through `buildQuery()`, which is noisy when debugging and unnecessary; we should simply skip them, since their privileges are checked at execution time.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5769 from AngersZhuuuu/KYUUBI-5768.

Closes #5768

2018e784f [Angerszhuuuu] Update RuleAuthorization.scala
a51172c14 [Angerszhuuuu] Update PrivilegesBuilder.scala
4a0cdaa6d [Angerszhuuuu] [KYUUBI #5768][AUTHZ] Authz internal place holder should skip privilege check

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
**c8f4e9c6c5** [KYUUBI #5255] Add an optional comment field to the authz specs for better recognition

# 🔍 Description

## Issue References 🔗

This pull request fixes #5255

## Describe Your Solution 🔧

Add an optional comment field to the spec descriptors, so that we can keep context when a command comes from a third party.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5706 from davidyuan1223/add_option_comment_for_authz.

Closes #5255

46234e10d [davidyuan] bugfix
b9166bd33 [davidyuan] bugfix
2d4b920db [davidyuan] regenerate json file
a125ee4b8 [david yuan] Merge branch 'master' into add_option_comment_for_authz
80eb9ff04 [davidyuan] json file bug fix
bc6ec388d [davidyuan] style fix
b2527a558 [davidyuan] Merge remote-tracking branch 'origin/add_option_comment_for_authz' into add_option_comment_for_authz
682727d77 [davidyuan] bugfix
d79a38f12 [david yuan] Merge branch 'master' into add_option_comment_for_authz
6fed8b0c7 [davidyuan] issue #5255
e48c5a087 [davidyuan] issue #5255

Lead-authored-by: davidyuan <yuanfuyuan@mafengwo.com>
Co-authored-by: david yuan <yuanfuyuan@mafengwo.com>
Signed-off-by: Kent Yao <yao@apache.org>
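The key property of an optional, defaulted field is backward compatibility: spec files generated before the field existed keep loading unchanged. A minimal Java sketch of that shape, where `CommandSpec`, its field names, and `describe` are all assumptions for illustration rather than the exact Kyuubi spec classes:

```java
import java.util.Optional;

// Illustrative spec entry: the comment defaults to empty, so entries
// written before the field existed construct and compare unchanged,
// while new entries can carry extra context for recognition.
public class CommandSpecSketch {
    record CommandSpec(String classname, String opType, Optional<String> comment) {
        // Convenience constructor matching the pre-comment shape.
        CommandSpec(String classname, String opType) {
            this(classname, opType, Optional.empty());
        }
        // Render the entry, appending the comment only when present.
        String describe() {
            return classname + " (" + opType + ")"
                + comment.map(c -> " -- " + c).orElse("");
        }
    }
}
```

The same idea applies regardless of the serialization format: an absent field deserializes to the default, so regenerating the JSON spec files stays a no-op for entries without comments.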
**fdcb4568c4** [KYUUBI #5735][AUTHZ] Support vacuum path-based table for Delta Lake

# 🔍 Description

## Issue References 🔗

This pull request fixes #5735.

## Describe Your Solution 🔧

Add uriDescs to VacuumTableCommand.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

org.apache.kyuubi.plugin.spark.authz.ranger.DeltaCatalogRangerSparkExtensionSuite.test("vacuum path-based table")

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes #5751 from zml1206/KYUUBI-5735.

Closes #5735

463c01b47 [zml1206] Support vacuum path-based table for Delta Lake in Authz

Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
**84a9686103** [KYUUBI #5743][AUTHZ] Improve AccessControlException verification of RangerSparkExtensionSuite

# 🔍 Description

## Issue References 🔗

This pull request fixes #5743.

## Describe Your Solution 🔧

Add and use the new function AssertionUtils.interceptEndswith.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

Existing test cases.

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5744 from zml1206/KYUUBI-5743.

Closes #5743

fe58cc277 [zml1206] fix
a3560b0d8 [zml1206] Improve AccessControlException verification of RangerSparkExtensionSuite

Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
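The behavior behind an `interceptEndswith`-style helper can be sketched in a few lines. This is a hedged re-implementation of the idea, not Kyuubi's actual `AssertionUtils` code: run a block, require that it throws the expected exception type, and compare only the message suffix, which is useful when the message prefix (such as the denied user name) varies from test to test.

```java
import java.util.function.Supplier;

// Illustrative helper: assert that `body` throws `expected` with a
// message ending in `suffix`, and return the caught exception.
public class InterceptSketch {
    static <T extends Throwable> T interceptEndswith(
            Class<T> expected, Supplier<?> body, String suffix) {
        try {
            body.get();
        } catch (Throwable t) {
            if (!expected.isInstance(t)) {
                throw new AssertionError("expected " + expected.getName()
                    + ", got " + t.getClass().getName());
            }
            String msg = t.getMessage() == null ? "" : t.getMessage();
            if (!msg.endsWith(suffix)) {
                throw new AssertionError(
                    "'" + msg + "' does not end with '" + suffix + "'");
            }
            return expected.cast(t);
        }
        // Reached only when body.get() completed without throwing.
        throw new AssertionError("expected an exception, but none was thrown");
    }
}
```

Checking only the suffix keeps the assertions stable across suites that run the same statement as different users, since the `Permission denied: user [...]` prefix differs while the denied resource stays the same.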
**096be3917e** [KYUUBI #5743][AUTHZ] Improve AccessControlException verification of DeltaCatalogRangerSparkExtensionSuite

# 🔍 Description

## Issue References 🔗

This pull request fixes #5743.

## Describe Your Solution 🔧

Add and use the new function AssertionUtils.interceptEndswith.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

Existing test cases.

---

# Checklists

## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [x] Pull request title is okay.
- [x] No license issues.
- [x] Milestone correctly set?
- [x] Test coverage is ok
- [x] Assignees are selected.
- [x] Minimum number of approvals
- [x] No changes are requested

**Be nice. Be informative.**

Closes #5747 from zml1206/KYUUBI-5743-delta.

Closes #5743

00d13b65f [zml1206] Improve AccessControlException verification of DeltaCatalogRangerSparkExtensionSuite

Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>