kyuubi

Author	SHA1	Message	Date
Cheng Pan	5f4b1f0de5	[KYUUBI #7139 ] Fix Spark extension rules to support RebalancePartitions ### Why are the changes needed? As title. ### How was this patch tested? UT are modified. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #7139 from pan3793/rebalance. Closes #7139 edb070afd [Cheng Pan] fix 4d3984a92 [Cheng Pan] Fix Spark extension rules to support RebalancePartitions Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-07-18 11:46:36 +08:00
wuziyi	2080c2186c	[KYUUBI #6990 ] Add rebalance before InsertIntoHiveDirCommand and InsertIntoDataSourceDirCommand to align with behaviors of hive ### Why are the changes needed? When users switch from Hive to Spark, for sql like INSERT OVERWRITE DIRECTORY AS SELECT, it would be great if small files could be automatically merged through simple configuration, just like in Hive. ### How was this patch tested? UnitTest ### Was this patch authored or co-authored using generative AI tooling? No Closes #6991 from Z1Wu/feat/add_insert_dir_rebalance_support. Closes #6990 2820bb2d2 [wuziyi] [fix] nit a69c04191 [wuziyi] [fix] nit 951a7738f [wuziyi] [fix] nit f75dfcb3a [wuziyi] [Feat] add rebalance before InsertIntoHiveDirCommand and InsertIntoDataSourceDirCommand to align with behaviors of hive Authored-by: wuziyi <wuziyi02@corp.netease.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-03-25 00:52:55 +08:00
Cheng Pan	3f4d7ca734	[KYUUBI #6983 ] Remove support for spark.sql.watchdog.forcedMaxOutputRows ### Why are the changes needed? The feature `spark.sql.watchdog.forcedMaxOutputRows` is a little bit hacky, it's actually a manually implemented "limit pushdown", we already have a simple and more reliable way to achieve that by using `kyuubi.operation.result.max.rows`. ### How was this patch tested? Pass GHA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6983 from pan3793/rm-forcedMaxOutputRows. Closes #6983 5e0707955 [Cheng Pan] Remove support for spark.sql.watchdog.forcedMaxOutputRows Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-03-17 16:02:27 +08:00
Cheng Pan	0b1a34d149	[KYUUBI #6975 ] Clean up code for Spark 3.5 extension ### Why are the changes needed? Simple refactoring to clean up the code for the Spark 3.5 extension, e.g., remove unnecessary `Base` `Helper` abstraction layers, remove code for legacy Spark versions. Note: I don't touch `ForcedMaxOutputRows*` because I'm going to remove it in the next PR. Preparation for Spark 4.0 support. ### How was this patch tested? Pass GHA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6975 from pan3793/spark-ext-35-cleanup. Closes #6975 b5a94a680 [Cheng Pan] nit c729e268c [Cheng Pan] fix 1087ac709 [Cheng Pan] Clean up code for Spark 3.5 extension Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-03-12 11:44:50 +08:00
zhaohehuhu	117e56c7cb	[KYUUBI #6862 ] Spark 3.3: MaxScanStrategy supports DSv2 ### Why are the changes needed? Backport https://github.com/apache/kyuubi/pull/5852 to Spark 3.3, to enhance MaxScanStrategy to include support for the datasourcev2 in Spark 3.3 ### How was this patch tested? Add some UTs ### Was this patch authored or co-authored using generative AI tooling? No Closes #6862 from zhaohehuhu/dev-1225. Closes #6862 c745eda14 [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.3 Authored-by: zhaohehuhu <luoyedeyi459@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-12-25 17:21:23 +08:00
Bowen Liang	d3520ddbce	[KYUUBI #6769 ] [RELEASE] Bump 1.11.0-SNAPSHOT # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 Preparing v1.11.0-SNAPSHOT after branch-1.10 cut ```shell build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT" (cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT") ``` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6769 from bowenliang123/bump-1.11. Closes #6769 6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name 465276204 [Bowen Liang] update package.json 81f2865e5 [Bowen Liang] bump Authored-by: Bowen Liang <liangbowen@gf.com.cn> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-10-23 17:10:56 +08:00
xorsum	d414535cb6	[KYUUBI #6582 ] [KYUUBI-6581] Zorder clause syntax does not support special characters # 🔍 Description ## Issue References 🔗 This pull request fixes #6581 ## Describe Your Solution 🔧 Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. I modified `KyuubiSparkSQLAstBuilder#visitMultipartIdentifier` and implemented `KyuubiSparkSQLAstBuilder#visitQuotedIdentifier` to process the quoted identifiers. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests ``` extensions/spark/kyuubi-extension-spark-3-3/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala test("optimize sort by backquoted column name") ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6582 from XorSum/features/zorder-backquote. Closes #6582 16ffa1238 [xorsum] zorder by support quote Authored-by: xorsum <xorsum@outlook.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-08-06 13:39:25 +08:00
huangxiaoping	0f6d7643ae	[KYUUBI #6554 ] Delete redundant code related to zorder # 🔍 Description ## Issue References 🔗 This pull request fixes #6554 ## Describe Your Solution 🔧 - Delete `/kyuubi/extensions/spark/kyuubi-extension-spark-3-x/src/main/scala/org/apache/kyuubi/sql/zorder/InsertZorderBeforeWritingBase.scala` file - Rename `InsertZorderBeforeWriting33.scala` to `InsertZorderBeforeWriting.scala` - Rename `InsertZorderHelper33, InsertZorderBeforeWritingDatasource33, InsertZorderBeforeWritingHive33, ZorderSuiteSpark33` to `InsertZorderHelper, InsertZorderBeforeWritingDatasource, InsertZorderBeforeWritingHive, ZorderSuiteSpark` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6555 from huangxiaopingRD/6554. Closes #6554 26de4fa09 [huangxiaoping] [KYUUBI #6554] Delete redundant code related to zorder Authored-by: huangxiaoping <1754789345@qq.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-07-23 12:14:55 +08:00
huangxiaoping	ec232c18b5	[KYUUBI #6551 ] Allow insert zorder when global sort is false and the plan is Repartition or RepartitionByExpression. # 🔍 Description ## Issue References 🔗 This pull request fixes #6551 ## Describe Your Solution 🔧 Update `canInsertZorder` to allow insert zorder when global sort is `false` and the plan is `Repartition` or `RepartitionByExpression`. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests /kyuubi-extension-spark-common/src/test/scala/org/apache/spark/sql/ZorderSuiteBase.scala --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6552 from huangxiaopingRD/6551. Closes #6551 b597443c3 [huangxiaoping] Fix code style 618594667 [huangxiaoping] [KYUUBI #6551] Allow insert zorder when when the plan is Repartition or RepartitionByExpression Authored-by: huangxiaoping <1754789345@qq.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2024-07-23 09:36:21 +08:00
Cheng Pan	063a192c7a	[KYUUBI #6545 ] Deprecate and remove building support for Spark 3.2 # 🔍 Description This pull request aims to remove building support for Spark 3.2, while still keeping the engine support for Spark 3.2. Mailing list discussion: https://lists.apache.org/thread/l74n5zl1w7s0bmr5ovxmxq58yqy8hqzc - Remove Maven profile `spark-3.2`, and references on docs, release scripts, etc. - Keep the cross-version verification to ensure that the Spark SQL engine built on the default Spark version (3.5) still works well on Spark 3.2 runtime. - Merge `kyuubi-extension-spark-common` into `kyuubi-extension-spark-3-3` - Remove `log4j.properties` as Spark moves to Log4j2 since 3.3 (SPARK-37814) ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [x] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6545 from pan3793/deprecate-spark-3.2. Closes #6545 54c172528 [Cheng Pan] fix f4602e805 [Cheng Pan] Deprecate and remove building support for Spark 3.2 2e083f89f [Cheng Pan] fix style 458a92c53 [Cheng Pan] nit 929e1df36 [Cheng Pan] Deprecate and remove building support for Spark 3.2 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-07-22 11:59:34 +08:00
Cheng Pan	6bdf2bdaf8	[KYUUBI #6392 ] Support javax.servlet and jakarta.servlet co-exist # 🔍 Description This PR makes `javax.servlet` and `jakarta.servlet` co-exist, by introducing `javax.servlet-api-4.0.1` and upgrade `jakarta.servlet-api` to 5.0.0. (6.0.0 requires JDK 11) Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118 while Kyuubi still uses `javax.servlet` in other modules, we should allow them to co-exist for a while. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6392 from pan3793/servlet. Closes #6392 27d412599 [Cheng Pan] fix 9f1e72272 [Cheng Pan] other spark modules f4545dc76 [Cheng Pan] fix 313826fa7 [Cheng Pan] exclude 7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-05-20 21:09:30 +08:00
Cheng Pan	4fcc5c72a2	[KYUUBI #6260 ] Clean up and improve comments for spark extensions # 🔍 Description This pull request - improves comments for SPARK-33832 - removes unused `spark.sql.analyzer.classification.enabled` (I didn't update the migration rules because this configuration seems never to work properly) ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Review --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6260 from pan3793/nit. Closes #6260 d762d30e9 [Cheng Pan] update comment 4ebaa04ea [Cheng Pan] nit b303f05bb [Cheng Pan] remove spark.sql.analyzer.classification.enabled b021cbc0a [Cheng Pan] Improve docs for SPARK-33832 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-04-07 18:20:14 +08:00
wforget	9114e507c4	[KYUUBI #6211 ] Check memory offHeap enabled for CustomResourceProfileExec # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 We should check `spark.memory.offHeap.enabled` when applying for `executorOffHeapMemory`. ## Types of changes 🔖 - [X] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [X] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6211 from wForget/hotfix. Closes #6211 1c7c8cd75 [wforget] Check memory offHeap enabled for CustomResourceProfileExec Authored-by: wforget <643348094@qq.com> Signed-off-by: wforget <643348094@qq.com>	2024-03-28 13:17:59 +08:00
Binjie Yang	eb278c562d	[RELEASE] Bump 1.10.0-SNAPSHOT	2024-03-13 14:24:49 +08:00
Cheng Pan	f1cf1e42de	[KYUUBI #6131 ] Simplify Maven dependency management after dropping building support for Spark 3.1 # 🔍 Description ## Issue References 🔗 SPARK-33212 (fixed in 3.2.0) moved from `hadoop-client` to the shaded Hadoop client to simplify dependency management. Previously, we added some workarounds to handle Spark 3.1 dependency issues. As we have now removed building support for Spark 3.1, we can remove those workarounds to simplify `pom.xml`. ## Describe Your Solution 🔧 As above. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6131 from pan3793/3-1-cleanup. Closes #6131 1341065a7 [Cheng Pan] nit 1d7323f6e [Cheng Pan] fix 9e2e3b747 [Cheng Pan] nit 271166b58 [Cheng Pan] test Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-06 22:31:06 +08:00
Cheng Pan	0a0af165e3	[KYUUBI #6125 ] Drop Kyuubi extension for Spark 3.1 # 🔍 Description ## Issue References 🔗 This pull request is the next step of deprecating and removing support of Spark 3.1 VOTE: https://lists.apache.org/thread/670fx1qx7rm0vpvk8k8094q2d0fthw5b VOTE RESULT: https://lists.apache.org/thread/0zdxg5zjnc1wpxmw9mgtsxp1ywqt6qvb ## Describe Your Solution 🔧 Drop module `kyuubi-extension-spark-3-1` and delete Spark 3.1 specific codes. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6125 from pan3793/drop-spark-ext-3-1. Closes #6125 212012f18 [Cheng Pan] fix style 021532ccd [Cheng Pan] doc 329f69ab9 [Cheng Pan] address comments 43fac4201 [Cheng Pan] fix a12c8062c [Cheng Pan] fix dcf51c1a1 [Cheng Pan] minor 814a187a6 [Cheng Pan] Drop Kyuubi extension for Spark 3.1 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-05 17:07:12 +08:00
zml1206	e779b424df	[KYUUBI #5816 ] Change spark rule class to object or case class # 🔍 Description ## Issue References 🔗 This pull request fixes #5816 ## Describe Your Solution 🔧 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklists ## 📝 Author Self Checklist - [ ] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [ ] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [ ] New and existing unit tests pass locally with my changes - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [ ] Pull request title is okay. - [ ] No license issues. - [ ] Milestone correctly set? - [ ] Test coverage is ok - [ ] Assignees are selected. - [ ] Minimum number of approvals - [ ] No changes are requested Be nice. Be informative. Closes #5817 from zml1206/KYUUBI-5816. Closes #5816 437dd1f27 [zml1206] Change spark rule class to object or case class Authored-by: zml1206 <zhuml1206@gmail.com> Signed-off-by: wforget <643348094@qq.com>	2023-12-06 11:00:33 +08:00
zml1206	762ccd8295	[KYUUBI #5786 ] Disable spark script transformation # 🔍 Description ## Issue References 🔗 This pull request fixes #5786. ## Describe Your Solution 🔧 Add spark check rule. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests org.apache.kyuubi.plugin.spark.authz.rule.AuthzUnsupportedOperationsCheckSuite.test("disable script transformation") --- # Checklists ## 📝 Author Self Checklist - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project - [x] I have performed a self-review - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [ ] New and existing unit tests pass locally with my changes - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) ## 📝 Committer Pre-Merge Checklist - [ ] Pull request title is okay. - [ ] No license issues. - [ ] Milestone correctly set? - [ ] Test coverage is ok - [ ] Assignees are selected. - [ ] Minimum number of approvals - [ ] No changes are requested Be nice. Be informative. Closes #5788 from zml1206/KYUUBI-5786. Closes #5786 06c0098be [zml1206] fix e2c3fee22 [zml1206] fix 37744f4c3 [zml1206] move to spark extentions deb09fb30 [zml1206] add configuration cfea4845a [zml1206] Disable spark script transformation in Authz Authored-by: zml1206 <zhuml1206@gmail.com> Signed-off-by: wforget <643348094@qq.com>	2023-12-05 11:16:30 +08:00
ITzhangqiang	e51095edaa	[KYUUBI #5365 ] Don't use Log4j2's extended throwable conversion pattern in default logging configurations ### _Why are the changes needed?_ The Apache Spark Community found a performance regression with log4j2. See https://github.com/apache/spark/pull/36747. This PR to fix the performance issue on our side. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No. Closes #5400 from ITzhangqiang/KYUUBI_5365. Closes #5365 dbb9d8b32 [ITzhangqiang] [KYUUBI #5365] Don't use Log4j2's extended throwable conversion pattern in default logging configurations Authored-by: ITzhangqiang <itzhangqiang@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-10-11 21:41:22 +08:00
Cheng Pan	6061a05f24	Bump 1.9.0-SNAPSHOT	2023-09-04 14:23:12 +08:00
liangbowen	4213e20945	[KYUUBI #5177 ] Use Scala binary version placeholder in Maven module's artifactId suffix ### _Why are the changes needed?_ - Change hardcoded Scala's version 2.12 in Maven module's `artifactId` to placeholder `scala.binary.version` which is defined in project parent pom as 2.12 - Preparation for Scala 2.13/3.x support in the future - No impact on using or building Maven modules - Some ignorable warning messages for unstable artifactId will be thrown by Maven. ``` Warning: Some problems were encountered while building the effective model for org.apache.kyuubi:kyuubi-server_2.12:jar:1.8.0-SNAPSHOT Warning: 'artifactId' contains an expression but should be a constant ``` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No. Closes #5175 from bowenliang123/artifactId-scala. Closes #5177 2eba29cfa [liangbowen] use placeholder of scala binary version for artifactId Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-08-20 16:03:23 +00:00
zhouyifan279	d513f1f1e6	[KYUUBI #5136 ][Bug] Spark App may hang forever if FinalStageResourceManager killed all executors ### _Why are the changes needed?_ In rare cases, a Spark stage hangs forever when `spark.sql.finalWriteStage.eagerlyKillExecutors.enabled` is true. The bug occurs if two conditions are met at the same time: 1. All executors are either removed because of idle timeout or killed by FinalStageResourceManager. The target executor number in YarnAllocator is then set to 0 and no more executors are launched. 2. The target executor number in ExecutorAllocationManager equals the executor number needed by the final stage. ExecutorAllocationManager then does not sync the target executor number to YarnAllocator. ### _How was this patch tested?_ - [x] Add a new test suite `FinalStageResourceManagerSuite` Closes #5141 from zhouyifan279/adjust-executors. Closes #5136 c4403eefa [zhouyifan279] assert adjustedTargetExecutors == 1 ea8f24733 [zhouyifan279] Add comment 5f3ca1d9c [zhouyifan279] [KYUUBI #5136][Bug] Spark App may hang forever if FinalStageResourceManager killed all executors 12687eee7 [zhouyifan279] [KYUUBI #5136][Bug] Spark App may hang forever if FinalStageResourceManager killed all executors 9dcbc780d [zhouyifan279] [KYUUBI #5136][Bug] Spark App may hang forever if FinalStageResourceManager killed all executors Authored-by: zhouyifan279 <zhouyifan279@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-08-16 16:09:17 +08:00
Fu Chen	11dcd30e88	[KYUUBI #1265 ] `OPTIMIZE` where clause expression support ### _Why are the changes needed?_ to close #1265 After this PR, the following case will work ```sql CREATE TABLE p (c1 INT, c2 INT, c3 INT) PARTITIONED BY (event_date DATE); OPTIMIZE p where event_date = current_date() ZORDER BY c1, c2; ``` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #2893 from cfmcgrady/where-expression-support. Closes #1265 97ac710f0 [Fu Chen] Merge remote-tracking branch 'apache/master' into where-expression-support c188f0b3d [Fu Chen] fix style e5f7409d6 [Fu Chen] move verifyPartitionPredicates to KyuubiSparkSQLAstBuilder f7234abba [Fu Chen] fix style 95d314122 [Fu Chen] fork PredicateHelper.isLikelySelective 1e596e3dd [Fu Chen] partition predicates constraint 541e373cc [Fu Chen] fix 06d9efdf0 [Fu Chen] adapt to spark-3.1/spark-3.2 suite 867263673 [Fu Chen] fix style b6801b279 [Fu Chen] add test case 79ab60554 [Fu Chen] fix suite bug cf1b16ee7 [Fu Chen] fix style dc0ebd908 [Fu Chen] add ut 286d94cc6 [Fu Chen] fix style 1736d18f6 [Fu Chen] adapt to spark-3.1/spark-3.2 04e88a5aa [Fu Chen] fix nep 59103095b [Fu Chen] simplify logical 59fba01e4 [Fu Chen] adapt to spark-3.1 e6477a9c5 [Fu Chen] remove unused 855283e20 [Fu Chen] where clause expression support Authored-by: Fu Chen <cfmcgrady@gmail.com> Signed-off-by: Fu Chen <cfmcgrady@gmail.com>	2023-07-05 10:21:49 +08:00
zhouyifan279	9ff46a3c63	[KYUUBI #4935 ] More than target num of executors may survive after FinalStageResourceManager did kill ### _Why are the changes needed?_ When FinalStageResourceManager chooses executors to be killed, it may add dead executors to the kill list. This leaves more than the target number of executors alive and wastes resources. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4936 from zhouyifan279/kill-executor. Closes #4936 2aaa84cb1 [zhouyifan279] [KYUUBI#4935][Improvement] More than target num of executors may survive after FinalStageResourceManager did kill Authored-by: zhouyifan279 <zhouyifan279@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-06-08 20:18:19 +08:00
Cheng Pan	01d80eb272	[KYUUBI #4870 ] Add kyuubi-util and kyuubi-util-scala modules ### _Why are the changes needed?_ Close #4870 ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4872 from pan3793/util. Closes #4870 0b9fe3cba [Cheng Pan] nit ecc5ee4f2 [Cheng Pan] fix 63be7a20c [Cheng Pan] test 85363c187 [Cheng Pan] style 2227247dd [Cheng Pan] fix package 11d10a081 [Cheng Pan] Add kyuubi-util and kyuubi-util-scala modules Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-05-22 22:13:56 +08:00
wforget	19d5a9a371	[KYUUBI #4641 ] Add MaxFileSizeStrategy to limit max scan file size ### _Why are the changes needed?_ Add MaxFileSizeStrategy to limit max scan file size. close #4641 ### _How was this patch tested?_ - [X] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4642 from wForget/KYUUBI-4641. Closes #4641 14a680f8e [wforget] comment d2a393d97 [wforget] comment b1ef4c52c [wforget] fix d9e94bd8e [wforget] fix style 8a9121131 [wforget] use optional value 094eb61e3 [wforget] combine 89e2cb4d0 [wforget] [KYUUBI-4641] Add MaxFileSizeStrategy to limit max scan file size Authored-by: wforget <643348094@qq.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-04-23 17:51:44 +08:00
ulysses-you	91a2ab3665	[KYUUBI #4678 ] Improve FinalStageResourceManager kill executors ### _Why are the changes needed?_ This PR changes two things: 1. Add a config to kill executors if the plan contains table caches. It is not always safe to kill executors if the cache is referenced by two write-like plans. 2. Force `adjustTargetNumExecutors` when killing executors. `YarnAllocator` might re-request the original target executors if DRA has not updated the target executors yet. Note that DRA would re-adjust executors if there are more tasks to be executed, so this is safe. It is better to adjust the target number of executors once we kill executors. ### _How was this patch tested?_ These issues were found during my POC. Closes #4678 from ulysses-you/skip-cache. Closes #4678 b12620954 [ulysses-you] Improve kill executors Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-04-10 11:41:37 +08:00
ulysses-you	061545b2bd	[KYUUBI #4664 ] Fix empty relation when kill executors ### _Why are the changes needed?_ This PR fixes a corner case when repartitioning on a local relation, e.g., ``` Repartition | LocalRelation ``` It would throw an exception since no shuffle actually happens: ``` java.util.NoSuchElementException: key not found: 3 at scala.collection.MapLike.default(MapLike.scala:235) at scala.collection.MapLike.default$(MapLike.scala:234) at scala.collection.AbstractMap.default(Map.scala:63) at scala.collection.MapLike.apply(MapLike.scala:144) at scala.collection.MapLike.apply$(MapLike.scala:143) at scala.collection.AbstractMap.apply(Map.scala:63) at org.apache.spark.sql.FinalStageResourceManager.findExecutorToKill(FinalStageResourceManager.scala:122) at org.apache.spark.sql.FinalStageResourceManager.killExecutors(FinalStageResourceManager.scala:175) ``` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4664 from ulysses-you/kill-executors-followup. Closes #4664 3811eaee9 [ulysses-you] Fix empty relation Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-04-04 17:06:57 +08:00
ulysses-you	97aedf5048	[KYUUBI #4636 ] Improve eagerly kill redundant executors ### _Why are the changes needed?_ This PR improves the behavior of killing redundant executors. 1. Support killing executors even if AQE cannot optimize the shuffle read, e.g., when people call `.repartition(2)`. 2. Fix an issue to avoid always killing executors that hold shuffle data. ### _How was this patch tested?_ Tested manually. Closes #4636 from ulysses-you/kill-executors. Closes #4636 19ac808d3 [ulysses-you] Improve eagerly kill redundant executors Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-03-30 15:39:44 +08:00
ulysses-you	9ca00c8aa7	[KYUUBI #4615 ] Support stage level schedule for final write stage ### _Why are the changes needed?_ Add a new rule `InjectCustomResourceProfile` to support a custom resource profile for the final write stage. It now supports these executor configs: ``` executor core executor memory executor memory overhead executor off heap memory ``` ### _How was this patch tested?_ Added tests and tested manually <img width="778" alt="image" src="https://user-images.githubusercontent.com/12025282/226606147-82a29b8c-1a31-4842-97a7-fe702d80e190.png"> Closes #4615 from ulysses-you/resource-profile. Closes #4615 852b207cd [ulysses-you] Support stage level schedule for final write stag Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-03-30 13:04:51 +08:00
ulysses-you	b8f452692a	[KYUUBI #4592 ] Support eagerly kill redundant executors ### _Why are the changes needed?_ This PR adds a new rule `FinalStageResourceManager` to eagerly kill redundant executors. We first get the final stage partition count, which is the number of cores actually required, then kill the redundant executors. The priority for killing executors is: 1. kill executors that are younger than others first (the older the executor, the better the JIT works) 2. kill executors that produce less shuffle data first. The reason for adding this feature is that, if the previous stage uses lots of executors but the final stage needs fewer, then the tasks of the final stage would be scheduled randomly across all existing executors, which may waste resources, e.g., each executor only runs 1 or 2 tasks but holds 4 or 5 cores. ### _How was this patch tested?_ test manually - test for the kill executor <img width="755" alt="image" src="https://user-images.githubusercontent.com/12025282/227203809-9fe0731c-f97f-40d2-ac7f-b892a2a35289.png"> Closes #4592 from ulysses-you/eagerly-kill-executors. Closes #4592 f35208bfd [ulysses-you] nit ec627ee4f [ulysses-you] nit 28d4230f8 [ulysses-you] address comments f2492cec6 [ulysses-you] nit f44e48451 [ulysses-you] Support eagerly kill redundant executors Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-03-24 18:24:53 +08:00
Cheng Pan	4e226ac3cc	Bump 1.8.0-SNAPSHOT	2023-02-10 15:25:49 +08:00
liangbowen	faecd8f23d	[KYUUBI #4127 ] Align ScalaTest Plus plugin versions and bump ScalaTest from 3.2.9 to 3.2.15 ### _Why are the changes needed?_ - bump `ScalaTest` version from `3.2.9` to `3.2.15`, updated to use same scala version `2.12.17` in Kyuubi. (Release notes: https://github.com/scalatest/scalatest/releases/tag/release-3.2.15) - bump `scalatest-maven-plugin` from `2.0.2` to `2.2.0` (https://github.com/scalatest/scalatest-maven-plugin/releases/tag/release-2.2.0) - align `scalatestplus` versions to the version above, removing the misleading `scalacheck.version` property, (ScalaTest + ScalaCheck Version: https://www.scalatest.org/plus/scalacheck/versions) - bump scalatestplus plugins to `3.2.15.0` with bumping dependency - scalatestplus-scalacheck (https://github.com/scalatest/scalatestplus-scalacheck/releases/tag/release-3.2.15.0-for-scalacheck-1.17) - scalatestplus-mockito (https://github.com/scalatest/scalatestplus-mockito/releases/tag/release-3.2.15.0-for-mockito-4.6) - mockito from `3.4` to `4.6` (https://github.com/mockito/mockito/releases/tag/v4.6.0) - scalacheck from `1.15` to `1.17` (https://github.com/typelevel/scalacheck/releases/tag/v1.17.0) ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4127 from bowenliang123/scalatest-3.2.15. Closes #4127 ac661a55 [liangbowen] bump scalatest and plugin versions Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-01-11 16:08:12 +08:00
ulysses-you	0495350082	[KYUUBI #3988 ] Final stage config isolation support write only ### _Why are the changes needed?_ Detect and inject a tag if plan is for writing, then skip doing final stage isolation at query preparation phase. To make final stage config more flexible with complex Spark application. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3988 from ulysses-you/final-stage. Closes #3988 d0f2b622 [ulysses-you] fix e5351fd5 [ulysses-you] nit 39082b20 [ulysses-you] Final stage config isolation support write only Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-12-29 15:35:42 +08:00
ulysses-you	fa9e6be663	[KYUUBI #3962 ] Add two conditions to decide if add shuffle before writing ### _Why are the changes needed?_ Add two conditions to decide whether we should add a shuffle. 1. Make sure AQE is enabled; otherwise adding a shuffle is meaningless. 2. Try to reduce the performance regression caused by adding a shuffle. For condition 2: we do not add a shuffle if the original plan does not have one. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3962 from ulysses-you/no-shuffle. Closes #3962 a084cccc [ulysses-you] address comment 9d0aab1b [ulysses-you] address comment 09fc9b21 [ulysses-you] fix ut 06f249a2 [ulysses-you] Reduce the performance regression Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-12-12 20:22:10 +08:00
firefox	8ef6494e4a	[KYUUBI #3893 ] [BUG] Fix spark extension: UnspecifiedDistribution does not have default partitioning. ### _Why are the changes needed?_ 1. to fix #3893 ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3894 from FireFoxAhri/master. Closes #3893 da15a000 [firefox] [KYUUBI #3893] [BUG] Fix spark extension: UnspecifiedDistribution does not have default partitioning. Authored-by: firefox <309637962@qq.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-12-05 18:09:56 +08:00
liangbowen	2ac10f91d5	[KYUUBI #3842 ] [Improvement] Support maven pom.xml code style check with spotless plugin ### _Why are the changes needed?_ Introduce code style check support for Maven's pom.xml with sortPom in spotless maven plugin. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3843 from bowenliang123/spotless-pom. Closes #3842 3c654597 [liangbowen] apply to pom.xml fd1536f7 [liangbowen] set expandEmptyElements to true e498423f [liangbowen] apply spotless:apply to all pom.xml e46bcfec [liangbowen] add pom style check support in spotless Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-11-23 22:08:00 +08:00
ulysses-you	2acee9ea97	[KYUUBI #3601 ] [SPARK] Support infer columns for rebalance and sort ### _Why are the changes needed?_ Improve the rebalance-before-writing rule. The rule adds a rebalance at the top of the query for data writing commands; however, the default partitioning of rebalance is RoundRobinPartitioning, which breaks the original partitioning of the data. This may make the output data size bigger than before. This PR supports inferring the columns from join and aggregate for rebalance and sort to improve the compression ratio. Note that this improvement only works for static partition writing. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3601 from ulysses-you/smart-order. Closes #3601 c190dc1a [ulysses-you] docs 995969b5 [ulysses-you] view ea23c417 [ulysses-you] Support infer columns for rebalance and sort Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-10-17 18:13:50 +08:00
SteNicholas	77b036f3a8	[KYUUBI #3264 ] [RELEASE] Bump 1.7.0-SNAPSHOT ### _Why are the changes needed?_ Preparing v1.7.0-SNAPSHOT with branch-1.6 cut ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3264 from SteNicholas/prepare-1.7.0-snapshot. Closes #3264 374d56bf [SteNicholas] preparing v1.7.0-SNAPSHOT with branch-1.6 cut Authored-by: SteNicholas <programgeek@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-08-18 11:23:54 +08:00
Fu Chen	0acf9717d0	[KYUUBI #2247 ] Change log4j2 properties to xml ### _Why are the changes needed?_ - change log4j2-test.properties to log4j2-test.xml - add the unit test log4j2.xml for Spark-related submodules, and remove the log4j.properties ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #2850 from cfmcgrady/kyuubi-2247. Closes #2247 a33d4d80 [Fu Chen] style f99dadac [Fu Chen] fix style 49c99dea [Fu Chen] add log4j2.xml for spark relative submodule a8a38561 [Fu Chen] change log4j2-test.properties to log4j2-test.xml Authored-by: Fu Chen <cfmcgrady@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-06-10 18:57:25 +08:00
ulysses-you	9d706e55ed	[KYUUBI #2830 ] Imporve Z-Order with Spark3.3 ### _Why are the changes needed?_ We can inject rebalance before Z-Order to avoid data skew. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #2830 from ulysses-you/improve-zorder. Closes #2830 789aba45 [ulysses-you] cleanup e169a202 [ulysses-you] resolver 9134496c [ulysses-you] style 048fe294 [ulysses-you] docs e06f1ef8 [ulysses-you] imporve zorder Authored-by: ulysses-you <ulyssesyou18@gmail.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-06-09 11:16:24 +08:00
Fu Chen	85cbea400c	[KYUUBI #2706 ] Spark extensions support Spark-3.3 ### _Why are the changes needed?_ to close #2706 Spark extensions support Spark-3.3, part of #2620 ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #2707 from cfmcgrady/kyuubi-2706. Closes #2706 0b07b6e4 [Fu Chen] spark extensions support spark 3.3 Authored-by: Fu Chen <cfmcgrady@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-05-23 11:13:18 +08:00

42 Commits