kyuubi

Author	SHA1	Message	Date
tian bao	47063d9264	[KYUUBI #7129 ] Support PARQUET hive table pushdown filter ### Why are the changes needed? Previously, the `HiveScan` class was used to read data. If the table is determined to be of PARQUET type, the `ParquetScan` from Spark DataSource V2 can be used instead. `ParquetScan` supports filter pushdown, but `HiveScan` does not yet support it. The conversion can be controlled by setting `spark.sql.kyuubi.hive.connector.read.convertMetastoreParquet`. When enabled, the data source PARQUET reader is used to process PARQUET tables created by using the HiveQL syntax, instead of Hive SerDe. close https://github.com/apache/kyuubi/issues/7129 ### How was this patch tested? added unit test ### Was this patch authored or co-authored using generative AI tooling? No Closes #7130 from flaming-archer/master_parquet_filterdown. Closes #7129 d7059dca4 [tian bao] Support PARQUET hive table pushdown filter Authored-by: tian bao <2011xuesong@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-07-17 14:42:46 +08:00
tian bao	60371b5dd5	[KYUUBI #7122 ] Support ORC hive table pushdown filter ### Why are the changes needed? Previously, the `HiveScan` class was used to read data. If the table is determined to be of ORC type, the `ORCScan` from Spark DataSource V2 can be used instead. `ORCScan` supports filter pushdown, but `HiveScan` does not yet support it. In our testing, we are able to achieve approximately 2x performance improvement. The conversion can be controlled by setting `spark.sql.kyuubi.hive.connector.read.convertMetastoreOrc`. When enabled, the data source ORC reader is used to process ORC tables created by using the HiveQL syntax, instead of Hive SerDe. close https://github.com/apache/kyuubi/issues/7122 ### How was this patch tested? added unit test ### Was this patch authored or co-authored using generative AI tooling? No Closes #7123 from flaming-archer/master_scanbuilder_new. Closes #7122 c3f412f90 [tian bao] add case _ 2be48909f [tian bao] Merge branch 'master_scanbuilder_new' of github.com:flaming-archer/kyuubi into master_scanbuilder_new c825d0f8c [tian bao] review change 8a26d6a8a [tian bao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/KyuubiHiveConnectorConf.scala 68d41969f [tian bao] review change bed007fea [tian bao] review change b89e6e67a [tian bao] Optimize UT 5a8941b2d [tian bao] fix failed ut dc1ba47e3 [tian bao] orc pushdown version 0 Authored-by: tian bao <2011xuesong@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-07-09 13:38:51 +08:00
Cheng Pan	e366b0950f	[KYUUBI #6920 ][FOLLOWUP] Spark SQL engine supports Spark 4.0 ### Why are the changes needed? There were some breaking changes after we fixed compatibility for Spark 4.0.0 RC1 in #6920, but now Spark has reached 4.0.0 RC6, which has less chance to receive more breaking changes. ### How was this patch tested? Changes are extracted from https://github.com/apache/kyuubi/pull/6928, which passed CI with Spark 4.0.0 RC6 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #7061 from pan3793/6920-followup. Closes #6920 17a1bd9e5 [Cheng Pan] [KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-05-16 11:47:35 +08:00
Cheng Pan	d5b01fa3e2	[KYUUBI #6939 ] Bump Spark 3.5.5 ### Why are the changes needed? Test Spark 3.5.5 Release Notes https://spark.apache.org/releases/spark-release-3-5-5.html ### How was this patch tested? Pass GHA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #6939 from pan3793/spark-3.5.5. Closes #6939 8c0288ae5 [Cheng Pan] ga 78b0e72db [Cheng Pan] nit 686a7b0a9 [Cheng Pan] fix d40cc5bba [Cheng Pan] Bump Spark 3.5.5 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2025-03-03 13:42:09 +08:00
Bowen Liang	d3520ddbce	[KYUUBI #6769 ] [RELEASE] Bump 1.11.0-SNAPSHOT # 🔍 Description ## Issue References 🔗 This pull request fixes # ## Describe Your Solution 🔧 Preparing v1.11.0-SNAPSHOT after branch-1.10 cut ```shell build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT" (cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT") ``` ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6769 from bowenliang123/bump-1.11. Closes #6769 6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name 465276204 [Bowen Liang] update package.json 81f2865e5 [Bowen Liang] bump Authored-by: Bowen Liang <liangbowen@gf.com.cn> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2024-10-23 17:10:56 +08:00
Cheng Pan	1bfc8c5840	[KYUUBI #6699 ] Bump Spark 4.0.0-preview2 # 🔍 Description Spark 4.0.0-preview2 RC1 passed the vote https://lists.apache.org/thread/4ctj2mlgs4q2yb4hdw2jy4z34p5yw2b1 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6699 from pan3793/spark-4.0.0-preview2. Closes #6699 2db1f645d [Cheng Pan] 4.0.0-preview2 42055bb1e [Cheng Pan] fix d29c0ef83 [Cheng Pan] disable delta test 98d323b95 [Cheng Pan] fix 2e782c00b [Cheng Pan] log4j-slf4j2-impl fde4bb6ba [Cheng Pan] spark-4.0.0-preview2 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-09-23 17:42:48 +08:00
Cheng Pan	03dcedd89e	[KYUUBI #6453 ] Make KSHC support Spark 4.0 and enable CI for Spark 4.0 # 🔍 Description This PR makes KSHC support Spark 4.0, and also makes sure that the KSHC jar compiled against Spark 3.5 is binary compatible with Spark 4.0. We are ready to enable CI for Spark 4.0, except for authZ module. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6453 from pan3793/spark4-ci. Closes #6453 695e3d7f7 [Cheng Pan] Update pom.xml 2eaa0f88a [Cheng Pan] Update .github/workflows/master.yml b1f540a34 [Cheng Pan] cross test 562839982 [Cheng Pan] fix 9f0c2e1be [Cheng Pan] fix 45f182462 [Cheng Pan] kshc 227ef5bae [Cheng Pan] fix 690a3b8b2 [Cheng Pan] Revert "fix" 87fe7678b [Cheng Pan] fix 60f55dbed [Cheng Pan] CI for Spark 4. Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-07 11:01:24 +08:00
Cheng Pan	1fb1f854eb	[KYUUBI #6439 ] kyuubi-util-scala test jar leaked to compile scope # 🔍 Description The `kyuubi-util-scala_2.12-<version>-tests.jar` accidentally leaked to the compile scope but should be in the test scope. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Run `build/dist` and check `dist/jars` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6439 from pan3793/util-scala-test. Closes #6439 0576248f5 [Cheng Pan] fix 2bf2408f5 [Cheng Pan] fix f7151dfc6 [Cheng Pan] kyuubi-util-scala test jar leaked to compile scope Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-04 11:31:58 +08:00
zhouyifan279	3ed912f5de	[KYUUBI #6247 ] Make KSHC binary compatible with multiple Spark versions # 🔍 Description ## Issue References 🔗 This pull request closes #6247 This also closes #6431 ## Describe Your Solution 🔧 Add a job `spark-connector-cross-version-test` in GitHub Actions to: 1. Build KSHC package with maven opt `-Pspark-3.5` 2. Run KSHC tests with maven opt `-Pspark-3.3` and `-Pspark-3.4` and KSHC package built in step 1 3. Fix the binary-compatible issue via reflection. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6436 from zhouyifan279/kshc-cross-version-test. Closes #6247 d3ac2ef47 [zhouyifan279] Tune the KSHC code to fix binary-compatible issues 4e14edcb5 [zhouyifan279] Fix invalid unit-tests-log name 56ca45d18 [zhouyifan279] Fix invalid unit-tests-log name 4c5ab7b9e [zhouyifan279] Update test log name 8a84e8812 [zhouyifan279] Add matrix scala 17cb67155 [zhouyifan279] [KYUUBI #6247] KSHC cross-version test Authored-by: zhouyifan279 <zhouyifan279@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-06-01 20:13:41 +08:00
Cheng Pan	6bdf2bdaf8	[KYUUBI #6392 ] Support javax.servlet and jakarta.servlet co-exist # 🔍 Description This PR makes `javax.servlet` and `jakarta.servlet` co-exist, by introducing `javax.servlet-api-4.0.1` and upgrade `jakarta.servlet-api` to 5.0.0. (6.0.0 requires JDK 11) Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118 while Kyuubi still uses `javax.servlet` in other modules, we should allow them to co-exist for a while. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GHA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6392 from pan3793/servlet. Closes #6392 27d412599 [Cheng Pan] fix 9f1e72272 [Cheng Pan] other spark modules f4545dc76 [Cheng Pan] fix 313826fa7 [Cheng Pan] exclude 7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-05-20 21:09:30 +08:00
Binjie Yang	eb278c562d	[RELEASE] Bump 1.10.0-SNAPSHOT	2024-03-13 14:24:49 +08:00
Cheng Pan	8cc9b98e25	[KYUUBI #5384 ][KSCH] Hive connector supports Spark 3.5 # 🔍 Description ## Issue References 🔗 This pull request fixes #5384 ## Describe Your Solution 🔧 ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6133 from Kwafoor/kyuubi_6073. Closes #5384 9234e35ad [Cheng Pan] fix 7766dfda5 [Cheng Pan] nit e9da162f8 [Cheng Pan] nit 676bfb26e [Cheng Pan] pretty c241859af [Cheng Pan] pretty 0eedcf82c [wangjunbo] compat with spark 3.3 3d866546c [wangjunbo] format code a0898f50f [wangjunbo] delete Unused import 9577f7fe8 [wangjunbo] [KYUUBI #5384] kyuubi-spark-connector-hive supports Spark 3.5 Lead-authored-by: Cheng Pan <chengpan@apache.org> Co-authored-by: wangjunbo <wangjunbo@qiyi.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-07 17:56:30 +08:00
yikaifei	5bee05e45f	[KYUUBI #6078 ] KSHC should handle the commit of the partitioned table as dynamic partition at write path # 🔍 Description ## Issue References 🔗 This pull request fixes https://github.com/apache/kyuubi/issues/6078. KSHC should handle the commit of a partitioned table as a dynamic partition commit on the write path, because the Apache Spark DataSourceV2 write process uses dynamic partitioning to handle static partitions. ## Describe Your Solution 🔧 Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change. ## Types of changes 🔖 - [x] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 #### Behavior Without This Pull Request ⚰️ #### Behavior With This Pull Request 🎉 #### Related Unit Tests --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6082 from Yikf/KYUUBI-6078. Closes #6078 2ae183672 [yikaifei] KSHC should handle the commit of the partitioned table as dynamic partition at write path Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-07 17:47:14 +08:00
Cheng Pan	f1cf1e42de	[KYUUBI #6131 ] Simplify Maven dependency management after dropping building support for Spark 3.1 # 🔍 Description ## Issue References 🔗 SPARK-33212 (fixed in 3.2.0) moved from `hadoop-client` to the shaded Hadoop client to simplify dependency management. Previously, we added some workarounds to handle Spark 3.1 dependency issues. Now that we have removed building support for Spark 3.1, we can remove those workarounds to simplify `pom.xml`. ## Describe Your Solution 🔧 As above. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6131 from pan3793/3-1-cleanup. Closes #6131 1341065a7 [Cheng Pan] nit 1d7323f6e [Cheng Pan] fix 9e2e3b747 [Cheng Pan] nit 271166b58 [Cheng Pan] test Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-06 22:31:06 +08:00
Cheng Pan	0a0af165e3	[KYUUBI #6125 ] Drop Kyuubi extension for Spark 3.1 # 🔍 Description ## Issue References 🔗 This pull request is the next step of deprecating and removing support of Spark 3.1 VOTE: https://lists.apache.org/thread/670fx1qx7rm0vpvk8k8094q2d0fthw5b VOTE RESULT: https://lists.apache.org/thread/0zdxg5zjnc1wpxmw9mgtsxp1ywqt6qvb ## Describe Your Solution 🔧 Drop module `kyuubi-extension-spark-3-1` and delete Spark 3.1 specific codes. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) Be nice. Be informative. Closes #6125 from pan3793/drop-spark-ext-3-1. Closes #6125 212012f18 [Cheng Pan] fix style 021532ccd [Cheng Pan] doc 329f69ab9 [Cheng Pan] address comments 43fac4201 [Cheng Pan] fix a12c8062c [Cheng Pan] fix dcf51c1a1 [Cheng Pan] minor 814a187a6 [Cheng Pan] Drop Kyuubi extension for Spark 3.1 Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2024-03-05 17:07:12 +08:00
yikaifei	47555eb900	[KYUUBI #5414 ][KSHC] Reader should not pollut the global hiveConf instance ### _Why are the changes needed?_ This PR aims to fix https://github.com/apache/kyuubi/issues/5414. `HiveReader` initialization incorrectly uses the global hadoopConf as hiveconf, which causes the reader to pollute the global hadoopConf and leads to job read failures. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No Closes #5424 from Yikf/orc-read. Closes #5414 d6bdf7be4 [yikaifei] [KYUUBI #5414] Reader should not polluted the global hiveconf instance Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-10-17 13:09:18 +08:00
sychen	32b6dc3b74	[KYUUBI #5426 ] [MINOR][KSHC] Avoid use class.newInstance directly ### _Why are the changes needed?_ Remove the deprecated usage. `c780db754e/src/java.base/share/classes/java/lang/Class.java (L534-L535)` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No. Closes #5426 from cxzl25/newInstance. Closes #5426 dcb679b95 [sychen] avoid use class.newInstance directly Authored-by: sychen <sychen@ctrip.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-10-16 21:25:39 +08:00
ITzhangqiang	e51095edaa	[KYUUBI #5365 ] Don't use Log4j2's extended throwable conversion pattern in default logging configurations ### _Why are the changes needed?_ The Apache Spark Community found a performance regression with log4j2. See https://github.com/apache/spark/pull/36747. This PR fixes the performance issue on our side. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No. Closes #5400 from ITzhangqiang/KYUUBI_5365. Closes #5365 dbb9d8b32 [ITzhangqiang] [KYUUBI #5365] Don't use Log4j2's extended throwable conversion pattern in default logging configurations Authored-by: ITzhangqiang <itzhangqiang@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-10-11 21:41:22 +08:00
zhaomin	167e6c1ca3	[KYUUBI #5317 ] [Bug] Hive Connector throws NotSerializableException on reading Hive Avro partitioned table ### _Why are the changes needed?_ close https://github.com/apache/kyuubi/issues/5317#issue-1904751001 ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No Closes #5319 from zhaomin1423/fixhive-connector. Closes #5317 02e5321dc [Cheng Pan] nit cadabf4ab [Cheng Pan] nit d38832f40 [zhaomin] improve ee5b62d84 [zhaomin] improve 794473468 [zhaomin] improve e3eca91fb [zhaomin] add tests d9302e2ba [zhaomin] [KYUUBI #5317] [Bug] Hive Connector throws NotSerializableException on reading Hive Avro partitioned table 0bc8ec16f [zhaomin] [KYUUBI #5317] [Bug] Hive Connector throws NotSerializableException on reading Hive Avro partitioned table Lead-authored-by: zhaomin <zhaomin1423@163.com> Co-authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-09-21 17:05:24 +08:00
Cheng Pan	6061a05f24	Bump 1.9.0-SNAPSHOT	2023-09-04 14:23:12 +08:00
yikaifei	0c987e96fa	[KYUUBI #5225 ] [KSHC] Unify the exception handling of v1 and v2 during dropDatabase ### _Why are the changes needed?_ This PR aims to unify the exception handling of v1 and v2 during dropDatabase ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No Closes #5225 from Yikf/hive-connector. Closes #5225 3be33af76 [yikaifei] [KSHC] Improve test Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>	2023-09-01 12:17:33 +08:00
liangbowen	bdf867b19a	[KYUUBI #5193 ] Make Spark hive connector plugin compilable on Scala 2.13 ### _Why are the changes needed?_ - to make Spark SQL hive connector plugin compilable on Scala 2.13 with Spark 3.3/3.4 - rename class name `FilePartitionReader` which is copied from Spark to `SparkFilePartitionReader`to fix the class mismatch error ``` [ERROR] [Error] /Users/bw/dev/incubator-kyuubi/extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/read/HivePartitionReaderFactory.scala:83: type mismatch; found : Iterator[org.apache.kyuubi.spark.connector.hive.read.HivePartitionedFileReader[org.apache.spark.sql.catalyst.InternalRow]] required: Iterator[org.apache.spark.sql.execution.datasources.v2.PartitionedFileReader[org.apache.spark.sql.catalyst.InternalRow]] ``` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No. Closes #5193 from bowenliang123/scala213-hivecon. Closes #5193 d8c6bf5f0 [liangbowen] defer toMap b20ad4eb1 [liangbowen] adapt spark hive connector plugin to Scala 2.13 Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: yikaifei <yikaifei@apache.org>	2023-08-23 13:58:17 +08:00
liangbowen	4213e20945	[KYUUBI #5177 ] Use Scala binary version placeholder in Maven module's artifactId suffix ### _Why are the changes needed?_ - Change hardcoded Scala's version 2.12 in Maven module's `artifactId` to placeholder `scala.binary.version` which is defined in project parent pom as 2.12 - Preparation for Scala 2.13/3.x support in the future - No impact on using or building Maven modules - Some ignorable warning messages for unstable artifactId will be thrown by Maven. ``` Warning: Some problems were encountered while building the effective model for org.apache.kyuubi:kyuubi-server_2.12:jar:1.8.0-SNAPSHOT Warning: 'artifactId' contains an expression but should be a constant ``` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request ### _Was this patch authored or co-authored using generative AI tooling?_ No. Closes #5175 from bowenliang123/artifactId-scala. Closes #5177 2eba29cfa [liangbowen] use placeholder of scala binary version for artifactId Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-08-20 16:03:23 +00:00
liangbowen	6ec326adb4	[KYUUBI #5039 ] [Improvement] Use semantic versions and remove redundant version comparison methods ### _Why are the changes needed?_ - Support initializing or comparing version with major version only, e.g "3" equivalent to "3.0" - Remove redundant version comparison methods by using semantic versions of Spark, Flink and Kyuubi - adding common `toDouble` method ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #5039 from bowenliang123/improve-semanticversion. Closes #5039 b6868264f [liangbowen] nit d39646b7d [liangbowen] SPARK_ENGINE_RUNTIME_VERSION 9148caad0 [liangbowen] use semantic versions ecc3b4af6 [mans2singh] [KYUUBI #5086] [KYUUBI # 5085] Update config section of deploy on kubernetes Lead-authored-by: liangbowen <liangbowen@gf.com.cn> Co-authored-by: mans2singh <mans2singh@yahoo.com> Signed-off-by: liangbowen <liangbowen@gf.com.cn>	2023-07-25 18:04:45 +08:00
yikaifei	5915d682b5	[KYUUBI #5022 ] [KSHC] CreateTable should use the correct provider ### _Why are the changes needed?_ This PR aims to fix a bug, In KSHC, `catalog.createTable` should use the correct provider. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #5022 from Yikf/KSHC-createTable. Closes #5022 cd8cb1cf2 [yikaifei] CreateTable should use the correct provider Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: yikaifei <yikaifei@apache.org>	2023-07-07 12:04:55 +08:00
yikaifei	46f8e0ca94	[KYUUBI #5017 ] [KSHC] Support Parquet/Orc provider is splitable ### _Why are the changes needed?_ This PR aims to make the Parquet/ORC provider splittable. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #5017 from Yikf/KSHC-support-split. Closes #5017 9dc3d3d56 [yikaifei] Support Parquet/Orc provider is splitable Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: yikaifei <yikaifei@apache.org>	2023-07-06 19:21:05 +08:00
yikaifei	da82217388	[KYUUBI #5023 ] [KSHC] TableIdentify don't attach catalog ### _Why are the changes needed?_ As title, In KSHC, HiveTable's identify does not attach the catalog to prevent an incorrect catalogName. default catalog is "spark_catalog" ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #5023 from Yikf/tableName2. Closes #5023 86b6a58d0 [yikaifei] KSHC v1IdentifierNoCatalog in spark3.4 Authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-07-06 18:26:37 +08:00
zhaomin	7feb535668	[KYUUBI #5028 ] Update session hadoop conf to catalog hadoop conf ### _Why are the changes needed?_ ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #5028 from zhaomin1423/fix_hive_connector. Closes #5028 d9c7e9c8a [zhaomin] Update session hadoop conf to catalog hadoop conf Authored-by: zhaomin <zhaomin1423@163.com> Signed-off-by: ulyssesyou <ulyssesyou@apache.org>	2023-07-06 18:25:12 +08:00
Cheng Pan	1d5ac07dfc	[KYUUBI #4999 ] [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4 ### _Why are the changes needed?_ This PR aims to make KSHC support Apache Spark 3.4. - KSHC supports Apache Spark 3.4 - Make the Apache Kyuubi `codecov` module contain the spark-3.4 profile, so that Apache Kyuubi CI can cover some modules. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #4999 from Yikf/kudu-spark3.4. Closes #4999 6a35e54b8 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala 66bb742eb [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala 7be517c7f [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala ae23133d1 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala dda5e6521 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala e43a25dff [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala 54f52f16d [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala 0955b544b [Cheng Pan] Update pom.xml 38a1383d9 [yikaifei] codecov module should contain the spark 3.4 profile Lead-authored-by: Cheng Pan <pan3793@gmail.com> Co-authored-by: yikaifei <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-07-04 17:25:57 +08:00
zhaomin	80bc028e6d	[KYUUBI #4995 ] Use hadoop conf and hive conf from catalog options ### _Why are the changes needed?_ There are hdfs-site.xml, hive-site, etc in spark job classpath, but we should use hadoop conf and hive conf from catalog options. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request Closes #4995 from zhaomin1423/fix_hive_connector. Closes #4995 64429fdcb [Xiao Zhao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveTableCatalog.scala d921be750 [zhaomin] fix 375934d65 [zhaomin] Using hadoop conf and hive conf from catalog options Lead-authored-by: zhaomin <zhaomin1423@163.com> Co-authored-by: Xiao Zhao <zhaomin1423@163.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-06-26 15:04:39 +08:00
liangbowen	eeee5c1ae3	[KYUUBI #4959 ] [MINOR] Code improvements for Scala ### _Why are the changes needed?_ - To improve Scala code with corrections, simplification, scala style, redundancy cleaning-up. No feature changes introduced. Corrections: - Class doesn't correspond to file name (SparkListenerExtensionTest) - Correct package name in ResultSetUtil and PySparkTests Improvements: - 'var' could be a 'val' - GetOrElse(null) to orNull Cleanup & Simplification: - Redundant cast inspection - Redundant collection conversion - Simplify boolean expression - Redundant new on case class - Redundant return - Unnecessary parentheses - Unnecessary partial function - Simplifiable empty check - Anonymous function convertible to a method value Scala Style: - Constructing range for seq indices - Get and getOrElse to getOrElse - Convert expression to Single Abstract Method (SAM) - Scala unnecessary semicolon inspection - Map and getOrElse(false) to exists - Map and flatten to flatMap - Null initializer can be replaced by _ - scaladoc link to method Other Improvements: - Replace map and getOrElse(true) with forall - Unit return type in the argument of map - Size to length on arrays and strings - Type check can be pattern matching - Java mutator method accessed as parameterless - Procedure syntax in method definition ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4959 from bowenliang123/scala-Improve. Closes #4959 2d36ff351 [liangbowen] code improvement for Scala Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: liangbowen <liangbowen@gf.com.cn>	2023-06-16 21:20:17 +08:00
Cheng Pan	01d80eb272	[KYUUBI #4870 ] Add kyuubi-util and kyuubi-util-scala modules ### _Why are the changes needed?_ Close #4870 ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4872 from pan3793/util. Closes #4870 0b9fe3cba [Cheng Pan] nit ecc5ee4f2 [Cheng Pan] fix 63be7a20c [Cheng Pan] test 85363c187 [Cheng Pan] style 2227247dd [Cheng Pan] fix package 11d10a081 [Cheng Pan] Add kyuubi-util and kyuubi-util-scala modules Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-05-22 22:13:56 +08:00
Cheng Pan	b2fe49343e	[KYUUBI #4620 ] [KSHC] Cut off transitive dependencies ### _Why are the changes needed?_ Remove all transitive dependencies to make the connector easy for downstream projects to consume. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4620 from pan3793/kshc. Closes #4620 407f669f5 [Cheng Pan] [KSHC] Cut off transitive dependenices Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-03-27 18:35:30 +08:00
Cheng Pan	6f803c0015	[KYUUBI #4560 ] [KSHC] Support Kerberized HMS in cluster mode w/o keytab ### _Why are the changes needed?_ This PR aims to make Kyuubi Spark Hive Connector (KSHC) support kerberized HMS in cluster mode w/o keytab (which is the typical use case in Kyuubi) by implementing a `HadoopDelegationTokenProvider`. To enable access to a kerberized HMS using KSHC, the minimal configurations are ``` spark.sql.catalog.warm=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog spark.sql.catalog.warm.hive.metastore.uris=<thrift-uris> ``` It is then possible to run federated queries across metastores ``` SELECT * FROM spark_catalog.db1.tbl1 JOIN warm.db2.tbl2 ON ... ``` In addition, token renewal can be disabled for each catalog explicitly ``` spark.sql.catalog.warm.delegation.token.renewal.enabled=false ``` The current implementation has a limitation: the catalog configuration must be present when the Spark application bootstraps, which means the catalog configurations should be set in `spark-defaults.conf` or appended as `--conf` like: ``` spark-[sql|shell|submit] \ --conf spark.sql.catalog.xxx=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog --conf spark.sql.catalog.xxx.hive.abc=xyz ``` Dynamic registration through a SET statement does not work, e.g. `SET spark.sql.catalog.xxx=` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [x] Add screenshots for manual tests if appropriate ``` > (select count() from hive_2.mammut.test_7) union ( select count() from spark_catalog.test.test01 limit 1); +-----------+ | count(1) | +-----------+ | 4 | | 1 | +-----------+ 2 rows selected (8.378 seconds) ``` - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4560 from pan3793/shc-token. Closes #4560 fe8cd0c6d [Cheng Pan] Centralized metastore token signature fallback logic 851159559 [Cheng Pan] comments fc3b4d596 [Cheng Pan] hive.metastore.token.signature fallback to hive.metastore.uris fb7eb033f [Cheng Pan] unused import 858b39024 [Cheng Pan] New catalog property delegation.token.renewal.enabled 28ec5a543 [Cheng Pan] disable hms client retry 52044d474 [Cheng Pan] update comments 33b241831 [Cheng Pan] [KSHC] Support Kerberos by implementing KyuubiHiveConnectorDelegationTokenProvider Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-03-24 11:34:08 +08:00
Yikf	41e9505722	[KYUUBI #4525 ][KSHC] Partitioning predicates should take effect to filter data ### _Why are the changes needed?_ This PR aims to close https://github.com/apache/kyuubi/issues/4525. The root cause of this problem is that Apache Spark does predicate push-down in `V2ScanRelationPushDown`, but the spark-hive-connector does not apply push-down predicates for data filtering. ### _How was this patch tested?_ - [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4528 from Yikf/KYUUBI-4525. Closes #4525 a65a1873f [Yikf] Partitioning predicates should take effect to filter data Authored-by: Yikf <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-03-16 10:12:44 +08:00
Cheng Pan	dd9b58ae81	[KYUUBI #4488 ] [KSHC] Keep object original name defined in HiveBridgeHelper ### _Why are the changes needed?_ Respect Java/Scala coding conventions in KSHC (Kyuubi Spark Hive Connector). For singleton(`object` in Scala) invoking, use `AbcUtils.method(...)` instead of `abcUtils.method(...)` ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4488 from pan3793/shc-rename. Closes #4488 ec9a80198 [Cheng Pan] nit 84d3bb413 [Cheng Pan] Keep object orignal name defined in HiveBridgeHelper Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-03-09 19:02:08 +08:00
Yikf	19b4b0a3fd	[KYUUBI #4432 ] jobId across tasks should be consistent to meet the contract expected by Hadoop committers ### _Why are the changes needed?_ jobId across tasks should be consistent to meet the contract expected by Hadoop committers ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4432 from Yikf/jobid. Closes #4432 4e7401c91 [Yikf] jobId across tasks should be consistent Authored-by: Yikf <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-03-01 16:16:55 +08:00
Yikf	3b73e1d64a	[KYUUBI #4391 ] Improve code for hive-connector FileWriterFactory ### _Why are the changes needed?_ This pr aims to improve code for hive-connector FileWriterFactory, the main goal is to reduce duplicate copies of spark code. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4391 from Yikf/improve-code. Closes #4391 7991f145 [Yikf] improve code for hive-connector FileWriterFactory Authored-by: Yikf <yikaifei@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-02-21 17:24:53 +08:00
Yikf	4feb83d0f3	[KYUUBI #4359 ] Workaround for SPARK-41448 to keep FileWriterFactory serializable ### _Why are the changes needed?_ [SPARK-41448](https://issues.apache.org/jira/browse/SPARK-41448) makes MR job IDs consistent in FileBatchWriter and FileFormatWriter in Apache Spark 3.3.2, but it introduces a serialization issue because JobId is non-serializable. This PR aims to rewrite `FileWriterFactory` to circumvent the problem. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4359 from Yikf/FileWriterFactory. Closes #4359 dd8c90fe [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/write/FileWriterFactory.scala 1e5164ec [Yikf] Make a serializable jobTrackerId instead of a non-serializable JobID in FileWriterFactory Lead-authored-by: Yikf <yikaifei@apache.org> Co-authored-by: Cheng Pan <pan3793@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-02-18 22:12:53 +08:00
Cheng Pan	4e226ac3cc	Bump 1.8.0-SNAPSHOT	2023-02-10 15:25:49 +08:00
jiaoqingbo	c1e2e57dd9	[KYUUBI #4222 ] Use hiveTableCatalog to updateTableStats instead of sessionCatalog fix #4222 - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [x] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4237 from jiaoqingbo/kyuubi4222. Closes #4237 7677d69f [jiaoqingbo] code review 538d436a [jiaoqingbo] [Kyuubi #4222] Use hiveTableCatalog to updateTableStats instead of sessionCatalog Authored-by: jiaoqingbo <1178404354@qq.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-02-03 05:43:36 +00:00
liangbowen	faecd8f23d	[KYUUBI #4127 ] Align ScalaTest Plus plugin versions and bump ScalaTest from 3.2.9 to 3.2.15 ### _Why are the changes needed?_ - bump `ScalaTest` version from `3.2.9` to `3.2.15`, updated to use same scala version `2.12.17` in Kyuubi. (Release notes: https://github.com/scalatest/scalatest/releases/tag/release-3.2.15) - bump `scalatest-maven-plugin` from `2.0.2` to `2.2.0` (https://github.com/scalatest/scalatest-maven-plugin/releases/tag/release-2.2.0) - align `scalatestplus` versions to the version above, removing the misleading `scalacheck.version` property, (ScalaTest + ScalaCheck Version: https://www.scalatest.org/plus/scalacheck/versions) - bump scalatestplus plugins to `3.2.15.0` with bumping dependency - scalatestplus-scalacheck (https://github.com/scalatest/scalatestplus-scalacheck/releases/tag/release-3.2.15.0-for-scalacheck-1.17) - scalatestplus-mockito (https://github.com/scalatest/scalatestplus-mockito/releases/tag/release-3.2.15.0-for-mockito-4.6) - mockito from `3.4` to `4.6` (https://github.com/mockito/mockito/releases/tag/v4.6.0) - scalacheck from `1.15` to `1.17` (https://github.com/typelevel/scalacheck/releases/tag/v1.17.0) ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4127 from bowenliang123/scalatest-3.2.15. Closes #4127 ac661a55 [liangbowen] bump scalatest and plugin versions Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Cheng Pan <chengpan@apache.org>	2023-01-11 16:08:12 +08:00
Cheng Pan	40ef3d624c	[KYUUBI #3864 ] Add missing log4j2-test.xml for Kyuubi Spark Hive Connector ### _Why are the changes needed?_ Avoid too much logs on console in CI. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3864 from pan3793/nit. Closes #3864 39687176 [Cheng Pan] Add missing log4j2-test.xml for Kyuubi Spark Hive Connector Authored-by: Cheng Pan <chengpan@apache.org> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-11-28 19:53:30 +08:00
liangbowen	2ac10f91d5	[KYUUBI #3842 ] [Improvement] Support maven pom.xml code style check with spotless plugin ### _Why are the changes needed?_ Introduce code style check support for Maven's pom.xml with sortPom in spotless maven plugin. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3843 from bowenliang123/spotless-pom. Closes #3842 3c654597 [liangbowen] apply to pom.xml fd1536f7 [liangbowen] set expandEmptyElements to true e498423f [liangbowen] apply spotless:apply to all pom.xml e46bcfec [liangbowen] add pom style check support in spotless Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-11-23 22:08:00 +08:00
yikf	bbf916d1de	[KYUUBI #3529 ] Supple DDL tests for Spark Hive connector and fix consistent issues w/ V1 implementation ### _Why are the changes needed?_ Fix https://github.com/apache/incubator-kyuubi/issues/3529 The intent of this PR is the following: - Add tests related to catalog, including the listTables, loadTable, and listNamespaces methods; - Initialize the DDL test framework. - Add CreateNamespaceSuite, DropNamespaceSuite and ShowTablesSuite to check for consistency with V1 in hive connector. - Rectify the fault that namespaces are deleted in cascades. During cascades, ignore the exception that the table exists in the namespace. - Fix the tableName problem of HiveTable, which should contain namespace name. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3530 from Yikf/hivev2-test. Closes #3529 d0af0760 [yikf] Add tests to check for consistency with V1 Authored-by: yikf <yikaifei1@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-10-15 22:45:22 +08:00
yikf	c3c7707203	[KYUUBI #3464 ] Support for pooling external catalog ### _Why are the changes needed?_ Fix https://github.com/apache/incubator-kyuubi/issues/3464. Currently, Kyuubi supports a hive connector for reading and writing hive tables, implemented on top of [Apache Spark DataSource V2](https://www.databricks.com/session/apache-spark-data-source-v2), but there is a potential issue. Kyuubi uses `kyuubi.engine.single.spark.session`=[false](https://kyuubi.apache.org/docs/latest/deployment/settings.html#:~:text=kyuubi.engine.single.spark.session) to provide concurrent SQL execution with context isolation, which causes spark.newSession to be invoked for each transaction. In the Spark v1 catalog, externalCatalog is shared across multiple sessions, but the v2 catalog architecture is very different: v2 catalogs are managed by `CatalogManager`, which is [session level](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala#L97). This means that each session has a separate catalogManager, so with multiple sessions, hivecatalog is initialized multiple times. This causes two problems: 1 multiple sessions may waste resources initializing multiple HiveExternalCatalog instances, which may cause the JVM namespace to swell. 2 multiple HiveClient connections may be initialized. This issue aims to pool externalCatalog to address the potential issues mentioned above. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3465 from Yikf/catalog-pool. Closes #3464 5e8a94dd [yikf] Support for pooling external catalog Authored-by: yikf <yikaifei1@gmail.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-09-15 13:18:28 +08:00
yikf	3808dbdea5	[KYUUBI #3437 ] Refactory class location of the hive connector ### _Why are the changes needed?_ Fix https://github.com/apache/incubator-kyuubi/issues/3437 This pr aims to refactory class location of the hive connector ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3438 from Yikf/hive-connector-rename. Closes #3437 a41dd15b [yikf] Refactory class location of the hive connector Authored-by: yikf <yikaifei1@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-09-07 11:55:23 +00:00
yikf	996fbc6905	[KYUUBI #3366 ] Support hive write code path ### _Why are the changes needed?_ Fix https://github.com/apache/incubator-kyuubi/issues/3366 This PR is a subtask of Kyuubi's support for Hive data sources, and aims to support write code path ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3367 from Yikf/support-hive-write. Closes #3366 85197a65 [yikf] Support hive write code path Authored-by: yikf <yikaifei1@gmail.com> Signed-off-by: ulysses-you <ulyssesyou@apache.org>	2022-09-07 10:02:38 +08:00
yikf	3adcebd557	[KYUUBI #3378 ][SUBTASK] Improve hive-connector module tests ### _Why are the changes needed?_ Fix https://github.com/apache/incubator-kyuubi/issues/3378 This pr aims to improve hive-connector module tests and make CI perform tests for the hive connector ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3379 from Yikf/hive-connector-test. Closes #3378 72ad050c [yikf] CI test for hive-connector Authored-by: yikf <yikaifei1@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-09-03 04:44:01 +08:00
yikf	d822b3eba3	[KYUUBI #3259 ] Initial implementation of the Hive Connector based on the Spark datasource V2 ### _Why are the changes needed?_ In a modern database architecture, users may have a strong need for federated queries. Since a large number of Hive warehouses already exist in historical databases, we tried to implement the Hive V2 Datasource based on Spark Datasource V2 to meet this need. For the discussion, see https://lists.apache.org/thread/fq8ywr58rzf9bycflj1q4fl1xyz2rq2w This PR is the first step in fixing https://github.com/apache/incubator-kyuubi/issues/3259, covering: - the initial implementation - support for the read code path ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3260 from yikf/hive-v2-connector. Closes #3259 753aca30 [yikf] Initial implementation of the Hive Connector based on the Spark datasource V2 Authored-by: yikf <yikaifei1@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>	2022-08-23 13:48:21 +08:00

50 Commits