Commit Graph

50 Commits

Author SHA1 Message Date
tian bao
47063d9264
[KYUUBI #7129] Support PARQUET hive table pushdown filter
### Why are the changes needed?

Previously, the `HiveScan` class was used to read data. If the table is determined to be of PARQUET type, the `ParquetScan` from Spark DataSource V2 can be used instead. `ParquetScan` supports filter pushdown, but `HiveScan` does not yet support it.

The conversion can be controlled by setting `spark.sql.kyuubi.hive.connector.read.convertMetastoreParquet`. When enabled, the data source PARQUET reader, instead of the Hive SerDe, is used to process PARQUET tables created with HiveQL syntax.

close https://github.com/apache/kyuubi/issues/7129

### How was this patch tested?

added unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #7130 from flaming-archer/master_parquet_filterdown.

Closes #7129

d7059dca4 [tian bao] Support PARQUET hive table pushdown filter

Authored-by: tian bao <2011xuesong@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-17 14:42:46 +08:00
tian bao
60371b5dd5
[KYUUBI #7122] Support ORC hive table pushdown filter
### Why are the changes needed?

Previously, the `HiveScan` class was used to read data. If the table is determined to be of ORC type, the `OrcScan` from Spark DataSource V2 can be used instead. `OrcScan` supports filter pushdown, but `HiveScan` does not yet support it.

In our testing, we are able to achieve approximately 2x performance improvement.

The conversion can be controlled by setting `spark.sql.kyuubi.hive.connector.read.convertMetastoreOrc`. When enabled, the data source ORC reader, instead of the Hive SerDe, is used to process ORC tables created with HiveQL syntax.

close https://github.com/apache/kyuubi/issues/7122

### How was this patch tested?

added unit test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #7123 from flaming-archer/master_scanbuilder_new.

Closes #7122

c3f412f90 [tian bao] add case _
2be48909f [tian bao] Merge branch 'master_scanbuilder_new' of github.com:flaming-archer/kyuubi into master_scanbuilder_new
c825d0f8c [tian bao] review change
8a26d6a8a [tian bao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/KyuubiHiveConnectorConf.scala
68d41969f [tian bao] review change
bed007fea [tian bao] review change
b89e6e67a [tian bao] Optimize UT
5a8941b2d [tian bao] fix failed ut
dc1ba47e3 [tian bao] orc pushdown version 0

Authored-by: tian bao <2011xuesong@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-07-09 13:38:51 +08:00
Cheng Pan
e366b0950f
[KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0
### Why are the changes needed?

There were some breaking changes after we fixed compatibility for Spark 4.0.0 RC1 in #6920. Spark has now reached 4.0.0 RC6, which is less likely to receive further breaking changes.

### How was this patch tested?

Changes are extracted from https://github.com/apache/kyuubi/pull/6928, which passed CI with Spark 4.0.0 RC6

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7061 from pan3793/6920-followup.

Closes #6920

17a1bd9e5 [Cheng Pan] [KYUUBI #6920][FOLLOWUP] Spark SQL engine supports Spark 4.0

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-05-16 11:47:35 +08:00
Cheng Pan
d5b01fa3e2
[KYUUBI #6939] Bump Spark 3.5.5
### Why are the changes needed?

Test Spark 3.5.5 Release Notes

https://spark.apache.org/releases/spark-release-3-5-5.html

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6939 from pan3793/spark-3.5.5.

Closes #6939

8c0288ae5 [Cheng Pan] ga
78b0e72db [Cheng Pan] nit
686a7b0a9 [Cheng Pan] fix
d40cc5bba [Cheng Pan] Bump Spark 3.5.5

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-03-03 13:42:09 +08:00
Bowen Liang
d3520ddbce
[KYUUBI #6769] [RELEASE] Bump 1.11.0-SNAPSHOT
# 🔍 Description
## Issue References 🔗

This pull request fixes #

## Describe Your Solution 🔧

Preparing v1.11.0-SNAPSHOT after branch-1.10 cut

```shell
build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="1.11.0-SNAPSHOT"
(cd kyuubi-server/web-ui && npm version "1.11.0-SNAPSHOT")
```

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6769 from bowenliang123/bump-1.11.

Closes #6769

6db219d28 [Bowen Liang] get latest_branch by sorting version in branch name
465276204 [Bowen Liang] update package.json
81f2865e5 [Bowen Liang] bump

Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2024-10-23 17:10:56 +08:00
Cheng Pan
1bfc8c5840
[KYUUBI #6699] Bump Spark 4.0.0-preview2
# 🔍 Description

Spark 4.0.0-preview2 RC1 passed the vote
https://lists.apache.org/thread/4ctj2mlgs4q2yb4hdw2jy4z34p5yw2b1

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6699 from pan3793/spark-4.0.0-preview2.

Closes #6699

2db1f645d [Cheng Pan] 4.0.0-preview2
42055bb1e [Cheng Pan] fix
d29c0ef83 [Cheng Pan] disable delta test
98d323b95 [Cheng Pan] fix
2e782c00b [Cheng Pan] log4j-slf4j2-impl
fde4bb6ba [Cheng Pan] spark-4.0.0-preview2

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-09-23 17:42:48 +08:00
Cheng Pan
03dcedd89e
[KYUUBI #6453] Make KSHC support Spark 4.0 and enable CI for Spark 4.0
# 🔍 Description

This PR makes KSHC support Spark 4.0, and also makes sure that the KSHC jar compiled against Spark 3.5 is binary compatible with Spark 4.0.

We are ready to enable CI for Spark 4.0, except for the authZ module.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6453 from pan3793/spark4-ci.

Closes #6453

695e3d7f7 [Cheng Pan] Update pom.xml
2eaa0f88a [Cheng Pan] Update .github/workflows/master.yml
b1f540a34 [Cheng Pan] cross test
562839982 [Cheng Pan] fix
9f0c2e1be [Cheng Pan] fix
45f182462 [Cheng Pan] kshc
227ef5bae [Cheng Pan] fix
690a3b8b2 [Cheng Pan] Revert "fix"
87fe7678b [Cheng Pan] fix
60f55dbed [Cheng Pan] CI for Spark 4.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-07 11:01:24 +08:00
Cheng Pan
1fb1f854eb
[KYUUBI #6439] kyuubi-util-scala test jar leaked to compile scope
# 🔍 Description

The `kyuubi-util-scala_2.12-<version>-tests.jar` accidentally leaked to the compile scope but should be in the test scope.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Run `build/dist` and check `dist/jars`

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6439 from pan3793/util-scala-test.

Closes #6439

0576248f5 [Cheng Pan] fix
2bf2408f5 [Cheng Pan] fix
f7151dfc6 [Cheng Pan] kyuubi-util-scala test jar leaked to compile scope

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-04 11:31:58 +08:00
zhouyifan279
3ed912f5de
[KYUUBI #6247] Make KSHC binary compatible with multiple Spark versions
# 🔍 Description
## Issue References 🔗

This pull request closes #6247

This also closes #6431

## Describe Your Solution 🔧
Add a job `spark-connector-cross-version-test` in GitHub Actions to:
1. Build the KSHC package with the Maven option `-Pspark-3.5`
2. Run the KSHC tests with the Maven options `-Pspark-3.3` and `-Pspark-3.4` against the KSHC package built in step 1
3. Fix the binary-compatibility issues via reflection.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6436 from zhouyifan279/kshc-cross-version-test.

Closes #6247

d3ac2ef47 [zhouyifan279] Tune the KSHC code to fix binary-compatible issues
4e14edcb5 [zhouyifan279] Fix invalid unit-tests-log name
56ca45d18 [zhouyifan279] Fix invalid unit-tests-log name
4c5ab7b9e [zhouyifan279] Update test log name
8a84e8812 [zhouyifan279] Add matrix scala
17cb67155 [zhouyifan279] [KYUUBI #6247] KSHC cross-version test

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-06-01 20:13:41 +08:00
Cheng Pan
6bdf2bdaf8
[KYUUBI #6392] Support javax.servlet and jakarta.servlet co-exist
# 🔍 Description

This PR makes `javax.servlet` and `jakarta.servlet` co-exist by introducing `javax.servlet-api-4.0.1` and upgrading `jakarta.servlet-api` to 5.0.0. (6.0.0 requires JDK 11.)

Spark 4.0 migrated from `javax.servlet` to `jakarta.servlet` in SPARK-47118, while Kyuubi still uses `javax.servlet` in other modules, so we should allow them to co-exist for a while.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6392 from pan3793/servlet.

Closes #6392

27d412599 [Cheng Pan] fix
9f1e72272 [Cheng Pan] other spark modules
f4545dc76 [Cheng Pan] fix
313826fa7 [Cheng Pan] exclude
7d5028154 [Cheng Pan] Support javax.servlet and jakarta.servlet co-exist

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-05-20 21:09:30 +08:00
Binjie Yang
eb278c562d
[RELEASE] Bump 1.10.0-SNAPSHOT
2024-03-13 14:24:49 +08:00
Cheng Pan
8cc9b98e25
[KYUUBI #5384][KSHC] Hive connector supports Spark 3.5
# 🔍 Description
## Issue References 🔗

This pull request fixes #5384

## Describe Your Solution 🔧

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6133 from Kwafoor/kyuubi_6073.

Closes #5384

9234e35ad [Cheng Pan] fix
7766dfda5 [Cheng Pan] nit
e9da162f8 [Cheng Pan] nit
676bfb26e [Cheng Pan] pretty
c241859af [Cheng Pan] pretty
0eedcf82c [wangjunbo] compat with spark 3.3
3d866546c [wangjunbo] format code
a0898f50f [wangjunbo] delete Unused import
9577f7fe8 [wangjunbo] [KYUUBI #5384] kyuubi-spark-connector-hive supports Spark 3.5

Lead-authored-by: Cheng Pan <chengpan@apache.org>
Co-authored-by: wangjunbo <wangjunbo@qiyi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-03-07 17:56:30 +08:00
yikaifei
5bee05e45f
[KYUUBI #6078] KSHC should handle the commit of the partitioned table as dynamic partition at write path
# 🔍 Description
## Issue References 🔗

This pull request fixes https://github.com/apache/kyuubi/issues/6078. KSHC should handle the commit of a partitioned table as a dynamic partition on the write path, because writing with Apache Spark DataSourceV2 uses dynamic partitioning to handle static partitions.

## Describe Your Solution 🔧

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6082 from Yikf/KYUUBI-6078.

Closes #6078

2ae183672 [yikaifei] KSHC should handle the commit of the partitioned table as dynamic partition at write path

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-03-07 17:47:14 +08:00
Cheng Pan
f1cf1e42de
[KYUUBI #6131] Simplify Maven dependency management after dropping building support for Spark 3.1
# 🔍 Description
## Issue References 🔗

SPARK-33212 (fixed in 3.2.0) moved from `hadoop-client` to the shaded Hadoop client to simplify dependency management. Previously, we added some workarounds to handle Spark 3.1 dependency issues. Now that building support for Spark 3.1 has been removed, we can drop those workarounds to simplify `pom.xml`.

## Describe Your Solution 🔧

As above.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6131 from pan3793/3-1-cleanup.

Closes #6131

1341065a7 [Cheng Pan] nit
1d7323f6e [Cheng Pan] fix
9e2e3b747 [Cheng Pan] nit
271166b58 [Cheng Pan] test

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-03-06 22:31:06 +08:00
Cheng Pan
0a0af165e3
[KYUUBI #6125] Drop Kyuubi extension for Spark 3.1
# 🔍 Description
## Issue References 🔗

This pull request is the next step of deprecating and removing support of Spark 3.1

VOTE: https://lists.apache.org/thread/670fx1qx7rm0vpvk8k8094q2d0fthw5b
VOTE RESULT: https://lists.apache.org/thread/0zdxg5zjnc1wpxmw9mgtsxp1ywqt6qvb

## Describe Your Solution 🔧

Drop the module `kyuubi-extension-spark-3-1` and delete Spark 3.1-specific code.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

Closes #6125 from pan3793/drop-spark-ext-3-1.

Closes #6125

212012f18 [Cheng Pan] fix style
021532ccd [Cheng Pan] doc
329f69ab9 [Cheng Pan] address comments
43fac4201 [Cheng Pan] fix
a12c8062c [Cheng Pan] fix
dcf51c1a1 [Cheng Pan] minor
814a187a6 [Cheng Pan] Drop Kyuubi extension for Spark 3.1

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-03-05 17:07:12 +08:00
yikaifei
47555eb900
[KYUUBI #5414][KSHC] Reader should not pollute the global hiveConf instance
### _Why are the changes needed?_

This pr aims to fix https://github.com/apache/kyuubi/issues/5414.

`HiveReader` initialization incorrectly uses the global hadoopConf as the hiveConf, which causes the reader to pollute the global hadoopConf and makes job reads fail.
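
The failure mode described above can be sketched in isolation. The following is a minimal illustration, not the connector's code: a plain `Map` stands in for the Hadoop/Hive configuration object, and the class, method, and key names are hypothetical. It shows why writing into a shared configuration leaks state, and how copying first keeps the shared instance untouched.

```java
import java.util.HashMap;
import java.util.Map;

final class ConfIsolationSketch {
    // Hypothetical key that a reader might set while preparing a scan.
    static final String READ_COLUMNS = "example.read.columns";

    // Problematic shape: the reader writes into the shared map it was handed.
    static Map<String, String> prepareInPlace(Map<String, String> shared, String columns) {
        shared.put(READ_COLUMNS, columns);
        return shared;
    }

    // Safer shape: copy first, then mutate only the private copy.
    static Map<String, String> prepareOnCopy(Map<String, String> shared, String columns) {
        Map<String, String> local = new HashMap<>(shared);
        local.put(READ_COLUMNS, columns);
        return local;
    }

    public static void main(String[] args) {
        Map<String, String> global = new HashMap<>();
        prepareOnCopy(global, "a,b");
        System.out.println(global.containsKey(READ_COLUMNS)); // false: global is untouched
        prepareInPlace(global, "a,b");
        System.out.println(global.containsKey(READ_COLUMNS)); // true: global was polluted
    }
}
```

With Hadoop's own `Configuration` class, the analogous defensive copy is the copy constructor, `new Configuration(other)`.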

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes #5424 from Yikf/orc-read.

Closes #5414

d6bdf7be4 [yikaifei] [KYUUBI #5414] Reader should not polluted the global hiveconf instance

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-10-17 13:09:18 +08:00
sychen
32b6dc3b74
[KYUUBI #5426] [MINOR][KSHC] Avoid using `Class.newInstance` directly
### _Why are the changes needed?_

Remove the deprecated usage.

c780db754e/src/java.base/share/classes/java/lang/Class.java (L534-L535)
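
For reference, `Class.newInstance` has been deprecated since Java 9 because it rethrows any exception thrown by the constructor, including checked ones, without compile-time checking. A minimal sketch of the usual replacement follows; the `Widget` class and the `instantiate` helper are hypothetical names used only for illustration.

```java
import java.lang.reflect.InvocationTargetException;

final class NewInstanceSketch {
    // Hypothetical stand-in for a class that is instantiated reflectively.
    static final class Widget {
        final String name;

        Widget() {
            this.name = "default";
        }
    }

    // Replacement for clazz.newInstance(): look up the no-arg constructor explicitly.
    static <T> T instantiate(Class<T> clazz) {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (InvocationTargetException e) {
            // The constructor itself threw; its exception is available as the cause.
            throw new IllegalStateException("Constructor of " + clazz.getName() + " threw", e.getCause());
        } catch (ReflectiveOperationException e) {
            // No no-arg constructor, abstract class, or inaccessible constructor.
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        Widget w = instantiate(Widget.class);
        System.out.println(w.name);
    }
}
```

Unlike `Class.newInstance`, this form wraps constructor failures in `InvocationTargetException`, so callers see them as ordinary reflective errors.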

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No.

Closes #5426 from cxzl25/newInstance.

Closes #5426

dcb679b95 [sychen] avoid use class.newInstance directly

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-10-16 21:25:39 +08:00
ITzhangqiang
e51095edaa
[KYUUBI #5365] Don't use Log4j2's extended throwable conversion pattern in default logging configurations
### _Why are the changes needed?_

The Apache Spark Community found a performance regression with log4j2. See https://github.com/apache/spark/pull/36747.

This PR fixes the performance issue on our side.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_
No.

Closes #5400 from ITzhangqiang/KYUUBI_5365.

Closes #5365

dbb9d8b32 [ITzhangqiang] [KYUUBI #5365] Don't use Log4j2's extended throwable conversion pattern in default logging configurations

Authored-by: ITzhangqiang <itzhangqiang@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-10-11 21:41:22 +08:00
zhaomin
167e6c1ca3
[KYUUBI #5317] [Bug] Hive Connector throws NotSerializableException on reading Hive Avro partitioned table
### _Why are the changes needed?_

close https://github.com/apache/kyuubi/issues/5317#issue-1904751001

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes #5319 from zhaomin1423/fixhive-connector.

Closes #5317

02e5321dc [Cheng Pan] nit
cadabf4ab [Cheng Pan] nit
d38832f40 [zhaomin] improve
ee5b62d84 [zhaomin] improve
794473468 [zhaomin] improve
e3eca91fb [zhaomin] add tests
d9302e2ba [zhaomin] [KYUUBI #5317] [Bug] Hive Connector throws NotSerializableException on reading Hive Avro partitioned table
0bc8ec16f [zhaomin] [KYUUBI #5317] [Bug] Hive Connector throws NotSerializableException on reading Hive Avro partitioned table

Lead-authored-by: zhaomin <zhaomin1423@163.com>
Co-authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-09-21 17:05:24 +08:00
Cheng Pan
6061a05f24
Bump 1.9.0-SNAPSHOT
2023-09-04 14:23:12 +08:00
yikaifei
0c987e96fa
[KYUUBI #5225] [KSHC] Unify the exception handling of v1 and v2 during dropDatabase
### _Why are the changes needed?_

This PR aims to unify the exception handling of v1 and v2 during dropDatabase

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes #5225 from Yikf/hive-connector.

Closes #5225

3be33af76 [yikaifei] [KSHC] Improve test

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Bowen Liang <liangbowen@gf.com.cn>
2023-09-01 12:17:33 +08:00
liangbowen
bdf867b19a
[KYUUBI #5193] Make Spark hive connector plugin compilable on Scala 2.13
### _Why are the changes needed?_

- to make Spark SQL hive connector plugin compilable on Scala 2.13 with Spark 3.3/3.4
- rename the class `FilePartitionReader`, which is copied from Spark, to `SparkFilePartitionReader` to fix the type mismatch error
```
[ERROR] [Error] /Users/bw/dev/incubator-kyuubi/extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/read/HivePartitionReaderFactory.scala:83: type mismatch;
 found   : Iterator[org.apache.kyuubi.spark.connector.hive.read.HivePartitionedFileReader[org.apache.spark.sql.catalyst.InternalRow]]
 required: Iterator[org.apache.spark.sql.execution.datasources.v2.PartitionedFileReader[org.apache.spark.sql.catalyst.InternalRow]]

```

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No.

Closes #5193 from bowenliang123/scala213-hivecon.

Closes #5193

d8c6bf5f0 [liangbowen] defer toMap
b20ad4eb1 [liangbowen] adapt spark hive connector plugin to Scala 2.13

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: yikaifei <yikaifei@apache.org>
2023-08-23 13:58:17 +08:00
liangbowen
4213e20945
[KYUUBI #5177] Use Scala binary version placeholder in Maven module's artifactId suffix
### _Why are the changes needed?_

- Change the hardcoded Scala version 2.12 in each Maven module's `artifactId` to the placeholder `scala.binary.version`, which is defined as 2.12 in the project's parent POM
- Prepare for Scala 2.13/3.x support in the future
- No impact on using or building the Maven modules
- Maven emits some ignorable warnings about the non-constant `artifactId`:
```
Warning:  Some problems were encountered while building the effective model for org.apache.kyuubi:kyuubi-server_2.12:jar:1.8.0-SNAPSHOT
Warning:  'artifactId' contains an expression but should be a constant
```
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No.

Closes #5175 from bowenliang123/artifactId-scala.

Closes #5177

2eba29cfa [liangbowen] use placeholder of scala binary version for artifactId

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-08-20 16:03:23 +00:00
liangbowen
6ec326adb4
[KYUUBI #5039] [Improvement] Use semantic versions and remove redundant version comparison methods
### _Why are the changes needed?_

- Support initializing or comparing versions with the major version only, e.g. "3" is equivalent to "3.0"
- Remove redundant version comparison methods by using semantic versions of Spark, Flink and Kyuubi
- Add a common `toDouble` method
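
The behavior in the first and last bullets can be sketched as follows. This is an illustrative sketch only, not Kyuubi's implementation: the class name `SemanticVersionSketch`, its regular expression, and its method bodies are assumptions. It parses a version string whose minor part may be missing, so that "3" compares equal to "3.0", and exposes a `toDouble` helper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SemanticVersionSketch implements Comparable<SemanticVersionSketch> {
    // Major is required; minor is optional; anything after it (patch, suffix) is ignored.
    private static final Pattern VERSION = Pattern.compile("^(\\d+)(?:\\.(\\d+))?");

    final int major;
    final int minor;

    private SemanticVersionSketch(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    static SemanticVersionSketch parse(String version) {
        Matcher m = VERSION.matcher(version);
        if (!m.lookingAt()) {
            throw new IllegalArgumentException("Not a version: " + version);
        }
        // A missing minor part defaults to 0, so "3" is treated as "3.0".
        int minor = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        return new SemanticVersionSketch(Integer.parseInt(m.group(1)), minor);
    }

    @Override
    public int compareTo(SemanticVersionSketch other) {
        int byMajor = Integer.compare(major, other.major);
        return byMajor != 0 ? byMajor : Integer.compare(minor, other.minor);
    }

    // Convenience for coarse checks; "3.1" and "3.10" both become 3.1,
    // so ordering should rely on compareTo instead.
    double toDouble() {
        return Double.parseDouble(major + "." + minor);
    }

    public static void main(String[] args) {
        System.out.println(parse("3").compareTo(parse("3.0")));       // 0
        System.out.println(parse("3.3").compareTo(parse("3.4")) < 0); // true
        System.out.println(parse("3.4.1-SNAPSHOT").toDouble());       // 3.4
    }
}
```

Comparing parsed components rather than raw strings avoids lexicographic surprises such as "3.10" sorting before "3.9".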

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #5039 from bowenliang123/improve-semanticversion.

Closes #5039

b6868264f [liangbowen] nit
d39646b7d [liangbowen] SPARK_ENGINE_RUNTIME_VERSION
9148caad0 [liangbowen] use semantic versions
ecc3b4af6 [mans2singh] [KYUUBI #5086] [KYUUBI # 5085] Update config section of deploy on kubernetes

Lead-authored-by: liangbowen <liangbowen@gf.com.cn>
Co-authored-by: mans2singh <mans2singh@yahoo.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-07-25 18:04:45 +08:00
yikaifei
5915d682b5
[KYUUBI #5022] [KSHC] CreateTable should use the correct provider
### _Why are the changes needed?_

This PR fixes a bug: in KSHC, `catalog.createTable` should use the correct provider.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #5022 from Yikf/KSHC-createTable.

Closes #5022

cd8cb1cf2 [yikaifei] CreateTable should use the correct provider

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: yikaifei <yikaifei@apache.org>
2023-07-07 12:04:55 +08:00
yikaifei
46f8e0ca94
[KYUUBI #5017] [KSHC] Support splittable Parquet/ORC providers
### _Why are the changes needed?_

This PR aims to make the Parquet/ORC providers splittable.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #5017 from Yikf/KSHC-support-split.

Closes #5017

9dc3d3d56 [yikaifei] Support Parquet/Orc provider is splitable

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: yikaifei <yikaifei@apache.org>
2023-07-06 19:21:05 +08:00
yikaifei
da82217388
[KYUUBI #5023] [KSHC] Table identifier should not attach the catalog
### _Why are the changes needed?_

As the title says, in KSHC the `HiveTable` identifier does not attach the catalog, which prevents an incorrect catalog name. The default catalog is "spark_catalog".

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #5023 from Yikf/tableName2.

Closes #5023

86b6a58d0 [yikaifei] KSHC v1IdentifierNoCatalog in spark3.4

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-07-06 18:26:37 +08:00
zhaomin
7feb535668
[KYUUBI #5028] Update session hadoop conf to catalog hadoop conf
### _Why are the changes needed?_

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #5028 from zhaomin1423/fix_hive_connector.

Closes #5028

d9c7e9c8a [zhaomin] Update session hadoop conf to catalog hadoop conf

Authored-by: zhaomin <zhaomin1423@163.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-07-06 18:25:12 +08:00
Cheng Pan
1d5ac07dfc
[KYUUBI #4999] [KSHC] Kyuubi-Spark-Hive-Connector supports Apache Spark 3.4
### _Why are the changes needed?_

This PR aims to make KSHC support Apache Spark 3.4.

- KSHC supports Apache Spark 3.4
- Make the Apache Kyuubi `codecov` module contain the spark-3.4 profile, so that Apache Kyuubi CI can cover some modules.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #4999 from Yikf/kudu-spark3.4.

Closes #4999

6a35e54b8 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
66bb742eb [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
7be517c7f [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
ae23133d1 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
dda5e6521 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
e43a25dff [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
54f52f16d [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
0955b544b [Cheng Pan] Update pom.xml
38a1383d9 [yikaifei] codecov module should contain the spark 3.4 profile

Lead-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-07-04 17:25:57 +08:00
zhaomin
80bc028e6d
[KYUUBI #4995] Use hadoop conf and hive conf from catalog options
### _Why are the changes needed?_

The Spark job classpath contains `hdfs-site.xml`, `hive-site.xml`, etc., but we should use the Hadoop conf and Hive conf from the catalog options.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

Closes #4995 from zhaomin1423/fix_hive_connector.

Closes #4995

64429fdcb [Xiao Zhao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveTableCatalog.scala
d921be750 [zhaomin] fix
375934d65 [zhaomin] Using hadoop conf and hive conf from catalog options

Lead-authored-by: zhaomin <zhaomin1423@163.com>
Co-authored-by: Xiao Zhao <zhaomin1423@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-26 15:04:39 +08:00
liangbowen
eeee5c1ae3 [KYUUBI #4959] [MINOR] Code improvements for Scala
### _Why are the changes needed?_

- Improve the Scala code with corrections, simplifications, Scala style fixes, and redundancy cleanup. No feature changes are introduced.

Corrections:
- Class doesn't correspond to file name (SparkListenerExtensionTest)
- Correct package name in ResultSetUtil and PySparkTests

Improvements:
- 'var' could be a 'val'
- GetOrElse(null) to orNull

Cleanup & Simplification:
- Redundant cast inspection
- Redundant collection conversion
- Simplify boolean expression
- Redundant new on case class
- Redundant return
- Unnecessary parentheses
- Unnecessary partial function
- Simplifiable empty check
- Anonymous function convertible to a method value

Scala Style:
- Constructing range for seq indices
- Get and getOrElse to getOrElse
- Convert expression to Single Abstract Method (SAM)
- Scala unnecessary semicolon inspection
- Map and getOrElse(false) to exists
- Map and flatten to flatMap
- Null initializer can be replaced by _
- scaladoc link to method

Other Improvements:
- Replace map and getOrElse(true) with forall
- Unit return type in the argument of map
- Size to length on arrays and strings
- Type check can be pattern matching
- Java mutator method accessed as parameterless
- Procedure syntax in method definition
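
A few of the rewrites listed above, shown as before/after pairs. This is an illustrative sketch rather than code from the patch; `Idioms` and every value in it are hypothetical names chosen for the example.

```scala
// Illustrative before/after pairs for some of the inspections listed above.
// All names are hypothetical; none come from the Kyuubi code base.
object Idioms {
  val user: Option[String] = Some("kyuubi")
  val flag: Option[Boolean] = None

  // getOrElse(null) -> orNull
  val before1: String = user.getOrElse(null)
  val after1: String = user.orNull

  // map + getOrElse(false) -> exists
  val before2: Boolean = user.map(_.startsWith("ky")).getOrElse(false)
  val after2: Boolean = user.exists(_.startsWith("ky"))

  // map + getOrElse(true) -> forall
  val before3: Boolean = flag.map(identity).getOrElse(true)
  val after3: Boolean = flag.forall(identity)

  // map + flatten -> flatMap
  val before4: Seq[Int] = Seq(1, 2, 3).map(i => Seq(i, i * 10)).flatten
  val after4: Seq[Int] = Seq(1, 2, 3).flatMap(i => Seq(i, i * 10))

  // isInstanceOf/asInstanceOf chains -> pattern matching
  def describe(x: Any): String = x match {
    case s: String => s"string of length ${s.length}"
    case n: Int => s"int $n"
    case _ => "other"
  }
}
```

Each pair evaluates to the same result; the second form states the intent directly and avoids an intermediate `Option` or collection.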

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4959 from bowenliang123/scala-Improve.

Closes #4959

2d36ff351 [liangbowen] code improvement for Scala

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-16 21:20:17 +08:00
Cheng Pan
01d80eb272
[KYUUBI #4870] Add kyuubi-util and kyuubi-util-scala modules
### _Why are the changes needed?_

Close #4870

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4872 from pan3793/util.

Closes #4870

0b9fe3cba [Cheng Pan] nit
ecc5ee4f2 [Cheng Pan] fix
63be7a20c [Cheng Pan] test
85363c187 [Cheng Pan] style
2227247dd [Cheng Pan] fix package
11d10a081 [Cheng Pan] Add kyuubi-util and kyuubi-util-scala modules

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-05-22 22:13:56 +08:00
Cheng Pan
b2fe49343e
[KYUUBI #4620] [KSHC] Cut off transitive dependencies
### _Why are the changes needed?_

Remove all transitive dependencies so that downstream projects can consume the connector easily.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4620 from pan3793/kshc.

Closes #4620

407f669f5 [Cheng Pan] [KSHC] Cut off transitive dependenices

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-03-27 18:35:30 +08:00
Cheng Pan
6f803c0015
[KYUUBI #4560] [KSHC] Support Kerberized HMS in cluster mode w/o keytab
### _Why are the changes needed?_

This PR aims to make Kyuubi Spark Hive Connector (KSHC) support Kerberized HMS in cluster mode w/o keytab (which is the typical use case in Kyuubi) by implementing a `HadoopDelegationTokenProvider`.

To enable access to a Kerberized HMS using KSHC, the minimal configurations are:
```
spark.sql.catalog.warm=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
spark.sql.catalog.warm.hive.metastore.uris=<thrift-uris>
```
With these set, federated queries can run across metastores:
```
SELECT * FROM spark_catalog.db1.tbl1 JOIN warm.db2.tbl2 ON ...
```

In addition, token renewal can be disabled explicitly for each catalog:
```
spark.sql.catalog.warm.delegation.token.renewal.enabled=false
```

The current implementation has some limitations:

The catalog configuration must be present when the Spark application bootstraps, which means the catalog configurations should be set in `spark-defaults.conf` or appended as `--conf` options, like:
```
spark-[sql|shell|submit] \
  --conf spark.sql.catalog.xxx=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog \
  --conf spark.sql.catalog.xxx.hive.abc=xyz
```

Dynamic registration through a SET statement, e.g. `SET spark.sql.catalog.xxx=`, does not work.
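
To make the configuration rules above concrete, here is a minimal sketch of how per-catalog options could be read from the bootstrap configuration. It is not the connector's actual token provider: `CatalogTokenSketch` and its methods are hypothetical, the configuration is modelled as a plain `Map`, and the default of "renewal enabled unless set to `false`" is an assumption. The fallback from `hive.metastore.token.signature` to `hive.metastore.uris` follows the commit list of this PR.

```scala
// Hypothetical helper; models the Spark conf as a plain Map for illustration.
object CatalogTokenSketch {
  val HiveCatalogClass = "org.apache.kyuubi.spark.connector.hive.HiveTableCatalog"
  private val Prefix = "spark.sql.catalog."

  // Names of catalogs whose implementation class is HiveTableCatalog.
  def hiveCatalogs(conf: Map[String, String]): Set[String] =
    conf.collect {
      case (k, v)
          if k.startsWith(Prefix) && v == HiveCatalogClass &&
            !k.stripPrefix(Prefix).contains(".") =>
        k.stripPrefix(Prefix)
    }.toSet

  // Options of one catalog with the "spark.sql.catalog.<name>." prefix removed.
  def options(conf: Map[String, String], name: String): Map[String, String] = {
    val p = s"$Prefix$name."
    conf.collect { case (k, v) if k.startsWith(p) => k.stripPrefix(p) -> v }
  }

  // Assumed default: renewal stays on unless the option is explicitly "false".
  def renewalEnabled(opts: Map[String, String]): Boolean =
    opts.get("delegation.token.renewal.enabled").forall(_.toBoolean)

  // Token signature falls back to the metastore URIs when not set.
  def tokenSignature(opts: Map[String, String]): Option[String] =
    opts.get("hive.metastore.token.signature").orElse(opts.get("hive.metastore.uris"))
}
```

Because such a lookup only sees the configuration available at startup, a catalog registered later through `SET` would not be found, which is consistent with the limitation described above.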

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate

```
> (select count(*) from hive_2.mammut.test_7) union ( select count(*) from spark_catalog.test.test01 limit 1);
+-----------+
| count(1)  |
+-----------+
| 4         |
| 1         |
+-----------+
2 rows selected (8.378 seconds)
```

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4560 from pan3793/shc-token.

Closes #4560

fe8cd0c6d [Cheng Pan] Centralized metastore token signature fallback logic
851159559 [Cheng Pan] comments
fc3b4d596 [Cheng Pan] hive.metastore.token.signature fallback to hive.metastore.uris
fb7eb033f [Cheng Pan] unused import
858b39024 [Cheng Pan] New catalog property delegation.token.renewal.enabled
28ec5a543 [Cheng Pan] disable hms client retry
52044d474 [Cheng Pan] update comments
33b241831 [Cheng Pan] [KSHC] Support Kerberos by implementing KyuubiHiveConnectorDelegationTokenProvider

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-03-24 11:34:08 +08:00
Yikf
41e9505722
[KYUUBI #4525][KSHC] Partitioning predicates should take effect to filter data
### _Why are the changes needed?_

This PR aims to close https://github.com/apache/kyuubi/issues/4525.

The root cause of this problem is that Apache Spark does predicate push-down in `V2ScanRelationPushDown`, but the spark-hive-connector does not apply push-down predicates for data filtering.
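
The idea can be illustrated with a toy model: once partition predicates have been pushed down, the scan has to use them when listing partitions, otherwise every partition is read. This is a simplified sketch under stated assumptions, not the connector's code; `PartitionPruningSketch`, `Partition`, and `EqualTo` are hypothetical types, and only equality predicates on partition columns are modelled.

```scala
// Toy model of applying pushed-down partition predicates; all names are hypothetical.
object PartitionPruningSketch {
  type PartitionSpec = Map[String, String]

  final case class Partition(spec: PartitionSpec, files: Seq[String])

  // Only equality on a partition column is modelled here.
  final case class EqualTo(column: String, value: String)

  // Keep a partition only if it satisfies every pushed-down predicate.
  def prune(partitions: Seq[Partition], filters: Seq[EqualTo]): Seq[Partition] =
    partitions.filter { p =>
      filters.forall(f => p.spec.get(f.column).contains(f.value))
    }
}
```

With no filters the full partition list is returned, which corresponds to the behaviour before the fix, where pushed-down predicates had no effect on the data being read.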

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4528 from Yikf/KYUUBI-4525.

Closes #4525

a65a1873f [Yikf] Partitioning predicates should take effect to filter data

Authored-by: Yikf <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-03-16 10:12:44 +08:00
Cheng Pan
dd9b58ae81
[KYUUBI #4488] [KSHC] Keep object original name defined in HiveBridgeHelper
### _Why are the changes needed?_

Respect Java/Scala coding conventions in KSHC (Kyuubi Spark Hive Connector).

When invoking a singleton (`object` in Scala), use `AbcUtils.method(...)` instead of `abcUtils.method(...)`.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4488 from pan3793/shc-rename.

Closes #4488

ec9a80198 [Cheng Pan] nit
84d3bb413 [Cheng Pan] Keep object orignal name defined in HiveBridgeHelper

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-03-09 19:02:08 +08:00
Yikf
19b4b0a3fd
[KYUUBI #4432] jobId across tasks should be consistent to meet the contract expected by Hadoop committers
### _Why are the changes needed?_

jobId across tasks should be consistent to meet the contract expected by Hadoop committers

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4432 from Yikf/jobid.

Closes #4432

4e7401c91 [Yikf] jobId across tasks should be consistent

Authored-by: Yikf <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-03-01 16:16:55 +08:00
Yikf
3b73e1d64a
[KYUUBI #4391] Improve code for hive-connector FileWriterFactory
### _Why are the changes needed?_

This PR improves the code of the hive-connector `FileWriterFactory`; the main goal is to reduce duplicated copies of Spark code.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4391 from Yikf/improve-code.

Closes #4391

7991f145 [Yikf] improve code for hive-connector FileWriterFactory

Authored-by: Yikf <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-02-21 17:24:53 +08:00
Yikf
4feb83d0f3
[KYUUBI #4359] Workaround for SPARK-41448 to keep FileWriterFactory serializable
### _Why are the changes needed?_

[SPARK-41448](https://issues.apache.org/jira/browse/SPARK-41448) makes MR job IDs consistent between FileBatchWriter and FileFormatWriter in Apache Spark 3.3.2, but it introduces a serialization issue because `JobID` is not serializable.

This PR rewrites `FileWriterFactory` to work around the problem.
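
The general pattern is to ship only serializable values in the factory and rebuild the non-serializable object lazily on the receiving side. The sketch below illustrates that pattern with plain Java serialization; it is not the patched `FileWriterFactory`. `FakeJobId` is a hypothetical stand-in for Hadoop's `JobID`, and `WriterFactory` and `roundTrip` are illustrative names.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object SerializableFactorySketch {
  // Hypothetical stand-in for Hadoop's JobID; deliberately NOT Serializable.
  final class FakeJobId(val trackerId: String, val id: Int) {
    override def toString: String = f"job_${trackerId}_$id%04d"
  }

  // The factory carries only a String and an Int, both serializable.
  // The job ID is marked transient and rebuilt on first use after deserialization,
  // so every task derives the same ID from the same tracker ID.
  final case class WriterFactory(jobTrackerId: String, jobNumber: Int) {
    @transient lazy val jobId: FakeJobId = new FakeJobId(jobTrackerId, jobNumber)
  }

  // Serialize and deserialize a value, as happens when a factory is sent to executors.
  def roundTrip[T <: java.io.Serializable](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T]
    finally in.close()
  }
}
```

Had `WriterFactory` held a `FakeJobId` field directly, `writeObject` would fail with `NotSerializableException`; keeping the tracker ID as a `String` avoids that while still yielding one consistent job ID.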

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4359 from Yikf/FileWriterFactory.

Closes #4359

dd8c90fe [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/write/FileWriterFactory.scala
1e5164ec [Yikf] Make a serializable jobTrackerId instead of a non-serializable JobID in FileWriterFactory

Lead-authored-by: Yikf <yikaifei@apache.org>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-02-18 22:12:53 +08:00
Cheng Pan
4e226ac3cc
Bump 1.8.0-SNAPSHOT 2023-02-10 15:25:49 +08:00
jiaoqingbo
c1e2e57dd9 [KYUUBI #4222] Use hiveTableCatalog to updateTableStats instead of sessionCatalog
fix #4222

- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4237 from jiaoqingbo/kyuubi4222.

Closes #4237

7677d69f [jiaoqingbo] code review
538d436a [jiaoqingbo] [Kyuubi #4222] Use hiveTableCatalog to updateTableStats instead of sessionCatalog

Authored-by: jiaoqingbo <1178404354@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-02-03 05:43:36 +00:00
liangbowen
faecd8f23d
[KYUUBI #4127] Align ScalaTest Plus plugin versions and bump ScalaTest from 3.2.9 to 3.2.15
### _Why are the changes needed?_

- bump `ScalaTest` version from `3.2.9` to `3.2.15`, which uses the same Scala version `2.12.17` as Kyuubi. (Release notes: https://github.com/scalatest/scalatest/releases/tag/release-3.2.15)
- bump `scalatest-maven-plugin` from `2.0.2` to `2.2.0` (https://github.com/scalatest/scalatest-maven-plugin/releases/tag/release-2.2.0)
- align `scalatestplus` versions with the version above, removing the misleading `scalacheck.version` property (ScalaTest + ScalaCheck Version: https://www.scalatest.org/plus/scalacheck/versions)
- bump scalatestplus plugins to `3.2.15.0`, along with their dependencies
    - scalatestplus-scalacheck (https://github.com/scalatest/scalatestplus-scalacheck/releases/tag/release-3.2.15.0-for-scalacheck-1.17)
    - scalatestplus-mockito (https://github.com/scalatest/scalatestplus-mockito/releases/tag/release-3.2.15.0-for-mockito-4.6)
    - mockito from `3.4` to `4.6` (https://github.com/mockito/mockito/releases/tag/v4.6.0)
    - scalacheck from `1.15` to `1.17` (https://github.com/typelevel/scalacheck/releases/tag/v1.17.0)

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4127 from bowenliang123/scalatest-3.2.15.

Closes #4127

ac661a55 [liangbowen] bump scalatest and plugin versions

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-01-11 16:08:12 +08:00
Cheng Pan
40ef3d624c
[KYUUBI #3864] Add missing log4j2-test.xml for Kyuubi Spark Hive Connector
### _Why are the changes needed?_

Avoid excessive console logging in CI.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3864 from pan3793/nit.

Closes #3864

39687176 [Cheng Pan] Add missing log4j2-test.xml for Kyuubi Spark Hive Connector

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2022-11-28 19:53:30 +08:00
liangbowen
2ac10f91d5
[KYUUBI #3842] [Improvement] Support maven pom.xml code style check with spotless plugin
### _Why are the changes needed?_

Introduce code style check support for Maven's pom.xml with sortPom in spotless maven plugin.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3843 from bowenliang123/spotless-pom.

Closes #3842

3c654597 [liangbowen] apply to pom.xml
fd1536f7 [liangbowen] set expandEmptyElements to true
e498423f [liangbowen] apply spotless:apply to all pom.xml
e46bcfec [liangbowen] add pom style check support in spotless

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2022-11-23 22:08:00 +08:00
yikf
bbf916d1de
[KYUUBI #3529] Supple DDL tests for Spark Hive connector and fix consistent issues w/ V1 implementation
### _Why are the changes needed?_

Fix https://github.com/apache/incubator-kyuubi/issues/3529

The intent of this PR is the following:
- Add tests related to catalog, including the listTables, loadTable, and listNamespaces methods;
- Initialize the DDL test framework.
- Add CreateNamespaceSuite, DropNamespaceSuite and ShowTablesSuite to check for consistency with V1 in hive connector.
- Fix cascading namespace deletion: when a namespace is dropped with cascade, ignore the exception raised because tables still exist in the namespace.
- Fix the table name of `HiveTable`, which should include the namespace name.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3530 from Yikf/hivev2-test.

Closes #3529

d0af0760 [yikf] Add tests to check for consistency with V1

Authored-by: yikf <yikaifei1@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2022-10-15 22:45:22 +08:00
yikf
c3c7707203
[KYUUBI #3464] Support for pooling external catalog
### _Why are the changes needed?_

Fix https://github.com/apache/incubator-kyuubi/issues/3464. Currently, Kyuubi provides a Hive connector for reading and writing Hive tables, implemented on top of [Apache Spark DataSource V2](https://www.databricks.com/session/apache-spark-data-source-v2), but there is a potential issue.

Kyuubi uses `kyuubi.engine.single.spark.session`=[false](https://kyuubi.apache.org/docs/latest/deployment/settings.html#:~:text=kyuubi.engine.single.spark.session) to provide concurrent SQL execution with context isolation, which causes `spark.newSession` to be invoked for each transaction. In the Spark V1 catalog, `externalCatalog` is shared across multiple sessions. The V2 catalog architecture is very different: V2 catalogs are managed by `CatalogManager`, which is [session level](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala#L97), so each session has a separate `CatalogManager` and, with multiple sessions, the Hive catalog is initialized multiple times. This causes two problems: (1) multiple sessions may each initialize their own `HiveExternalCatalog`, which is wasteful and may cause the JVM namespace to swell; (2) multiple `HiveClient` connections may be initialized.

This issue aims to pool externalCatalog to address the potential issues mentioned above.
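
The pooling idea can be sketched as a process-wide cache that hands the same catalog instance to every session asking for the same metastore. This is a minimal illustration, not the implementation in this PR: `CatalogPoolSketch` and `ExternalCatalog` are hypothetical stand-ins, and keying the pool by the metastore URIs is an assumption made for the example.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

object CatalogPoolSketch {
  // Hypothetical stand-in for an external catalog that is expensive to create.
  final class ExternalCatalog(val metastoreUris: String)

  // Counts how many catalogs were actually constructed.
  val created = new AtomicInteger(0)

  // Process-wide pool shared by all sessions; the key choice is an assumption.
  private val pool = new ConcurrentHashMap[String, ExternalCatalog]()

  // Return the pooled catalog for these URIs, creating it at most once.
  def getOrCreate(metastoreUris: String): ExternalCatalog =
    pool.computeIfAbsent(
      metastoreUris,
      new java.util.function.Function[String, ExternalCatalog] {
        override def apply(uris: String): ExternalCatalog = {
          created.incrementAndGet()
          new ExternalCatalog(uris)
        }
      })
}
```

`computeIfAbsent` runs the creation function at most once per key even under concurrent calls, so sessions created through `spark.newSession` would share one catalog and one client connection per metastore.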

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3465 from Yikf/catalog-pool.

Closes #3464

5e8a94dd [yikf] Support for pooling external catalog

Authored-by: yikf <yikaifei1@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
2022-09-15 13:18:28 +08:00
yikf
3808dbdea5 [KYUUBI #3437] Refactory class location of the hive connector
### _Why are the changes needed?_

Fix https://github.com/apache/incubator-kyuubi/issues/3437

This PR refactors the class locations of the Hive connector.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3438 from Yikf/hive-connector-rename.

Closes #3437

a41dd15b [yikf] Refactory class location of the hive connector

Authored-by: yikf <yikaifei1@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2022-09-07 11:55:23 +00:00
yikf
996fbc6905
[KYUUBI #3366] Support hive write code path
### _Why are the changes needed?_

Fix https://github.com/apache/incubator-kyuubi/issues/3366

This PR is a subtask of Kyuubi's support for Hive data sources and adds support for the write code path.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3367 from Yikf/support-hive-write.

Closes #3366

85197a65 [yikf] Support hive write code path

Authored-by: yikf <yikaifei1@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
2022-09-07 10:02:38 +08:00
yikf
3adcebd557
[KYUUBI #3378][SUBTASK] Improve hive-connector module tests
### _Why are the changes needed?_

Fix https://github.com/apache/incubator-kyuubi/issues/3378

This PR improves the hive-connector module tests and makes CI run the tests for the Hive connector.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3379 from Yikf/hive-connector-test.

Closes #3378

72ad050c [yikf] CI test for hive-connector

Authored-by: yikf <yikaifei1@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2022-09-03 04:44:01 +08:00
yikf
d822b3eba3
[KYUUBI #3259] Initial implementation of the Hive Connector based on the Spark datasource V2
### _Why are the changes needed?_

In a modern database architecture, users may have a strong need for federated queries. Since a large number of historical Hive warehouses exist, we implemented a Hive V2 data source based on Spark DataSource V2 to meet this need. For the discussion, see https://lists.apache.org/thread/fq8ywr58rzf9bycflj1q4fl1xyz2rq2w

This PR is the first step in fixing https://github.com/apache/incubator-kyuubi/issues/3259 and includes:
- the initial implementation
- support for the read code path

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3260 from yikf/hive-v2-connector.

Closes #3259

753aca30 [yikf] Initial implementation of the Hive Connector based on the Spark datasource V2

Authored-by: yikf <yikaifei1@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2022-08-23 13:48:21 +08:00