[KYUUBI #398] Move to hadoop shaded client



### _Why are the changes needed?_
close #398

Switch from the unshaded `hadoop-client` / `hadoop-common` dependencies to Hadoop's shaded client artifacts, `hadoop-client-api` and `hadoop-client-runtime`, and rename the `hadoop-2.7` / `hadoop-3.2` Maven profiles to `spark-hadoop-2.7` / `spark-hadoop-3.2`, since they now only select which Spark binary distribution is bundled.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/latest/tools/testing.html#running-tests) locally before making a pull request

Manually tested the release workflow:
```
➜  kyuubi git:(KYUUBI-398) ll | grep tar.gz
-rw-r--r--   1 chengpan  staff   265M Mar  4 21:38 kyuubi-1.1.0-SNAPSHOT-bin-spark-3.0-hadoop2.7.tar.gz
-rw-r--r--   1 chengpan  staff   269M Mar  4 21:40 kyuubi-1.1.0-SNAPSHOT-bin-spark-3.0-hadoop3.2.tar.gz
-rw-r--r--   1 chengpan  staff   269M Mar  4 21:46 kyuubi-1.1.0-SNAPSHOT-bin-spark-3.1-hadoop2.7.tar.gz
-rw-r--r--   1 chengpan  staff   273M Mar  4 21:44 kyuubi-1.1.0-SNAPSHOT-bin-spark-3.1-hadoop3.2.tar.gz
```
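
For reference, the four tarballs line up with the renamed profile pairs. A sketch of the distribution commands that would produce them (the `./build/dist` path and `--tgz` flag are assumptions based on the repository layout; only `--spark-provided` and the profile names come from this patch):

```shell
# Build the four Spark-bundled binary packages with the renamed profiles (sketch).
./build/dist --tgz -Pspark-3.0 -Pspark-hadoop-2.7
./build/dist --tgz -Pspark-3.0 -Pspark-hadoop-3.2
./build/dist --tgz -Pspark-3.1 -Pspark-hadoop-2.7
./build/dist --tgz -Pspark-3.1 -Pspark-hadoop-3.2
# Spark-free ("without-spark") packages add --spark-provided, as in the release workflow matrix below.
```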

Closes #399 from pan3793/KYUUBI-398.

a9294a1 [Cheng Pan] fix ut
d1d816d [Cheng Pan] fix dist script
2e3bc20 [Cheng Pan] update release workflow and dist script
0428b1b [Cheng Pan] update travis.yml
4a9bc1b [Cheng Pan] [KYUUBI #398] move to hadoop shaded client

Authored-by: Cheng Pan <379377944@qq.com>
Signed-off-by: Kent Yao <yao@apache.org>
Commit 4abc3ff6a0 (parent 8dd54a29b3)
Authored by Cheng Pan on 2021-03-04 23:33:31 +08:00, committed by Kent Yao
10 changed files with 101 additions and 41 deletions


```diff
@@ -17,10 +17,10 @@ jobs:
     strategy:
       matrix:
         profiles:
-          - '-Pspark-3.0 -Phadoop-2.7'
-          - '-Pspark-3.0 -Phadoop-2.7 -Dspark.archive.mirror=https://archive.apache.org/dist/spark/spark-3.1.1 -Dspark.archive.name=spark-3.1.1-bin-hadoop2.7.tgz -Dmaven.plugin.scalatest.exclude.tags=org.apache.kyuubi.tags.DataLakeTest'
-          - '-Pspark-3.1 -Phadoop-2.7'
-          - '-Pspark-3.1 -Phadoop-3.2'
+          - '-Pspark-3.0 -Pspark-hadoop-2.7'
+          - '-Pspark-3.0 -Pspark-hadoop-2.7 -Dspark.archive.mirror=https://archive.apache.org/dist/spark/spark-3.1.1 -Dspark.archive.name=spark-3.1.1-bin-hadoop2.7.tgz -Dmaven.plugin.scalatest.exclude.tags=org.apache.kyuubi.tags.DataLakeTest'
+          - '-Pspark-3.1 -Pspark-hadoop-2.7'
+          - '-Pspark-3.1 -Pspark-hadoop-3.2'
     steps:
       - uses: actions/checkout@v2
       - name: Setup JDK 1.8
```


```diff
@@ -12,12 +12,12 @@ jobs:
     strategy:
       matrix:
         profiles:
-          - '-Pspark-3.0 -Phadoop-2.7'
-          - '--spark-provided -Pspark-3.0 -Phadoop-2.7'
-          - '-Pspark-3.1 -Phadoop-2.7'
-          - '--spark-provided -Pspark-3.1 -Phadoop-2.7'
-          - '-Pspark-3.1 -Phadoop-3.2'
-          - '--spark-provided -Pspark-3.1 -Phadoop-3.2'
+          - '-Pspark-3.0 -Pspark-hadoop-2.7'
+          - '--spark-provided -Pspark-3.0 -Pspark-hadoop-2.7'
+          - '-Pspark-3.1 -Pspark-hadoop-2.7'
+          - '--spark-provided -Pspark-3.1 -Pspark-hadoop-2.7'
+          - '-Pspark-3.1 -Pspark-hadoop-3.2'
+          - '--spark-provided -Pspark-3.1 -Pspark-hadoop-3.2'
     steps:
       - uses: actions/checkout@master
       # We split caches because GitHub Action Cache has a 400MB-size limit.
```


```diff
@@ -28,21 +28,21 @@ cache:
 matrix:
   include:
-    - name: Test Kyuubi w/ -Pspark-3.0 -Phadoop-2.7
+    - name: Test Kyuubi w/ -Pspark-3.0 -Pspark-hadoop-2.7
       env:
-        - PROFILE="-Pspark-3.0 -Phadoop-2.7"
+        - PROFILE="-Pspark-3.0 -Pspark-hadoop-2.7"
         - EXCLUDE_TAGS=""
-    - name: Test Kyuubi w/ -Pspark-3.1 -Phadoop-2.7
+    - name: Test Kyuubi w/ -Pspark-3.1 -Pspark-hadoop-2.7
       env:
-        - PROFILE="-Pspark-3.1 -Phadoop-2.7"
+        - PROFILE="-Pspark-3.1 -Pspark-hadoop-2.7"
         - EXCLUDE_TAGS="org.apache.kyuubi.tags.DataLakeTest"
-    - name: Test Kyuubi w/ -Pspark-3.1 -Phadoop-3.2
+    - name: Test Kyuubi w/ -Pspark-3.1 -Pspark-hadoop-3.2
       env:
-        - PROFILE="-Pspark-3.1 -Phadoop-3.2"
+        - PROFILE="-Pspark-3.1 -Pspark-hadoop-3.2"
         - EXCLUDE_TAGS="org.apache.kyuubi.tags.DataLakeTest"
-    - name: Test Kyuubi w/ -Pspark-3.0 -Phadoop-2.7 by Spark 3.1 distribution
+    - name: Test Kyuubi w/ -Pspark-3.0 -Pspark-hadoop-2.7 by Spark 3.1 distribution
       env:
-        - PROFILE="-Pspark-3.0 -Phadoop-2.7 -Dspark.archive.mirror=https://archive.apache.org/dist/spark/spark-3.1.1 -Dspark.archive.name=spark-3.1.1-bin-hadoop2.7.tgz"
+        - PROFILE="-Pspark-3.0 -Pspark-hadoop-2.7 -Dspark.archive.mirror=https://archive.apache.org/dist/spark/spark-3.1.1 -Dspark.archive.name=spark-3.1.1-bin-hadoop2.7.tgz"
         - EXCLUDE_TAGS="org.apache.kyuubi.tags.DataLakeTest"
 install:
```

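Locally, each matrix entry above maps to a Maven run along these lines. This is a sketch (the `build/mvn` wrapper and goals are assumed); the profile and exclude-tag flags are taken from `PROFILE` and `EXCLUDE_TAGS`:

```shell
# Run the suite against one renamed profile pair; the excluded tag mirrors EXCLUDE_TAGS.
build/mvn clean install -Pspark-3.1 -Pspark-hadoop-3.2 \
  -Dmaven.plugin.scalatest.exclude.tags=org.apache.kyuubi.tags.DataLakeTest
```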

```diff
@@ -128,6 +128,11 @@ SPARK_VERSION=$("$MVN" help:evaluate -Dexpression=spark.version $@ 2>/dev/null\
     | grep -v "WARNING"\
     | tail -n 1)
+SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=spark.hadoop.binary.version $@ 2>/dev/null\
+    | grep -v "INFO"\
+    | grep -v "WARNING"\
+    | tail -n 1)
 HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
     | grep -v "INFO"\
     | grep -v "WARNING"\
@@ -141,17 +146,12 @@ HIVE_VERSION=$("$MVN" help:evaluate -Dexpression=hive.version $@ 2>/dev/null\
 echo "Building Kyuubi package of version $VERSION against Spark version - $SPARK_VERSION"
 if [[ "$NAME" == "none" ]]; then
-  if [[ ${HADOOP_VERSION:0:3} == "2.7" ]]; then
-    HADOOP_VERSION_SUFFIX=""
-  else
-    HADOOP_VERSION_SUFFIX="-hadoop${HADOOP_VERSION:0:3}"
-  fi
+  SPARK_HADOOP_VERSION_SUFFIX="-hadoop${SPARK_HADOOP_VERSION}"
   if [[ "$SPARK_PROVIDED" == "true" ]]; then
-    NAME="without-spark"$HADOOP_VERSION_SUFFIX
+    NAME="without-spark"$SPARK_HADOOP_VERSION_SUFFIX
   else
-    NAME="spark-"${SPARK_VERSION:0:3}$HADOOP_VERSION_SUFFIX
+    NAME="spark-"${SPARK_VERSION:0:3}$SPARK_HADOOP_VERSION_SUFFIX
   fi
 fi
@@ -192,8 +192,8 @@ cp -r "$KYUUBI_HOME/kyuubi-assembly/target/scala-$SCALA_VERSION/jars/" "$DISTDIR
 ## cp engines
 if [[ "$SPARK_PROVIDED" != "true" ]]; then
-  cp -r "$KYUUBI_HOME/externals/kyuubi-download/target/spark-$SPARK_VERSION-bin-hadoop${HADOOP_VERSION:0:3}$HIVE_VERSION_SUFFIX/" \
-    "$DISTDIR/externals/spark-$SPARK_VERSION-bin-hadoop${HADOOP_VERSION:0:3}$HIVE_VERSION_SUFFIX/"
+  cp -r "$KYUUBI_HOME/externals/kyuubi-download/target/spark-$SPARK_VERSION-bin-hadoop${SPARK_HADOOP_VERSION}$HIVE_VERSION_SUFFIX/" \
+    "$DISTDIR/externals/spark-$SPARK_VERSION-bin-hadoop${SPARK_HADOOP_VERSION}$HIVE_VERSION_SUFFIX/"
 fi
 cp "$KYUUBI_HOME/externals/kyuubi-spark-sql-engine/target/kyuubi-spark-sql-engine-$VERSION.jar" "$DISTDIR/externals/engines/spark"
```

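The renamed profiles now only steer `spark.hadoop.binary.version`, which the dist script resolves with `help:evaluate` as shown above. A quick sanity check of what each profile evaluates to, reusing the script's own grep pipeline (the `build/mvn` wrapper is an assumption):

```shell
# Should print 2.7 and 3.2 respectively; the value becomes the "-hadoopX.Y" suffix of the package name.
build/mvn help:evaluate -Dexpression=spark.hadoop.binary.version -Pspark-hadoop-2.7 2>/dev/null \
  | grep -v "INFO" | grep -v "WARNING" | tail -n 1
build/mvn help:evaluate -Dexpression=spark.hadoop.binary.version -Pspark-hadoop-3.2 2>/dev/null \
  | grep -v "INFO" | grep -v "WARNING" | tail -n 1
```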

```diff
@@ -64,6 +64,12 @@
       <scope>test</scope>
     </dependency>
     <dependency>
+      <groupId>commons-collections</groupId>
+      <artifactId>commons-collections</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-hive_${scala.binary.version}</artifactId>
@@ -106,7 +112,13 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-client</artifactId>
+      <artifactId>hadoop-client-api</artifactId>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-runtime</artifactId>
+      <scope>test</scope>
+    </dependency>
```


```diff
@@ -53,7 +53,12 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-common</artifactId>
+      <artifactId>hadoop-client-api</artifactId>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-runtime</artifactId>
+    </dependency>
     <dependency>
```


```diff
@@ -56,7 +56,17 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-common</artifactId>
+      <artifactId>hadoop-client-api</artifactId>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-runtime</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>commons-codec</groupId>
+      <artifactId>commons-codec</artifactId>
+    </dependency>
     <dependency>
```


```diff
@@ -40,7 +40,13 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-common</artifactId>
+      <artifactId>hadoop-client-api</artifactId>
       <scope>provided</scope>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-runtime</artifactId>
+      <scope>provided</scope>
+    </dependency>
```


```diff
@@ -45,7 +45,12 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-common</artifactId>
+      <artifactId>hadoop-client-api</artifactId>
     </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client-runtime</artifactId>
+    </dependency>
     <dependency>
```

pom.xml

```diff
@@ -60,16 +60,18 @@
     <scala.binary.version>2.12</scala.binary.version>
     <apacheds.version>2.0.0-M15</apacheds.version>
     <commons-codec.version>1.15</commons-codec.version>
+    <commons-collections.version>3.2.2</commons-collections.version>
     <commons-lang3.version>3.10</commons-lang3.version>
     <commons.httpclient.version>4.5.6</commons.httpclient.version>
     <commons.httpcore.version>4.4.12</commons.httpcore.version>
     <guava.version>24.1.1-jre</guava.version>
     <curator.version>2.12.0</curator.version>
-    <hadoop.version>2.7.4</hadoop.version>
-    <hadoop.binary.version>2.7</hadoop.binary.version>
+    <hadoop.version>3.2.2</hadoop.version>
     <hive.version>2.3.7</hive.version>
     <spark.version>3.0.2</spark.version>
-    <spark.archive.name>spark-${spark.version}-bin-hadoop${hadoop.binary.version}.tgz</spark.archive.name>
+    <spark.hadoop.binary.version>3.2</spark.hadoop.binary.version>
+    <spark.archive.name>spark-${spark.version}-bin-hadoop${spark.hadoop.binary.version}.tgz</spark.archive.name>
     <spark.archive.mirror>https://archive.apache.org/dist/spark/spark-${spark.version}</spark.archive.mirror>
     <spark.archive.download.skip>false</spark.archive.download.skip>
@@ -301,7 +303,13 @@
       <dependency>
         <groupId>org.apache.hadoop</groupId>
-        <artifactId>hadoop-client</artifactId>
+        <artifactId>hadoop-client-api</artifactId>
         <version>${hadoop.version}</version>
       </dependency>
+      <dependency>
+        <groupId>org.apache.hadoop</groupId>
+        <artifactId>hadoop-client-runtime</artifactId>
+        <version>${hadoop.version}</version>
+      </dependency>
@@ -318,6 +326,18 @@
         </exclusions>
       </dependency>
       <dependency>
+        <groupId>commons-codec</groupId>
+        <artifactId>commons-codec</artifactId>
+        <version>${commons-codec.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>commons-collections</groupId>
+        <artifactId>commons-collections</artifactId>
+        <version>${commons-collections.version}</version>
+      </dependency>
+      <dependency>
         <groupId>org.apache.commons</groupId>
         <artifactId>commons-lang3</artifactId>
@@ -1367,14 +1387,16 @@
     </profile>
     <profile>
-      <id>hadoop-2.7</id>
+      <id>spark-hadoop-2.7</id>
+      <properties>
+        <spark.hadoop.binary.version>2.7</spark.hadoop.binary.version>
+      </properties>
     </profile>
     <profile>
-      <id>hadoop-3.2</id>
+      <id>spark-hadoop-3.2</id>
       <properties>
-        <hadoop.version>3.2.2</hadoop.version>
-        <hadoop.binary.version>3.2</hadoop.binary.version>
+        <spark.hadoop.binary.version>3.2</spark.hadoop.binary.version>
       </properties>
     </profile>
     <profile>
```
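
Because the shaded `hadoop-client-runtime` relocates Hadoop's third-party dependencies, libraries such as `commons-codec` and `commons-collections` no longer reach modules transitively, which is why they are declared explicitly above. A sketch of how to verify what ends up on the classpath after the switch (the command is illustrative, not part of the patch):

```shell
# List the Hadoop and commons artifacts each module now resolves.
build/mvn dependency:tree -Dincludes=org.apache.hadoop,commons-codec,commons-collections
```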