[KYUUBI #1460] Bump Hudi 0.10.0

### _Why are the changes needed?_

https://hudi.apache.org/releases/release-0.10.0

Switch tests from Hudi 0.9.0 & Spark 3.0 to Hudi 0.10.0 & Spark 3.1. Drop the Spark 3.0 tests because:

1. Hudi 0.10.0 claims to support Spark 3.0, but users have to build it themselves with a specific Maven profile; the published jar only supports Spark 3.1.
2. Hudi has lots of transitive dependencies, and they differ between versions, so it's hard to maintain multiple Hudi versions (see the sketch after this list).
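
To see where a problematic transitive dependency comes from, a standard Maven dependency-tree query is enough; this is roughly how the `jdk.tools` chain documented in the pom.xml comment below can be traced (the `-pl` module path is an assumption based on the Kyuubi repo layout):

```
# Show which dependency chains pull in jdk.tools (the culprit behind the
# shade-plugin failure described in the pom.xml comment below).
mvn dependency:tree -Dincludes=jdk.tools:jdk.tools -pl externals/kyuubi-spark-sql-engine
```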

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/latest/develop_tools/testing.html#running-tests) locally before making a pull request

Closes #1460 from pan3793/hudi.

Closes #1460

bac633a7 [Cheng Pan] Update comments
0ab8784d [Cheng Pan] Pin hadoop-common
c811e679 [Cheng Pan] Exclude scala-library
63cad3b9 [Cheng Pan] Exclude hudi-aws
a13708f9 [Cheng Pan] Disable Hudi test on Spark 3.0
4b179bae [Cheng Pan] Enable Hudi test in Spark 3.1
79dca54c [Cheng Pan] Bump Hudi 0.10.0
71982890 [Cheng Pan] Add default Pk col

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
Cheng Pan 2021-12-16 10:13:42 +08:00 committed by Kent Yao
parent 77a278707f
commit 022f52bc7e
2 changed files with 43 additions and 4 deletions


@@ -120,17 +120,22 @@ trait HudiMetadataTests extends HiveJDBCTestHelper with HudiSuiteMixin {
     val cols = dataTypes.zipWithIndex.map { case (dt, idx) => s"c$idx" -> dt }
     val (colNames, _) = cols.unzip
-    val reservedCols = Seq(
+    val metadataCols = Seq(
       "_hoodie_commit_time",
       "_hoodie_commit_seqno",
       "_hoodie_record_key",
       "_hoodie_partition_path",
       "_hoodie_file_name")
+    val defaultPkCol = "uuid"
+    val reservedCols = metadataCols :+ defaultPkCol
     val tableName = "hudi_get_col_operation"
     val ddl =
       s"""
          |CREATE TABLE IF NOT EXISTS $catalog.$defaultSchema.$tableName (
+         |  $defaultPkCol string,
          |  ${cols.map { case (cn, dt) => cn + " " + dt }.mkString(",\n")}
          |)
          |USING hudi""".stripMargin
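
For illustration, a minimal standalone sketch (not part of the patch; the catalog, schema, and column values are made up) of what the interpolated DDL evaluates to, showing how the default primary-key column `uuid` is prepended so the test table always has a record key for Hudi:

```scala
object DdlSketch extends App {
  // Hypothetical stand-ins for the suite's catalog/schema fixtures.
  val catalog = "spark_catalog"
  val defaultSchema = "default"
  val defaultPkCol = "uuid"
  val cols = Seq("c0" -> "boolean", "c1" -> "int") // made-up sample columns

  val ddl =
    s"""
       |CREATE TABLE IF NOT EXISTS $catalog.$defaultSchema.hudi_get_col_operation (
       |  $defaultPkCol string,
       |  ${cols.map { case (cn, dt) => cn + " " + dt }.mkString(",\n")}
       |)
       |USING hudi""".stripMargin
  println(ddl)
}
```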

pom.xml

@@ -109,7 +109,7 @@
     <hadoop.version>3.3.1</hadoop.version>
     <hadoop.binary.version>3.2</hadoop.binary.version>
     <hive.version>2.3.9</hive.version>
-    <hudi.version>0.9.0</hudi.version>
+    <hudi.version>0.10.0</hudi.version>
     <iceberg.name>iceberg-spark3-runtime</iceberg.name>
     <iceberg.version>0.12.1</iceberg.version>
     <jackson.version>2.12.5</jackson.version>
@@ -777,6 +777,27 @@
     </dependency>
     <!-- Hudi dependency -->
+    <!--
+      We don't use hadoop-common directly; it's only here to suppress the exception:
+        Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (default) on project
+        kyuubi-spark-sql-engine_2.12: Error creating shaded jar: Could not resolve following dependencies:
+        [jdk.tools:jdk.tools:jar:1.6 (system)]
+      The issue only occurs in the GitHub Actions environment with Hudi 0.10.0 and JDK 11.
+      After a few days of digging, we found only one place that introduces jdk.tools:
+        - org.apache.hudi:hudi-common:jar:0.10.0:test
+          - org.apache.hbase:hbase-server:jar:1.2.3:test
+            - org.apache.hadoop:hadoop-common:jar:2.5.1:test
+              - org.apache.hadoop:hadoop-annotations:jar:2.5.1:test
+                - jdk.tools:jdk.tools:jar:1.6:system
+    -->
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-common</artifactId>
+      <version>${hadoop.version}</version>
+    </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-avro</artifactId>
@@ -794,6 +815,10 @@
       <artifactId>hudi-spark-common_${scala.binary.version}</artifactId>
       <version>${hudi.version}</version>
       <exclusions>
+        <exclusion>
+          <groupId>org.scala-lang</groupId>
+          <artifactId>scala-library</artifactId>
+        </exclusion>
         <exclusion>
           <groupId>org.apache.hudi</groupId>
           <artifactId>hudi-timeline-service</artifactId>
@@ -826,6 +851,10 @@
           <groupId>org.apache.orc</groupId>
           <artifactId>*</artifactId>
         </exclusion>
+        <exclusion>
+          <groupId>org.apache.hudi</groupId>
+          <artifactId>hudi-aws</artifactId>
+        </exclusion>
       </exclusions>
     </dependency>
@@ -834,6 +863,10 @@
       <artifactId>hudi-spark_${scala.binary.version}</artifactId>
       <version>${hudi.version}</version>
       <exclusions>
+        <exclusion>
+          <groupId>org.scala-lang</groupId>
+          <artifactId>scala-library</artifactId>
+        </exclusion>
         <exclusion>
           <groupId>org.apache.hudi</groupId>
           <artifactId>hudi-spark-common_2.11</artifactId>
@@ -1677,7 +1710,8 @@
       <properties>
         <spark.version>3.0.3</spark.version>
         <delta.version>0.8.0</delta.version>
-        <maven.plugin.scalatest.exclude.tags>org.apache.kyuubi.tags.ExtendedSQLTest</maven.plugin.scalatest.exclude.tags>
+        <!-- Hudi 0.10.0 still supports Spark 3.0, but users need to build it themselves with a specific profile -->
+        <maven.plugin.scalatest.exclude.tags>org.apache.kyuubi.tags.ExtendedSQLTest,org.apache.kyuubi.tags.HudiTest</maven.plugin.scalatest.exclude.tags>
       </properties>
     </profile>
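
The `HudiTest` exclusion works through ScalaTest's annotation-based tagging: a suite carrying the tag annotation is skipped wholesale when the excluded-tags list (fed from `maven.plugin.scalatest.exclude.tags`) matches. A minimal sketch of a tagged suite, using Kyuubi's real `org.apache.kyuubi.tags.HudiTest` annotation but a hypothetical suite name and base class:

```scala
import org.apache.kyuubi.tags.HudiTest
import org.scalatest.funsuite.AnyFunSuite

// With the spark-3.0 profile active, the scalatest-maven-plugin excludes
// org.apache.kyuubi.tags.HudiTest, so this whole suite is skipped; with
// spark-3.1 it runs normally. Suite name and test body are illustrative.
@HudiTest
class HudiOperationSketchSuite extends AnyFunSuite {
  test("metadata columns are reported for Hudi tables") {
    // ... exercise a Hudi table via JDBC, as HudiMetadataTests does ...
  }
}
```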
@@ -1686,7 +1720,7 @@
       <properties>
         <spark.version>3.1.2</spark.version>
         <delta.version>1.0.0</delta.version>
-        <maven.plugin.scalatest.exclude.tags>org.apache.kyuubi.tags.ExtendedSQLTest,org.apache.kyuubi.tags.HudiTest</maven.plugin.scalatest.exclude.tags>
+        <maven.plugin.scalatest.exclude.tags>org.apache.kyuubi.tags.ExtendedSQLTest</maven.plugin.scalatest.exclude.tags>
       </properties>
       <modules>
         <module>dev/kyuubi-extension-spark-common</module>