Commit Graph

237 Commits

Author SHA1 Message Date
hezhao2
dbbcf4f4cd
[KYUUBI #5073] Correct the method name in SparkSQLLineageParserHelperSuite
### _Why are the changes needed?_

SparkSQLLineageParserHelperSuite contains some typos; this PR fixes them.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #5073 from zhaohehuhu/Improvement-0720.

Closes #5073

b6346369e [hezhao2] correct the method name in SparkSQLLineageParserHelperSuite

Authored-by: hezhao2 <hezhao2@cisco.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-07-20 15:37:29 +08:00
Deng An
3abc7ce371 [KYUUBI #4167] [Authz] Introduce function support in PrivilegeBuilder with Serde layers
### _Why are the changes needed?_

Closes #4167, whose parent issue is #3632.
Introduce HiveFunctionPrivilegeBuilder, refactored with Serde layers.
The core logic scans the logical plan with transformAllExpressions and builds function privileges for UDFs.

As a SQL gateway, Kyuubi also needs to support controlling how users use UDFs.

Therefore, the Spark SQL Authz module needs to add support for extracting privilege objects for UDF usage in queries from a Spark logical plan.

In Spark SQL, a Hive permanent UDF is wrapped in one of the following expression classes, depending on its kind:

	org.apache.spark.sql.hive.HiveSimpleUDF
	org.apache.spark.sql.hive.HiveGenericUDF
	org.apache.spark.sql.hive.HiveUDAFFunction
	org.apache.spark.sql.hive.HiveGenericUDTF

Based on the existing serde module, we can define a ScanSpec to extract information from a Hive permanent UDF expression; the most important piece is the UDF type.

The UDF type decides whether construction of the privilege object is skipped: system functions and temporary functions are skipped, as we only consider permanent UDF usage.

So we define QualifiedNameStringFunctionExtractor and FunctionNameFunctionTypeExtractor to achieve this. In `org.apache.kyuubi.plugin.spark.authz.PrivilegesBuilder#buildFunctions`, the `buildFunctions` method uses `transformAllExpressions` to gather all Hive UDF expressions and builds input privilege objects for those permanent Hive UDFs.
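
The traversal described above can be sketched without Spark. This is only an illustration under stated assumptions: `Expr`, `FunctionCall`, `FunctionPrivilege`, and this `buildFunctions` are hypothetical stand-ins, not the Kyuubi or Spark classes, and the fallback to a `default` database for unqualified names is an assumption made for the example.

```scala
// Hypothetical, Spark-free model; none of these types are Kyuubi or Spark classes.
object FunctionPrivilegeSketch {
  sealed trait FunctionType
  case object SystemFunction extends FunctionType
  case object TemporaryFunction extends FunctionType
  case object PermanentFunction extends FunctionType

  // A toy expression tree standing in for a logical plan's expressions.
  sealed trait Expr { def children: Seq[Expr] }
  case class Column(name: String) extends Expr { val children: Seq[Expr] = Nil }
  case class FunctionCall(
      qualifiedName: String,
      functionType: FunctionType,
      children: Seq[Expr]) extends Expr

  // Assumed output shape: one input privilege object per permanent UDF.
  case class FunctionPrivilege(database: String, function: String)

  // Visit every expression; keep permanent UDFs, skip system and temporary functions.
  def buildFunctions(root: Expr): Seq[FunctionPrivilege] = {
    val own: Seq[FunctionPrivilege] = root match {
      case FunctionCall(name, PermanentFunction, _) =>
        val idx = name.lastIndexOf('.')
        // Assumption: an unqualified name belongs to the "default" database.
        val (db, fn) =
          if (idx < 0) ("default", name)
          else (name.substring(0, idx), name.substring(idx + 1))
        Seq(FunctionPrivilege(db, fn))
      case _ => Nil
    }
    own ++ root.children.flatMap(buildFunctions)
  }
}
```

In this model, a permanent UDF wrapping a system function and a temporary function yields a single privilege object, for the permanent UDF only.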

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4168 from packyan/feature_introduce_hive_function_privileges_builder.

Closes #4167

1276ac7f6 [Deng An] [KYUUBI #4167] [Authz] Introduce function support in PrivilegeBuilder with Serde layers

Authored-by: Deng An <packyande@gmail.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-07-11 23:22:59 +08:00
yikaifei
5915d682b5
[KYUUBI #5022] [KSHC] CreateTable should use the correct provider
### _Why are the changes needed?_

This PR fixes a bug: in KSHC, `catalog.createTable` should use the correct provider.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #5022 from Yikf/KSHC-createTable.

Closes #5022

cd8cb1cf2 [yikaifei] CreateTable should use the correct provider

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: yikaifei <yikaifei@apache.org>
2023-07-07 12:04:55 +08:00
yikaifei
46f8e0ca94
[KYUUBI #5017] [KSHC] Support Parquet/Orc provider is splitable
### _Why are the changes needed?_

This PR aims to make the Parquet/ORC providers splittable.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #5017 from Yikf/KSHC-support-split.

Closes #5017

9dc3d3d56 [yikaifei] Support Parquet/Orc provider is splitable

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: yikaifei <yikaifei@apache.org>
2023-07-06 19:21:05 +08:00
yikaifei
da82217388
[KYUUBI #5023] [KSHC] TableIdentify don't attach catalog
### _Why are the changes needed?_

As the title says: in KSHC, the HiveTable identifier does not attach the catalog, which prevents an incorrect catalogName. The default catalog is "spark_catalog".

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #5023 from Yikf/tableName2.

Closes #5023

86b6a58d0 [yikaifei] KSHC v1IdentifierNoCatalog in spark3.4

Authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-07-06 18:26:37 +08:00
zhaomin
7feb535668
[KYUUBI #5028] Update session hadoop conf to catalog hadoop conf
### _Why are the changes needed?_

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #5028 from zhaomin1423/fix_hive_connector.

Closes #5028

d9c7e9c8a [zhaomin] Update session hadoop conf to catalog hadoop conf

Authored-by: zhaomin <zhaomin1423@163.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-07-06 18:25:12 +08:00
ulysses-you
5512c6c60f
[KYUUBI #5018] Make kyuubi spark extension compatible with Spark3.4
### _Why are the changes needed?_

The main change copies code from `kyuubi-extension-spark-common_2.12` and `kyuubi-extension-spark-3-3`. `kyuubi-extension-spark-common_2.12` is copied because Spark 3.4 v1 writes do not have ctas...

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #5018 from ulysses-you/extension.

Closes #5018

1ca4951e1 [ulysses-you] fix
fb20bb1f0 [ulysses-you] fix
338a3120a [ulysses-you] rebase
241a436fd [ulysses-you] fix
70618f368 [ulysses-you] fix
5f5e2b7c8 [ulysses-you] fix
301ae5c04 [ulysses-you] fix
bb9416811 [ulysses-you] fix
a38595d0a [ulysses-you] copy
9dec27d6c [ulysses-you] copy

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-07-06 18:22:23 +08:00
Fu Chen
11dcd30e88 [KYUUBI #1265] OPTIMIZE where clause expression support
### _Why are the changes needed?_

Closes #1265.

After this PR, the following case works:

```sql
CREATE TABLE p (c1 INT, c2 INT, c3 INT) PARTITIONED BY (event_date DATE);
OPTIMIZE p where event_date = current_date() ZORDER BY c1, c2;
```

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #2893 from cfmcgrady/where-expression-support.

Closes #1265

97ac710f0 [Fu Chen] Merge remote-tracking branch 'apache/master' into where-expression-support
c188f0b3d [Fu Chen] fix style
e5f7409d6 [Fu Chen] move verifyPartitionPredicates to KyuubiSparkSQLAstBuilder
f7234abba [Fu Chen] fix style
95d314122 [Fu Chen] fork PredicateHelper.isLikelySelective
1e596e3dd [Fu Chen] partition predicates constraint
541e373cc [Fu Chen] fix
06d9efdf0 [Fu Chen] adapt to spark-3.1/spark-3.2 suite
867263673 [Fu Chen] fix style
b6801b279 [Fu Chen] add test case
79ab60554 [Fu Chen] fix suite bug
cf1b16ee7 [Fu Chen] fix style
dc0ebd908 [Fu Chen] add ut
286d94cc6 [Fu Chen] fix style
1736d18f6 [Fu Chen] adapt to spark-3.1/spark-3.2
04e88a5aa [Fu Chen] fix nep
59103095b [Fu Chen] simplify logical
59fba01e4 [Fu Chen] adapt to spark-3.1
e6477a9c5 [Fu Chen] remove unused
855283e20 [Fu Chen] where clause expression support

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Fu Chen <cfmcgrady@gmail.com>
2023-07-05 10:21:49 +08:00
Cheng Pan
1d5ac07dfc [KYUUBI #4999] [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4
### _Why are the changes needed?_

This PR aims to make KSHC support Apache Spark 3.4.

- KSHC supports Apache Spark 3.4
- Make the Apache Kyuubi `codecov` module contain the spark-3.4 profile, so that Apache Kyuubi CI can cover some modules.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #4999 from Yikf/kudu-spark3.4.

Closes #4999

6a35e54b8 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
66bb742eb [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
7be517c7f [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
ae23133d1 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
dda5e6521 [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
e43a25dff [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
54f52f16d [Cheng Pan] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala
0955b544b [Cheng Pan] Update pom.xml
38a1383d9 [yikaifei] codecov module should contain the spark 3.4 profile

Lead-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: yikaifei <yikaifei@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-07-04 17:25:57 +08:00
zhaomin
80bc028e6d [KYUUBI #4995] Use hadoop conf and hive conf from catalog options
### _Why are the changes needed?_

The Spark job classpath contains hdfs-site.xml, hive-site.xml, etc., but we should use the Hadoop conf and Hive conf from the catalog options.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #4995 from zhaomin1423/fix_hive_connector.

Closes #4995

64429fdcb [Xiao Zhao] Update extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveTableCatalog.scala
d921be750 [zhaomin] fix
375934d65 [zhaomin] Using hadoop conf and hive conf from catalog options

Lead-authored-by: zhaomin <zhaomin1423@163.com>
Co-authored-by: Xiao Zhao <zhaomin1423@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-26 15:04:39 +08:00
Cheng Pan
056c5efbe4
[KYUUBI #4986] Always use Files#deleteIfExists
### _Why are the changes needed?_

Always use `Files#deleteIfExists` instead of `Files#delete` to suppress stack traces like the following:

```
17:14:47.167 ERROR org.apache.kyuubi.operation.BatchJobSubmission: Failed to remove corresponding log file of operation: /Users/chengpan/Projects/apache-kyuubi/server_operation_logs/9a96f19d-93a9-474c-b170-f957ed82c502/3fe99873-134b-4307-a286-94494ae847f2
java.io.IOException: Failed to remove corresponding log file of operation: /Users/chengpan/Projects/apache-kyuubi/server_operation_logs/9a96f19d-93a9-474c-b170-f957ed82c502/3fe99873-134b-4307-a286-94494ae847f2
	at org.apache.kyuubi.operation.log.OperationLog.trySafely(OperationLog.scala:272) ~[classes/:?]
	at org.apache.kyuubi.operation.log.OperationLog.close(OperationLog.scala:257) ~[classes/:?]
	at org.apache.kyuubi.operation.BatchJobSubmission.$anonfun$close$2(BatchJobSubmission.scala:328) ~[classes/:?]
	at org.apache.kyuubi.operation.BatchJobSubmission.$anonfun$close$2$adapted(BatchJobSubmission.scala:328) ~[classes/:?]
	at scala.Option.foreach(Option.scala:407) ~[scala-library-2.12.17.jar:?]
	at org.apache.kyuubi.operation.BatchJobSubmission.$anonfun$close$1(BatchJobSubmission.scala:328) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.17.jar:?]
	at org.apache.kyuubi.Utils$.withLockRequired(Utils.scala:415) ~[classes/:?]
	at org.apache.kyuubi.operation.AbstractOperation.withLockRequired(AbstractOperation.scala:51) ~[classes/:?]
	at org.apache.kyuubi.operation.BatchJobSubmission.close(BatchJobSubmission.scala:326) ~[classes/:?]
	at org.apache.kyuubi.session.KyuubiBatchSession.close(KyuubiBatchSession.scala:185) ~[classes/:?]
	at org.apache.kyuubi.session.KyuubiSessionManager.openBatchSession(KyuubiSessionManager.scala:181) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiBatchService.$anonfun$start$1(KyuubiBatchService.scala:96) ~[classes/:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.nio.file.NoSuchFileException: /Users/chengpan/Projects/apache-kyuubi/server_operation_logs/9a96f19d-93a9-474c-b170-f957ed82c502/3fe99873-134b-4307-a286-94494ae847f2
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
	at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:249) ~[?:?]
	at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:105) ~[?:?]
	at java.nio.file.Files.delete(Files.java:1142) ~[?:?]
	at org.apache.kyuubi.operation.log.OperationLog.$anonfun$close$4(OperationLog.scala:257) ~[classes/:?]
	at org.apache.kyuubi.operation.log.OperationLog.trySafely(OperationLog.scala:263) ~[classes/:?]
	... 18 more
```
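
The difference between the two JDK calls can be shown with a small sketch. `DeleteIfExistsSketch` and its method names are hypothetical helpers for illustration only; they are not Kyuubi's `OperationLog` code.

```scala
import java.nio.file.{Files, NoSuchFileException, Path}

// Hypothetical helper object for illustration; not part of Kyuubi.
object DeleteIfExistsSketch {
  // Returns true when a file was removed and false when there was nothing to remove.
  def removeQuietly(path: Path): Boolean = Files.deleteIfExists(path)

  // For contrast: Files.delete throws NoSuchFileException when the path is already gone.
  def strictDeleteFails(path: Path): Boolean =
    try {
      Files.delete(path)
      false
    } catch {
      case _: NoSuchFileException => true
    }
}
```

Calling `removeQuietly` twice on the same path is harmless, whereas a second `Files.delete` raises the `NoSuchFileException` seen at the bottom of the trace above.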

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

Closes #4986 from pan3793/delete.

Closes #4986

7d49bfec0 [Cheng Pan] Always use Files#deleteIfExists

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-21 00:17:01 +08:00
liangbowen
eeee5c1ae3 [KYUUBI #4959] [MINOR] Code improvements for Scala
### _Why are the changes needed?_

- Improve Scala code with corrections, simplifications, Scala style fixes, and redundancy cleanup. No feature changes are introduced.

Corrections:
- Class doesn't correspond to file name (SparkListenerExtensionTest)
- Correct package name in ResultSetUtil and PySparkTests

Improvements:
- 'var' could be a 'val'
- getOrElse(null) to orNull

Cleanup & Simplification:
- Redundant cast inspection
- Redundant collection conversion
- Simplify boolean expression
- Redundant new on case class
- Redundant return
- Unnecessary parentheses
- Unnecessary partial function
- Simplifiable empty check
- Anonymous function convertible to a method value

Scala Style:
- Constructing range for seq indices
- Get and getOrElse to getOrElse
- Convert expression to Single Abstract Method (SAM)
- Scala unnecessary semicolon inspection
- Map and getOrElse(false) to exists
- Map and flatten to flatMap
- Null initializer can be replaced by _
- scaladoc link to method

Other Improvements:
- Replace map and getOrElse(true) with forall
- Unit return type in the argument of map
- Size to length on arrays and strings
- Type check can be pattern matching
- Java mutator method accessed as parameterless
- Procedure syntax in method definition
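
A few of the rewrites listed above, shown side by side as a minimal sketch. The object and method names are hypothetical and chosen only for this example; the comment on each line shows the form being replaced.

```scala
// Hypothetical examples of the listed rewrites; not code from the Kyuubi repository.
object ScalaIdiomsSketch {
  // getOrElse(null) to orNull
  def nameOrNull(name: Option[String]): String = name.orNull

  // map and getOrElse(false) to exists
  def isPositive(n: Option[Int]): Boolean = n.exists(_ > 0)

  // map and getOrElse(true) to forall
  def isAbsentOrPositive(n: Option[Int]): Boolean = n.forall(_ > 0)

  // map and flatten to flatMap
  def words(lines: Seq[String]): Seq[String] = lines.flatMap(_.split(" ").toSeq)

  // size to length on arrays and strings
  def arrayLength(xs: Array[Int]): Int = xs.length
}
```

Each rewritten form returns the same result as the form it replaces and avoids an intermediate `Option` or collection.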

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4959 from bowenliang123/scala-Improve.

Closes #4959

2d36ff351 [liangbowen] code improvement for Scala

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-16 21:20:17 +08:00
liangbowen
4deb98cd42 [KYUUBI #4970] Unified reflection methods invokeAs and getField
### _Why are the changes needed?_

- comment https://github.com/apache/kyuubi/pull/4963#discussion_r1230490326
- simplify reflection calling with unified `invokeAs` / `getField` method for either declared, inherited, or static methods / fields
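
As a rough illustration of what a unified reflection helper might look like, the sketch below walks the class hierarchy so that declared and inherited members are found by one code path. It is written from scratch for this example and is not Kyuubi's implementation: the signatures are assumptions, and static members of Java classes are left out for brevity.

```scala
// Hypothetical sketch; signatures are assumed and differ from Kyuubi's real helpers.
object ReflectSketch {
  // Sample classes used to exercise the helpers.
  class Base {
    val token: String = "t-1"
    def greet(name: String): String = "hi " + name
  }
  class Child extends Base

  // The class itself followed by all of its superclasses.
  private def hierarchy(cls: Class[_]): List[Class[_]] =
    if (cls == null) Nil else cls :: hierarchy(cls.getSuperclass)

  // Reads a field declared on the target's class or on any superclass.
  def getField[T](target: AnyRef, name: String): T = {
    val field = hierarchy(target.getClass)
      .flatMap(_.getDeclaredFields.toList)
      .find(_.getName == name)
      .getOrElse(throw new NoSuchFieldException(name))
    field.setAccessible(true)
    field.get(target).asInstanceOf[T]
  }

  // Invokes a method by name; each argument is paired with its declared parameter type.
  def invokeAs[T](target: AnyRef, name: String, args: (Class[_], AnyRef)*): T = {
    val paramTypes = args.map(_._1)
    val method = hierarchy(target.getClass)
      .flatMap(_.getDeclaredMethods.toList)
      .find(m => m.getName == name && m.getParameterTypes.toSeq == paramTypes)
      .getOrElse(throw new NoSuchMethodException(name))
    method.setAccessible(true)
    method.invoke(target, args.map(_._2): _*).asInstanceOf[T]
  }
}
```

For example, `ReflectSketch.getField[String](new ReflectSketch.Child, "token")` reads a field that is declared on `Base`, and `invokeAs` resolves the inherited `greet` method in the same way.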

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4970 from bowenliang123/unify-invokeas.

Closes #4970

592833459 [liangbowen] Revert "dedicate invokeStaticAs method"
ad45ff3fd [liangbowen] dedicate invokeStaticAs method
f08528c0f [liangbowen] nit
42aeb9fcf [liangbowen] add ut case
b5b384120 [liangbowen] nit
072add599 [liangbowen] add ut
8d019ab35 [liangbowen] unified invokeAs and getField

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-16 20:08:42 +08:00
zhouyifan279
34e79b4195 [KYUUBI #4917][Bug][AUTHZ] Table owner undefied in Iceberg 1.3.0 on Spark 3.4
### _Why are the changes needed?_
Fix #4917
- support extracting table owner from `ResolvedIdentifier`

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
<img width="1266" alt="image" src="https://github.com/apache/kyuubi/assets/88070094/e3066d0e-7a14-41da-96f6-032a5c53780f">

Closes #4941 from zhouyifan279/drop-table.

Closes #4917

b2207ed17 [zhouyifan279] [KYUUBI #4917][Bug][AUTHZ] Table owner undefied in Iceberg 1.3.0 on Spark 3.4
bc4661a13 [zhouyifan279] [KYUUBI #4917][Bug][AUTHZ] Table owner undefied in Iceberg 1.3.0 on Spark 3.4

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-09 15:37:32 +08:00
liangbowen
c8a138f986 [KYUUBI #4933] [DOCS] [MINOR] Mark spark.sql.optimizer.insertRepartitionNum config for Spark 3.1 only
### _Why are the changes needed?_

- Update the docs to mark the Spark plugin's config `spark.sql.optimizer.insertRepartitionNum` as used for Spark 3.1 only

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4933 from bowenliang123/insert-num.

Closes #4933

5ed6e2867 [liangbowen] comment and style
280a6af03 [liangbowen] spark.sql.optimizer.insertRepartitionNum only available for Spark 3.1.x
7f01cf3b6 [liangbowen] spark.sql.optimizer.insertRepartitionNum only available for Spark 3.1.x

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-09 08:30:23 +08:00
zhouyifan279
9ff46a3c63
[KYUUBI #4935] More than target num of executors may survive after FinalStageResourceManager did kill
### _Why are the changes needed?_
When FinalStageResourceManager chooses executors to kill, it may add dead executors to the kill list.
This leaves more than the target number of executors alive and wastes resources.
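
The problem can be pictured with a simplified model. The function below is a hypothetical sketch, not the actual FinalStageResourceManager logic: it filters out dead executors first, so the number of kills is computed from live executors only.

```scala
// Hypothetical model of the selection step; names and signature are assumptions.
object ExecutorKillSketch {
  // Picks executors to kill so that `targetNum` live executors remain.
  // `allKnown` may still list dead executors; only live ones are kill candidates.
  def chooseExecutorsToKill(
      allKnown: Seq[String],
      alive: Set[String],
      targetNum: Int): Seq[String] = {
    val live = allKnown.filter(alive.contains)
    live.take(math.max(live.size - targetNum, 0))
  }
}
```

With five known executors of which two are already dead and a target of two, only one live executor is selected. If a dead executor took that slot in the kill list instead, three executors would stay alive.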

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4936 from zhouyifan279/kill-executor.

Closes #4936

2aaa84cb1 [zhouyifan279] [KYUUBI#4935][Improvement] More than target num of executors may survive after FinalStageResourceManager did kill

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-08 20:18:19 +08:00
liangbowen
d0675a35a7 [KYUUBI #4879] Refactor and promote relection utils and cleanup similar reflection methods
### _Why are the changes needed?_

- apply the usage of `ReflectUtils` and `Dyn*` to the modules of engines and plugins (eg. Spark engine, Authz plugin, lineage plugin, beeline)
- remove similar redundant methods for calling reflected methods or getting field values
- unified reflection helper methods with type casting support, as `getField[T]` for getting field values from `getFields`, `invokeAs[T]` for invoking methods in `getMethods`.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4879 from bowenliang123/reflect-use.

Closes #4879

c685fb67d [liangbowen] bug fix for "Cannot bind static field options" when executing "bin/beeline"
fc1fdf1de [liangbowen] import
59c3dd032 [liangbowen] comment
c435c131d [liangbowen] reflect util usage

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-06 18:59:18 +08:00
wforget
408862af72
[KYUUBI #4814] Introduce Apache Atlas hook support in lineage plugin
### _Why are the changes needed?_

Implements AtlasLineageDispatcher to send lineage to Apache Atlas.

close #4814

Atlas Spark Model Definition: https://github.com/apache/atlas/blob/master/addons/models/1000-Hadoop/1100-spark_model.json

spark process:

![1](https://github.com/apache/kyuubi/assets/17894939/28e2c68c-0ffd-4f1d-b805-a7e964f85aab)

table lineage:

![2](https://github.com/apache/kyuubi/assets/17894939/76b3db6d-ed50-42e3-97cf-2f96d4e403df)

column lineage:

![3](https://github.com/apache/kyuubi/assets/17894939/41ae6ef8-acbf-43b9-ad05-42d669c5e950)

### _How was this patch tested?_
- [X] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [X] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4815 from wForget/KYUUBI-4814.

Closes #4814

3df8a7ec9 [wforget] comments
c58eae7c5 [wforget] comments
926bcf211 [wforget] comment
e0b4067c3 [wforget] comment
e4cc3e3f8 [wforget] comments
adc72b96f [Bowen Liang] Update extensions/spark/kyuubi-spark-lineage/src/main/scala/org/apache/kyuubi/plugin/lineage/dispatcher/atlas/AtlasEntityHelper.scala
e3bdd1c65 [Bowen Liang] Update extensions/spark/kyuubi-spark-lineage/src/main/scala/org/apache/kyuubi/plugin/lineage/dispatcher/atlas/AtlasEntityHelper.scala
baf1711ac [Bowen Liang] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/dispatcher/atlas/AtlasLineageDispatcherSuite.scala
61e79f3d5 [Bowen Liang] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/dispatcher/atlas/AtlasLineageDispatcherSuite.scala
541df3780 [Bowen Liang] Update extensions/spark/kyuubi-spark-lineage/src/test/scala/org/apache/kyuubi/plugin/lineage/dispatcher/atlas/AtlasLineageDispatcherSuite.scala
5dd310657 [wforget] fix
cea1e137d [wforget] fix
f028d4b09 [wforget] fix
0c9b4516b [wforget] fix
6f8113032 [wforget] add close atlas client shutdown hook
3f4d2a7db [wforget] add remote user
a0db58afc [wforget] comments
6dd3c66df [wforget] comments
f2b2a30dc [wforget] style
83eb1e481 [wforget] add atlas.column.lineage.enable configuration
0719a2b65 [wforget] doc
05f936005 [wforget] fix
d169b661d [wforget] fix
6da80d742 [wforget] fix
820ae5c5f [wforget] column lineages
dabe8173e [wforget] license
f22e044d2 [wforget] test
b948bce90 [wforget] fix and add test
0aef1be6b [wforget] fix
368b5ab3f [wforget] [KYUUBI-4814] Implements AtlasLineageDispatcher to send lineage to Apache Atlas

Lead-authored-by: wforget <643348094@qq.com>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-06 17:47:19 +08:00
liangbowen
cb689b66b1 [KYUUBI #4914] [AUTHZ] Reuse extractor singleton instance with generalized getter for supported extractor types
### _Why are the changes needed?_

- Reuse extractor singleton instances for a smaller memory footprint, as Authz's extractors are stateless and ready for sharing
- Generalized getter across supported extractor types
   - get extractor by class type
   - get extractor by explicit class name
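
A minimal sketch of the singleton-lookup idea, assuming stateless extractors keyed by class name. The `Extractor` trait, the two sample extractors, and the `lookupByType` / `lookupByName` helpers are hypothetical names for illustration, not the Authz module's API.

```scala
import scala.reflect.ClassTag

// Hypothetical registry; not the Authz plugin's actual extractor lookup.
object ExtractorRegistrySketch {
  trait Extractor
  class TableExtractor extends Extractor
  class ColumnExtractor extends Extractor

  // One shared instance per extractor class, keyed by fully qualified class name.
  private val byName: Map[String, Extractor] =
    Seq[Extractor](new TableExtractor, new ColumnExtractor)
      .map(e => e.getClass.getName -> e)
      .toMap

  // Get an extractor by explicit class name.
  def lookupByName[T <: Extractor](className: String): T =
    byName(className).asInstanceOf[T]

  // Get an extractor by class type.
  def lookupByType[T <: Extractor](implicit ct: ClassTag[T]): T =
    lookupByName[T](ct.runtimeClass.getName)
}
```

Because the extractors hold no state, every caller can safely receive the same instance, which is what keeps the memory footprint small.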

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4914 from bowenliang123/authz-get-extractor.

Closes #4914

11dde777f [liangbowen] update
77ce00276 [liangbowen] make extractorKey of lookupExtractor not null by default
5f5b6e580 [liangbowen] Revert "extractorKey: String = null => extractorKey: Option[String] = None"
400c3b054 [liangbowen] extractorKey: String = null => extractorKey: Option[String] = None
60acd27ec [liangbowen] rename `getExtractor` to `lookupExtractor`
e6fbb450f [liangbowen] generalize getExtractor for getting instance of supported types of extractors

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-06-05 13:11:34 +08:00
liangbowen
0171c63425
[KYUUBI #4913] [TEST] [MINOR] Eliminate unnecessary output in ut "union an unmasked table"
### _Why are the changes needed?_

- Eliminate unnecessary output in ut "union an unmasked table"
<img width="433" alt="image" src="https://github.com/apache/kyuubi/assets/1935105/f179b827-6144-4887-b1fb-ba11da5a5f2b">

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4913 from bowenliang123/authz-remove-show.

Closes #4913

2f6f43080 [liangbowen] remove unnecessary output in ut "union an unmasked table"

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-01 11:54:03 +08:00
liangbowen
bbf855df0e
[KYUUBI #4912] [TEST] Replace Scala's assert in tests with Scalatest's for prettified error message
### _Why are the changes needed?_

- Replace calls to Scala's `assert` method with ScalaTest's `Assertions.assert`
- While Scala's `assert` method just throws a plain Java `AssertionError`,
```
  def assert(assertion: Boolean) {
    if (!assertion)
      throw new java.lang.AssertionError("assertion failed")
  }
```
ScalaTest's `Assertions.assert` prettifies the error message, e.g.,
`assert(a == b || c >= d) // Error message: 1 did not equal 2, and 3 was not greater than or equal to 4`

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4912 from bowenliang123/scalatest-assert.

Closes #4912

e1d2ce3e0 [liangbowen] use Scalatest's assert for better error message

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-01 11:52:56 +08:00
liangbowen
cf886c9676 [KYUUBI #4905] Generalize util method for loading class from service loader
### _Why are the changes needed?_

- Generalize util method for loading class from service loader in `kyuubi-util-scala` module
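
A generalized helper of this kind could look roughly like the sketch below, built on the JDK's `java.util.ServiceLoader`. The name `loadFromServiceLoader` matches the commit list of this PR, but the signature and body here are assumptions for illustration, and `Greeter` is a hypothetical service interface.

```scala
import java.util.ServiceLoader
import scala.collection.mutable.ListBuffer
import scala.reflect.ClassTag

// Hypothetical sketch; the real utility in kyuubi-util-scala may differ.
object ServiceLoaderSketch {
  // Sample service interface with no registered providers.
  trait Greeter { def greet(): String }

  // Loads every provider registered for T under META-INF/services.
  def loadFromServiceLoader[T](
      cl: ClassLoader = Thread.currentThread().getContextClassLoader)(
      implicit ct: ClassTag[T]): List[T] = {
    val it = ServiceLoader.load(ct.runtimeClass.asInstanceOf[Class[T]], cl).iterator()
    val found = ListBuffer.empty[T]
    while (it.hasNext) found += it.next()
    found.toList
  }
}
```

A call such as `ServiceLoaderSketch.loadFromServiceLoader[ServiceLoaderSketch.Greeter]()` returns an empty list here because no provider file is registered for `Greeter`.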

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4905 from bowenliang123/service-load-util.

Closes #4905

545183fbf [liangbowen] nit
8714e0591 [liangbowen] rename loadClassFromServiceLoader to loadFromServiceLoader
11936419e [liangbowen] nit
81584e335 [liangbowen] fix loadExtractorsToMap
1d64b662d [liangbowen] fix
b7d8895d3 [liangbowen] update
e15b7d22c [liangbowen] optimize JpsApplicationOperationSuite
c58ef573c [liangbowen] simplify ConnectionProvider.loadProviders
31de53df8 [liangbowen] nit
fca265998 [liangbowen] simplify
1fada9516 [liangbowen] import
323b2bd0e [liangbowen] generalize util method for loading class from service loader

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-31 20:37:26 +08:00
liangbowen
5cc51c8ac6 [KYUUBI #4910] Extract table from ResolvedIdentifier for DropTable in Spark 3.4
### _Why are the changes needed?_

- Adapt to changes in the logical plan of DropTable in Spark 3.4 by extracting the table object from ResolvedIdentifier, fixing the unit test "DropTable" on Spark 3.4

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4910 from bowenliang123/authz-resolved-idtable.

Closes #4910

53c76f66d [liangbowen] Extract table from ResolvedIdentifier for DropTable in Spark 3.4

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-31 20:10:48 +08:00
zhouyifan279
8f61835630 [KYUUBI #4903] [AUTHZ] Fix NoSuchElementException when listing database in CatalogImpl in Spark 3.4
### _Why are the changes needed?_
Fix #4902

We changed `ObjectFilterPlaceHolder` to extend `UnaryNode` so that `CatalogImpl#listDatabases()` can get `ShowNamespaces` object in LogicalPlan.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4903 from zhouyifan279/ShowNamespaces.

Closes #4903

8bf3e1391 [zhouyifan279] [KYUUBI#4902] Fix NoSuchElementException when listing database in CatalogImpl in Spark 3.4
8698b4a48 [zhouyifan279] [KYUUBI#4902] Fix NoSuchElementException when listing database in CatalogImpl in Spark 3.4
a9ad36051 [zhouyifan279] [KYUUBI#4902] Fix NoSuchElementException when listing database in CatalogImpl in Spark 3.4
78d3d6336 [zhouyifan279] [KYUUBI#4902] Fix NoSuchElementException when listing database in CatalogImpl in Spark 3.4

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-31 13:05:15 +08:00
liangbowen
6b301b0684 [KYUUBI #4904] Move AssertionUtils to kyuubi-util-scala module
### _Why are the changes needed?_

- move `AssertionUtils` class to `kyuubi-util-scala` module's test source

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4904 from bowenliang123/assert-utils.

Closes #4904

ee0c0ad88 [liangbowen] Move AssertionUtils to kyuubi-util-scala module

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-30 21:08:21 +08:00
liangbowen
1e3eff69db Revert "[KYUUBI #4900] [AUTHZ] Extract table from ResolvedIdentifier for DropTable in Spark 3.4"
This reverts commit 5a95b50bda.
2023-05-30 17:20:53 +08:00
liangbowen
5a95b50bda [KYUUBI #4900] [AUTHZ] Extract table from ResolvedIdentifier for DropTable in Spark 3.4
### _Why are the changes needed?_

- adapting changes in logical plan of DropTable in Spark 3.4 by extracting table object from `ResolvedIdentifier`, to fix test w/ Spark 3.4
  - ut "DropTable"

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4900 from bowenliang123/authz-resolved-idtable.

Closes #4900

560bb6288 [liangbowen] apply ResolvedIdentifierTableExtractor to CreateTable and DropTable

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-30 16:02:57 +08:00
liangbowen
eb5b07b9a7 [KYUUBI #4899] [AUTHZ] Extract function from FunctionIdentifier for CreateFunction and DropFunction in Spark 3.4
### _Why are the changes needed?_

- adapting changes in logical plan of CreateFunction/DropFunction in Spark 3.4 by extracting function object from `FunctionIdentifier`, to fix tests on Spark 3.4
  - ut "CreateFunctionCommand"
  - ut "DropFunctionCommand"

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4899 from bowenliang123/authz-functionid.

Closes #4899

464d3eb3d [liangbowen] apply FunctionIdentifierFunctionTypeExtractor to CreateFunction/DropFunction

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-30 15:31:31 +08:00
liangbowen
a1ce7fb684 [KYUUBI #4892] [AUTHZ] Make identifier part name comparison case insensitive in tests of PrivilegeBuilder
### _Why are the changes needed?_

- the identifier parts (catalog, database, table, and function names) are turned into lower case by default, following the `spark.sql.caseSensitive` config, in [`SessionCatalog.qualifyIdentifier`](https://github.com/apache/spark/pull/37415/files#diff-9dd0899e5406230aeff96654432da54f35255f6dc60eecb87264a5c508a8c826R161) of <https://github.com/apache/spark/pull/37415>
- fix failed UTs in the Authz plugin when tested w/ Spark 3.4
  - AlterTableRenameCommand
  - AlterTableAddPartitionCommand
  - AlterViewAsCommand
  - AlterTableDropPartitionCommand
  - RefreshFunctionCommand
  - AlterTableRenamePartitionCommand
  - AlterTableSetLocationCommand
  - AlterTable(Un)SetPropertiesCommand
  - TruncateTableCommand
  - AlterTableAddColumnsCommand
  - AlterTableChangeColumnCommand
  - ShowCreateTableAsSerdeCommand
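
The case-insensitive comparison described above can be sketched as follows. The helper name and its curried shape are assumptions made for illustration, not necessarily what the test utilities in the repository define.

```scala
object IdentifierAssertions {
  // Hypothetical helper: compares two identifier parts (catalog, database,
  // table, or function name) while ignoring case, and fails loudly otherwise.
  def assertEqualsIgnoreCase(expected: String)(actual: String): Unit = {
    if (!expected.equalsIgnoreCase(actual)) {
      throw new AssertionError(
        s"Expected '$expected' but got '$actual' (compared ignoring case)")
    }
  }
}
```

With such a helper, a test that creates `MyTable` still passes when Spark 3.4 reports the table as `mytable`.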

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4892 from bowenliang123/authz-assert-incase.

Closes #4892

8500dd8ed [liangbowen] case insensitive assertion to identifier part name

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-30 13:55:18 +08:00
liangbowen
ec7e7479fa [KYUUBI #4869] [AUTHZ] Introduce table extractor for ResolvedIdentifier in Spark 3.4
### _Why are the changes needed?_

- introduce ResolvedIdentifierTableExtractor for extracting table from `org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier` in Spark 3.4
- fixing ut failures w/ Spark 3.4
   -  ut CreateTable / CreateTableAsSelect / ReplaceTable / ReplaceTableAsSelect
   -  ut "Extracting table info with ResolvedDbObjectNameTableExtractor"

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4869 from bowenliang123/resolved.

Closes #4869

0bf65cd60 [liangbowen] introduce ResolvedIdentifierTableExtractor for spark 3.4

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-05-27 22:30:19 +08:00
liangbowen
151b2eaec2 [KYUUBI #4888] [AUTHZ] Remove filtering results for ShowDatabasesCommand in Spark 2.x
### _Why are the changes needed?_

- remove FilteredShowDatabasesCommand, which filtered "show databases" results in Spark 2.x, as ShowDatabasesCommand was removed in Spark 3.0

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4888 from bowenliang123/remove-showdatabasecommand.

Closes #4888

d4ae60bc1 [liangbowen] remove FilteredShowDatabasesCommand for filtering show database in spark 2.4

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-26 16:49:43 +08:00
liangbowen
810a41c54e [KYUUBI #4871] [AUTHZ] Adapt plan changes for CreateNamespace and SetCatalogAndNamespace in Spark 3.4
### _Why are the changes needed?_

- namespace changed from `Seq[String]` to `ResolvedNamespace` in Spark 3.4
- fixing the uts named `CreateNamespace` and `Extracting database info with ResolvedDBObjectNameDatabaseExtractor` w/ Spark 3.4 for the CreateNamespace and SetCatalogAndNamespace commands

Testing by running `build/mvn clean install -pl :kyuubi-spark-authz_2.12 -Pspark-3.4 -Dmaven.plugin.scalatest.exclude.tags=org.apache.kyuubi.tags.IcebergTest`.

Before:
```
- CreateNamespace *** FAILED ***
  0 did not equal 1 (V2CommandsPrivilegesSuite.scala:707)
...
- Extracting database info with ResolvedDBObjectNameDatabaseExtractor *** FAILED ***
  java.lang.NullPointerException:
  at org.apache.kyuubi.plugin.spark.authz.V2JdbcTableCatalogPrivilegesBuilderSuite.$anonfun$new$24(V2JdbcTableCatalogPrivilegesBuilderSuite.scala:151)
  at org.scalatest.Assertions.withClue(Assertions.scala:1065)
  at org.scalatest.Assertions.withClue$(Assertions.scala:1052)
  at org.scalatest.funsuite.AnyFunSuite.withClue(AnyFunSuite.scala:1564)
  at org.apache.kyuubi.plugin.spark.authz.V2JdbcTableCatalogPrivilegesBuilderSuite.$anonfun$new$21(V2JdbcTableCatalogPrivilegesBuilderSuite.scala:150)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at org.apache.kyuubi.plugin.spark.authz.V2JdbcTableCatalogPrivilegesBuilderSuite.$anonfun$new$20(V2JdbcTableCatalogPrivilegesBuilderSuite.scala:143)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
```

After:
```
- CreateNamespace
- Extracting database info with ResolvedDBObjectNameDatabaseExtractor
```

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4871 from bowenliang123/create-ns.

Closes #4871

fa141fb5a [liangbowen] update spec json
2cd18f829 [liangbowen] use meaningful desc names
142224d1e [liangbowen] Adapt changes for CreateNamespace and SetCatalogAndNamespace

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-26 09:26:10 +08:00
liangbowen
320178bb68
[KYUUBI #4873] [AUTHZ] Refactor Authz reflection with kyuubi-util's DynMethods
### _Why are the changes needed?_

- add reflection utils in kyuubi-util-scala, using kyuubi-util's DynMethods
- continue to provide simplified reflection calls in Scala
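
To illustrate the kind of simplified reflective call meant here, the following sketch uses plain `java.lang.reflect` rather than `DynMethods`, whose API is not shown in this log. The object and method names are hypothetical.

```scala
object ReflectSketch {
  // Hypothetical helper built on plain java.lang.reflect (not DynMethods):
  // invokes a public no-argument method by name and casts the result to T.
  def invokeAs[T](target: AnyRef, methodName: String): T = {
    val method = target.getClass.getMethod(methodName)
    method.invoke(target).asInstanceOf[T]
  }
}
```

A call such as `ReflectSketch.invokeAs[Int]("kyuubi", "length")` reads a member reflectively without the caller handling `Method` objects directly, which is useful when a field or method only exists in some Spark versions.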

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4873 from bowenliang123/authz-reflect.

Closes #4873

d0a508400 [liangbowen] import
95d4760ad [Cheng Pan] Update kyuubi-util-scala/src/main/scala/org/apache/kyuubi/util/reflect/ReflectUtils.scala
83e70f09b [liangbowen] authz reflect

Lead-authored-by: liangbowen <liangbowen@gf.com.cn>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-05-23 15:25:54 +08:00
liangbowen
aee9b946f3 [KYUUBI #4874] [AUTHZ] [MINOR] Improve methods in AuthzUtils
### _Why are the changes needed?_

- remove unused methods, passSparkVersionCheck and isSparkVersionEqualTo
- extract sparkSemanticVersion singleton
- move spark version helper to AuthzUtils
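
As a rough illustration of a version helper of this kind, the sketch below parses a version string once and compares major and minor numbers. `SemVer` and its methods are made-up names for this example, not the classes used in `AuthzUtils`.

```scala
// Hypothetical stand-in for a semantic version helper; only major and minor
// numbers are compared, and any patch or qualifier suffix is ignored.
final case class SemVer(major: Int, minor: Int) {
  def isAtLeast(other: String): Boolean = {
    val o = SemVer.parse(other)
    major > o.major || (major == o.major && minor >= o.minor)
  }
}

object SemVer {
  private val Pattern = """^(\d+)\.(\d+)(\..*)?$""".r

  def parse(version: String): SemVer = version match {
    case Pattern(major, minor, _) => SemVer(major.toInt, minor.toInt)
    case _ => throw new IllegalArgumentException(s"Cannot parse version: $version")
  }
}
```

Holding the parsed value in a single `lazy val` means the running Spark version string is parsed once and every version check reuses the result.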

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4874 from bowenliang123/authz-util-improve.

Closes #4874

36a8bb157 [Bowen Liang] Merge branch 'master' into authz-util-improve
28e798e88 [liangbowen] import
1c345b984 [liangbowen] blank line
0797143da [liangbowen] blank line
2f368b838 [liangbowen] remove unused method passSparkVersionCheck and isSparkVersionEqualTo, extract sparkSemanticVersion

Lead-authored-by: liangbowen <liangbowen@gf.com.cn>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-23 11:38:42 +08:00
liangbowen
0456d14fa8 [KYUUBI #4875] [AUTHZ] Remove checking Spark v2 in tests since Spark v2 not supported
### _Why are the changes needed?_

- remove the Spark v2 assumptions in Authz tests, since Spark v2 is marked as not supported

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4875 from bowenliang123/authz-remove-spark2.

Closes #4875

6686a4d01 [liangbowen] remove checking spark v2

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-23 11:31:22 +08:00
liangbowen
13f6affb6e
[KYUUBI #4866] Add annotation for Iceberg tests in Authz plugin
### _Why are the changes needed?_

- the latest released Iceberg plugin does not support Spark 3.4, and it also blocks other tests by throwing this exception:
```
IcebergCatalogRangerSparkExtensionSuite:
*** RUN ABORTED ***
  java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/expressions/AnsiCast
  at org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions.$anonfun$apply$6(IcebergSparkSessionExtensions.scala:54)
```

- introduce annotation AuthzIcebergTest for Iceberg ut in Authz plugin, for skipping Iceberg test when testing w/ Spark 3.4 locally
- `build/mvn clean install -pl :kyuubi-spark-authz_2.12 -Pspark-3.4 -Dmaven.plugin.scalatest.exclude.tags=org.apache.kyuubi.tags.IcebergTest`

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4866 from bowenliang123/tag-authz-iceberg.

Closes #4866

3c4348f14 [liangbowen] change to @IcebergTest
ddd1b885c [liangbowen] add AuthzIcebergTest

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-05-23 00:57:26 +08:00
Cheng Pan
01d80eb272
[KYUUBI #4870] Add kyuubi-util and kyuubi-util-scala modules
### _Why are the changes needed?_

Close #4870

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4872 from pan3793/util.

Closes #4870

0b9fe3cba [Cheng Pan] nit
ecc5ee4f2 [Cheng Pan] fix
63be7a20c [Cheng Pan] test
85363c187 [Cheng Pan] style
2227247dd [Cheng Pan] fix package
11d10a081 [Cheng Pan] Add kyuubi-util and kyuubi-util-scala modules

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-05-22 22:13:56 +08:00
wforget
474f0972a4 [KYUUBI #4834] [MINOR] Reduce the scope of method references in Authz plugin cleanup shutdown hook
### _Why are the changes needed?_

Reduce the scope of method references

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4834 from wForget/minor.

Closes #4834

2061dc942 [wforget] [MINOR] Reduce the scope of method references

Authored-by: wforget <643348094@qq.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-05-12 22:36:10 +08:00
Cheng Pan
ab1f67cb31
[KYUUBI #4741] Kyuubi Spark Engine/TPC connectors support Spark 3.4
### _Why are the changes needed?_

- Add CI for Spark 3.4
- Kyuubi Spark TPC-DS/H connectors support Spark 3.4

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4741 from pan3793/spark-3.4.

Closes #4741

84a2d6ad7 [Cheng Pan] log
b9b2ec1fb [Cheng Pan] Add spark-3.4 profile

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-04-23 20:17:20 +08:00
wforget
19d5a9a371
[KYUUBI #4641] Add MaxFileSizeStrategy to limit max scan file size
### _Why are the changes needed?_

Add MaxFileSizeStrategy to limit max scan file size.
close #4641

### _How was this patch tested?_
- [X] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4642 from wForget/KYUUBI-4641.

Closes #4641

14a680f8e [wforget] comment
d2a393d97 [wforget] comment
b1ef4c52c [wforget] fix
d9e94bd8e [wforget] fix style
8a9121131 [wforget] use optional value
094eb61e3 [wforget] combine
89e2cb4d0 [wforget] [KYUUBI-4641] Add MaxFileSizeStrategy to limit max scan file size

Authored-by: wforget <643348094@qq.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-04-23 17:51:44 +08:00
packyan
cba1be9739 [KYUUBI #4717] [AUTHZ] Check Authz plugin's spec json files in UT
### _Why are the changes needed?_

to close #4715

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4717 from packyan/imporve_authz_spec_json_should_be_generated_in_each_build.

Closes #4717

88e70daa7 [Deng An] Update JsonSpecFileGenerator.scala
d195a6db7 [Deng An] Merge branch 'master' into imporve_authz_spec_json_should_be_generated_in_each_build
a078c8c53 [packyan] add ut for check or generate spec json files.

Lead-authored-by: packyan <packyande@gmail.com>
Co-authored-by: Deng An <36296995+packyan@users.noreply.github.com>
Co-authored-by: Deng An <packy@Dengs-MBP.lan>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-04-20 09:41:08 +08:00
liangbowen
2ed0990b73 [KYUUBI #4676] [AUTHZ] Reuse users and namespaces in both tests and policy file generation
### _Why are the changes needed?_

- use the same list of users and namespaces in tests and in policy file generation, as users and namespaces are the most important elements of a Ranger policy's conditions and resources.
- simplify decisions in Authz testing and give a clear view of exactly what is tested and authorized; usages are also easy to trace in an IDE
- reduce possible abuse and untraceable uses of authorized and unauthorized users, rules, and resources. (We have up to 4 unauthorized users in separate tests!)

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4676 from bowenliang123/authz-gen-common.

Closes #4676

dc535a4d8 [liangbowen] authz-gen-common

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-04-19 18:06:06 +08:00
Cheng Pan
609018a6b2
[KYUUBI #4727] [DOC] kyuubi-spark-lineage has no transitive deps
### _Why are the changes needed?_

Update outdated docs

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4727 from pan3793/lineage-doc.

Closes #4727

b6843b282 [Cheng Pan] [DOC] kyuubi-spark-lineage has no transitive deps

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: odone <odone.zhang@gmail.com>
2023-04-19 17:48:14 +08:00
Deng An
4581920a31 [KYUUBI #4716] [KYUUBI 4715] [AUTHZ] Fix the incorrect class name of InsertIntoHiveDirCommand in table spec generator
### _Why are the changes needed?_

to close #4715

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4716 from packyan/improve_prevent_edit_auto_generated_files.

Closes #4716

b6fff8fe7 [Deng An]  fix the inconsistency in the spec json file

Authored-by: Deng An <packyande@gmail.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-04-18 23:48:21 +08:00
Deng An
258172fde9
Revert "[KYUUBI #4712] Bump Spark from 3.2.3 to 3.2.4"
This reverts commit 93ba8f762f.
2023-04-18 00:31:47 +08:00
Anurag Rajawat
93ba8f762f [KYUUBI #4712] Bump Spark from 3.2.3 to 3.2.4
### _Why are the changes needed?_

Fixes #4712

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4718 from anurag-rajawat/upgrade-spark.

Closes #4712

79dcf1b79 [Anurag Rajawat] Bump Spark from 3.2.3 to 3.2.4

Authored-by: Anurag Rajawat <anuragsinghrajawat22@gmail.com>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-04-17 09:14:20 +08:00
liangbowen
52251ddae2 [KYUUBI #4677] [AUTHZ] Check generated policy file in test suite
### _Why are the changes needed?_

- add ut to check generated Ranger policy file in #4585
- manually activate the `genpolicy` profile in CI builds, as the property-based activation is not triggered automatically as expected with the property `ranger.version=2.4.0` set in the project parent pom
- support regenerating the policy file within the same test suite, by running
`KYUUBI_UPDATE=1 build/mvn clean test -pl :kyuubi-spark-authz_2.12 -Dtest=none -DwildcardSuites=org.apache.kyuubi.plugin.spark.authz.gen.PolicyJsonFileGenerator -Pgenpolicy`
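
The check-or-regenerate behavior described above can be sketched roughly as below. The helper is hypothetical; in a real suite the `update` flag would presumably be derived from the `KYUUBI_UPDATE` environment variable, which is an assumption about the wiring rather than a quote from the code.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

object GoldenFileSketch {
  // Hypothetical helper: either rewrites the checked-in file with freshly
  // generated content, or verifies that the two are identical.
  def checkOrUpdate(golden: Path, generated: String, update: Boolean): Unit = {
    if (update) {
      Files.write(golden, generated.getBytes(UTF_8))
    } else {
      val existing = new String(Files.readAllBytes(golden), UTF_8)
      if (existing != generated) {
        throw new AssertionError(
          s"$golden is out of date; rerun with KYUUBI_UPDATE=1 to regenerate it")
      }
    }
  }
}
```

Keeping both modes in one suite means the same generator code produces the file and guards it against manual edits.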

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4677 from bowenliang123/authz-check-policy-gen.

Closes #4677

a372bdfd4 [liangbowen] remove unnecessary profile used in style workflow
7562c88f2 [liangbowen] include in spotless
37b674223 [liangbowen] update policy id
724ec5e28 [liangbowen] replace counter by using zipWithIndex
d322980e7 [liangbowen] extract KRangerPolicyResource object to simplify resource assembly
42c37606a [liangbowen] nit
18a8f4c51 [liangbowen] add usage comments
4ee254d6d [liangbowen] fix issue name in docs
d3cb08d21 [liangbowen] improve file reading
37e4c9c9f [Bowen Liang] Merge branch 'master' into authz-check-policy-gen
6366c50e4 [liangbowen] rename profile to `gen-policy` and remove activation rule by property setting
892faf5ef [liangbowen] update clue
266baa71a [liangbowen] update
cb94e8014 [liangbowen] update
de1f36531 [liangbowen] cleanup
e88c75d46 [liangbowen] check policy file gen

Lead-authored-by: liangbowen <liangbowen@gf.com.cn>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-04-13 16:08:32 +08:00
liangbowen
d7532c5fd5 [KYUUBI #4615] Bump Ranger from 2.3.0 to 2.4.0
### _Why are the changes needed?_

To close #4615
- bump Ranger version to 2.4.0, release notes: https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+2.4.0+-+Release+Notes
- #4585 fixed duplication and conflict in policy file
- update docs

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4675 from bowenliang123/ranger-2.4.0.

Closes #4615

d403bc324 [liangbowen] bump ranger from 2.3.0 to 2.4.0

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
2023-04-10 13:12:34 +08:00
ulysses-you
91a2ab3665
[KYUUBI #4678] Improve FinalStageResourceManager kill executors
### _Why are the changes needed?_

This PR changes two things:
1. Add a config to kill executors if the plan contains table caches. It's not always safe to kill executors if the cache is referenced by two write-like plans.
2. Force adjustTargetNumExecutors when killing executors. `YarnAllocator` might re-request the original target executors if DRA has not updated the target executors yet. Note that DRA would re-adjust executors if there are more tasks to be executed, so we are safe. It's better to adjust the target number of executors once we kill executors.

### _How was this patch tested?_
These issues are found during my POC

Closes #4678 from ulysses-you/skip-cache.

Closes #4678

b12620954 [ulysses-you] Improve kill executors

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
2023-04-10 11:41:37 +08:00
huangzhir
a834ed3efb
[KYUUBI #4530] [AUTHZ] Support non-English chars for MASK, MASK_SHOW_FIRST_4, and MASK_SHOW_LAST_4
### _Why are the changes needed?_
To fix https://github.com/apache/kyuubi/issues/4530.
1. The reason for issue https://github.com/apache/kyuubi/issues/4530  is that MASK_SHOW_FIRST_4 and MASK_SHOW_LAST_4 mask types are currently implemented using the regexp_replace method, which only replaces English letters and digits, but ignores other languages, such as Chinese.
2. To fix this issue, I modified the regexp_replace method to replace non-English characters with the letter 'U', so they will also be masked properly.
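
A small sketch of the masking idea, written with plain Java regular expressions instead of Spark's `regexp_replace`. The replacement letters for English letters and digits (`X`, `x`, `n`) and the choice to leave spaces untouched are assumptions of this sketch; only the `U` replacement for other characters comes from the description above.

```scala
object MaskSketch {
  // Order matters: the catch-all class runs last so that the replacement
  // letters X, x, and n introduced earlier are not masked a second time.
  def maskAll(s: String): String =
    s.replaceAll("[A-Z]", "X")
      .replaceAll("[a-z]", "x")
      .replaceAll("[0-9]", "n")
      .replaceAll("[^A-Za-z0-9 ]", "U") // everything else, e.g. Chinese chars

  // Keep the first four characters readable and mask the rest.
  def maskShowFirst4(s: String): String = s.take(4) + maskAll(s.drop(4))

  // Keep the last four characters readable and mask the rest.
  def maskShowLast4(s: String): String = maskAll(s.dropRight(4)) + s.takeRight(4)
}
```

Before the fix, a pattern limited to English letters and digits would have left the non-English portion of a value in clear text, which is the leak reported in the issue.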

### _How was this patch tested?_

- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4643 from huangzhir/fixbug-datamask.

Closes #4530

abe45b278 [huangzhir] fix nit
f74e582ed [huangzhir] Move the data preparation to setup, some tests were modified due to changes in the data.
fb3f89e15 [huangzhir] 1. Modified test methods to perform end-to-end testing. 2. Mask data should not ignore spaces.
bb6406c81 [huangzhir] Rollback unnecessary changes, add tests using SQL queries, and modify the Scala style checking code.
7754d74fd [huangzhir] Switching the plan. Replace all characters except English letters and numbers with a single character 'U'. Preserve the " " character.
a905817a0 [huangzhir] fix
ce23bcd1b [huangzhir] Regression testing is to keep the original tests unchanged, and only add the "regexp_replace" test method.
a39f185dd [huangzhir] 1. Use a ‘密’ replacer for Chinese chars 2. Use separate ut cases for testing this regexp_replace method.
94b05db89 [huangzhir] [KYUUBI #4530] [AUTHZ] fixbug support MASK_SHOW_FIRST_4 and MASK_SHOW_FIRST_4 chinese data mask
0fc1065ca [huangzhir] fixbug support MASK_SHOW_FIRST_4 and MASK_SHOW_FIRST_4 chinese data mask

Authored-by: huangzhir <306824224@qq.com>
Signed-off-by: Kent Yao <yao@apache.org>
2023-04-10 10:26:28 +08:00