### _Why are the changes needed?_
This PR adds support for automatically merging small files in multi-insert statements. For example,
`FROM VALUES(1) INSERT INTO tmp1 SELECT * INSERT INTO tmp2 SELECT *;`
generates the following plan, in which `Union` is the root node instead of `InsertIntoHiveTable`:
```
Union
:- InsertIntoHiveTable
:  +- Project
:     +- LocalRelation
+- InsertIntoHiveTable
   +- Project
      +- LocalRelation
```
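To illustrate why the root node matters, here is a minimal sketch using a toy plan model in Python. The `Node` class, the `add_rebalance` function, and the `Rebalance` node name are hypothetical stand-ins chosen for illustration; they are not Spark's or Kyuubi's actual classes. The sketch assumes the rule wraps the query under each insert, and shows that it must recurse through a `Union` root to reach every `InsertIntoHiveTable` branch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a logical plan node (not Spark's LogicalPlan)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def add_rebalance(plan: Node) -> Node:
    """Wrap the query under each insert in a hypothetical Rebalance node."""
    if plan.name == "InsertIntoHiveTable":
        query = plan.children[0]
        return Node(plan.name, [Node("Rebalance", [query])])
    if plan.name == "Union":
        # Multi-insert: the inserts sit under the Union root, so recurse
        # into every branch instead of matching only on the root.
        return Node(plan.name, [add_rebalance(c) for c in plan.children])
    # Any other root is left untouched in this sketch.
    return plan


def insert_branch() -> Node:
    """Build one InsertIntoHiveTable -> Project -> LocalRelation branch."""
    return Node("InsertIntoHiveTable",
                [Node("Project", [Node("LocalRelation")])])


multi_insert = Node("Union", [insert_branch(), insert_branch()])
rewritten = add_rebalance(multi_insert)
```

A rule that only matched `InsertIntoHiveTable` at the root would return this multi-insert plan unchanged, so neither target table would get its small files merged.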
This PR also fixes `canInsertRepartitionByExpression`. Previously it did not look through `SubqueryAlias`, so an extra `Repartition`/`Rebalance` node could be inserted by mistake and corrupt the data distribution, e.g.
`FROM (SELECT * FROM VALUES(1) DISTRIBUTE BY col1) INSERT INTO tmp1 SELECT * INSERT INTO tmp2 SELECT *;`
```
Union
:- InsertIntoHiveTable
:  +- Project
:     +- SubqueryAlias
:        +- RepartitionByExpression
:           +- Project
:              +- LocalRelation
+- InsertIntoHiveTable
   +- Project
      +- SubqueryAlias
         +- RepartitionByExpression
            +- Project
               +- LocalRelation
```
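The check can be sketched with a toy plan model in Python. The `Plan` class, the `can_insert_repartition` function, and the two node-name sets are hypothetical simplifications for illustration, not the actual Kyuubi implementation. The assumption is that `Project` and `SubqueryAlias` do not change the data distribution, so the check looks through them until it reaches a node that decides the answer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Toy stand-in for a logical plan node (not Spark's LogicalPlan)."""
    name: str
    children: List["Plan"] = field(default_factory=list)


# Assumption: these nodes leave the data distribution unchanged.
PASS_THROUGH = {"Project", "SubqueryAlias"}
# Assumption: these nodes mean the user already chose a distribution.
DISTRIBUTING = {"RepartitionByExpression", "Repartition", "Rebalance"}


def can_insert_repartition(plan: Plan) -> bool:
    """Return False if a user-specified distribution is found below."""
    if plan.name in DISTRIBUTING:
        return False
    if plan.name in PASS_THROUGH:
        # Look through the wrapper and keep checking its child.
        return can_insert_repartition(plan.children[0])
    return True


# Query side of one insert branch from the plan above.
distributed = Plan("Project", [
    Plan("SubqueryAlias", [
        Plan("RepartitionByExpression", [
            Plan("Project", [Plan("LocalRelation")]),
        ]),
    ]),
])
plain = Plan("Project", [Plan("LocalRelation")])

blocked = can_insert_repartition(distributed)  # False: DISTRIBUTE BY is kept
allowed = can_insert_repartition(plain)        # True: safe to repartition
```

If `SubqueryAlias` were missing from the pass-through set, the check would stop at the alias and return `True` for the first plan, which matches the incorrect behavior described above.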
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before making a pull request
Closes #1974 from pan3793/ext.
56cd7734 [Cheng Pan] nit
e0155c27 [Cheng Pan] Support merge small files in multi table insertion
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>