### _Why are the changes needed?_
- Update doc to mark the spark plugin's config `spark.sql.optimizer.insertRepartitionNum` used for Spark 3.1 only
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4933 from bowenliang123/insert-num.
Closes#4933
5ed6e2867 [liangbowen] comment and style
280a6af03 [liangbowen] spark.sql.optimizer.insertRepartitionNum only available for Spark 3.1.x
7f01cf3b6 [liangbowen] spark.sql.optimizer.insertRepartitionNum only available for Spark 3.1.x
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
### _Why are the changes needed?_
Add MaxFileSizeStrategy to limit max scan file size.
close#4641
### _How was this patch tested?_
- [X] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4642 from wForget/KYUUBI-4641.
Closes#4641
14a680f8e [wforget] comment
d2a393d97 [wforget] comment
b1ef4c52c [wforget] fix
d9e94bd8e [wforget] fix style
8a9121131 [wforget] use optional value
094eb61e3 [wforget] combine
89e2cb4d0 [wforget] [KYUUBI-4641] Add MaxFileSizeStrategy to limit max scan file size
Authored-by: wforget <643348094@qq.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
### _Why are the changes needed?_
Update outdated docs
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4727 from pan3793/lineage-doc.
Closes#4727
b6843b282 [Cheng Pan] [DOC] kyuubi-spark-lineage has no transitive deps
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: odone <odone.zhang@gmail.com>
### _Why are the changes needed?_
fix the wrong version
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4683 from ulysses-you/followup.
Closes#4683
8e5d46fda [ulysses-you] update version
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
### _Why are the changes needed?_
This pr change two things:
1. add a config to kill executors if the plan contains table caches. It's not always safe to kill executors if the cache is referenced by two write-like plan.
2. force adjustTargetNumExecutors when killing executors. YarnAllocator` might re-request original target executors if DRA has not updated target executors yet. Note, DRA would re-adjust executors if there are more tasks to be executed, so we are safe. It's better to adjuest target num executor once we kill executors.
### _How was this patch tested?_
These issues are found during my POC
Closes#4678 from ulysses-you/skip-cache.
Closes#4678
b12620954 [ulysses-you] Improve kill executors
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
### _Why are the changes needed?_
Add a new rule `InjectCustomResourceProfile` to support custom resource profile for final write stage.
It now supports executor configs:
```
executor core
executor memory
executor memory overhead
executor off heap memory
```
### _How was this patch tested?_
add test and manully test
<img width="778" alt="image" src="https://user-images.githubusercontent.com/12025282/226606147-82a29b8c-1a31-4842-97a7-fe702d80e190.png">
Closes#4615 from ulysses-you/resource-profile.
Closes#4615
852b207cd [ulysses-you] Support stage level schedule for final write stag
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
### _Why are the changes needed?_
This pr adds a new rule `FinalStageResourceManager` to eagerly kill redundant executors
We first get the final stage partition which is the actually required cores, then kill the redundant executors. The priority of kill executors follow:
1. kill executor who is younger than other (The older the JIT works better)
2. kill executor who produces less shuffle data first
The reason why add this feature is that, if the previous stage contains lots executors but final stage has less, then the tasks of final stage would be scheduled randomly in all exists executors which may cause resource waste. e.g., each executor only run 1 or 2 tasks but holds 4 or 5 cores.
### _How was this patch tested?_
test manually
- test for the kill executor
<img width="755" alt="image" src="https://user-images.githubusercontent.com/12025282/227203809-9fe0731c-f97f-40d2-ac7f-b892a2a35289.png">
Closes#4592 from ulysses-you/eagerly-kill-executors.
Closes#4592
f35208bfd [ulysses-you] nit
ec627ee4f [ulysses-you] nit
28d4230f8 [ulysses-you] address comments
f2492cec6 [ulysses-you] nit
f44e48451 [ulysses-you] Support eagerly kill redundant executors
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
### _Why are the changes needed?_
Refactor lineage plugin to add LineageDispatcher.
close#3929
### _How was this patch tested?_
- [X] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3919 from wForget/dev-lineage-dispatcher.
Closes#3929
5df2aa2f [wforget] add doc
98683ebc [wforget] fix
7b97b2e0 [wForget] rebase
4b046868 [wForget] separate LineageDispatcherType class file
e14cf838 [wForget] Refactor lineage plugin to add LineageDispatcher
Lead-authored-by: wForget <643348094@qq.com>
Co-authored-by: wforget <643348094@qq.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
close#4330
### _Why are the changes needed?_
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4503 from iodone/kyuubi-4330.
Closes#4330
d2c48e7a [odone] Instead of `optimizedPlan` with `analyzedPlan`
12614d19 [odone] add skip permenent view support
Authored-by: odone <odone.zhang@gmail.com>
Signed-off-by: ulyssesyou <ulyssesyou@apache.org>
### _Why are the changes needed?_
- Prefer `https://` URLs in docs, and all changed URLs are validated.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4235 from bowenliang123/https-link.
Closes#4235
f114dde2 [liangbowen] update AllKyuubiConfiguration
ad8aaedf [liangbowen] style
e973be5a [liangbowen] update
2370f4bf [liangbowen] prefer https URLs in docs
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
### _Why are the changes needed?_
- fix word spelling typos in docs
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4226 from bowenliang123/doc-word-typo.
Closes#4226
393de90d [liangbowen] update
365cdc4b [liangbowen] fix word typos in docs
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
### _Why are the changes needed?_
- to consolidate styles in markdown files from manual written or auto-generated
- apply markdown formatting rules with flexmark from [spotless-maven-plugin](https://github.com/diffplug/spotless/tree/main/plugin-maven#markdown) to *.md files in `/docs`
- use `flexmark` to format markdown generation in `TestUtils` of common module used by `AllKyuubiConfiguration` and `KyuubiDefinedFunctionSuite`, as the same way in `FlexmarkFormatterFunc ` of `spotless-maven-plugin` using with `COMMONMARK` as `FORMATTER_EMULATION_PROFILE` (https://github.com/diffplug/spotless/blob/maven/2.30.0/lib/src/flexmark/java/com/diffplug/spotless/glue/markdown/FlexmarkFormatterFunc.java)
- using `flexmark` of` 0.62.2`, as the last version requiring Java 8+ (checked from pom file and bytecode version)
```
<markdown>
<includes>
<include>docs/**/*.md</include>
</includes>
<flexmark></flexmark>
</markdown>
```
- Changes applied to markdown doc files,
- no style change or breakings in built docs by `make html`
- removal all the first blank in licences and comments to conform markdown style rules
- tables regenerated by flexmark following as in [GitHub Flavored Markdown](https://help.github.com/articles/organizing-information-with-tables/) (https://github.com/vsch/flexmark-java/wiki/Extensions#tables)
### _How was this patch tested?_
- [x] regenerate docs using `make html` successfully and check all the markdown pages available
- [x] regenerate `settings.md` and `functions.md` by `AllKyuubiConfiguration` and `KyuubiDefinedFunctionSuite`, and pass the checks by both themselves and spotless check via `dev/reformat`
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4200 from bowenliang123/markdown-formatting.
Closes#4200
1eeafce4 [liangbowen] revert minor changes in AllKyuubiConfiguration
4f892857 [liangbowen] use flexmark in markdown doc generation
8c978abd [liangbowen] changes on markdown files
a9190556 [liangbowen] apply markdown formatting rules with `spotless-maven-plugin` to markdown files with in `/docs`
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: liangbowen <liangbowen@gf.com.cn>
### _Why are the changes needed?_
As Kyuubi graduated as top level project, the setting page will be more often requested and should be increasingly reliable and readable with less grammar and spelling mistakes.
This PR is to
- correct mistakes in grammar, spelling, abbreviation and terminology
- with no config name or essential meanings changed
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4161 from bowenliang123/conf-grammar.
Closes#4161
038edfbea [liangbowen] nit
1ec073a4b [liangbowen] to JSON
4f5259a32 [liangbowen] to Prometheus
523855008 [liangbowen] to K8s
fc7a3a81e [liangbowen] to AUTO-GENERATED
da64f54fa [liangbowen] update
d54f9a528 [liangbowen] fix `comma separated` to `comma-separated`
f1d7cc1f1 [liangbowen] update
d84208844 [liangbowen] update
1b75f011c [liangbowen] correction of grammar and spelling mistakes
Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Kent Yao <yao@apache.org>
### _Why are the changes needed?_
fix#4070 ,all commands in alphabetical order
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4072 from jiaoqingbo/kyuubi4070.
Closes#4070
abb62aeb [jiaoqingbo] [KYUUBI #4070] Add missing spark commands to lineage.md
Authored-by: jiaoqingbo <1178404354@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Detect and inject a tag if plan is for writing, then skip doing final stage isolation at query preparation phase.
To make final stage config more flexible with complex Spark application.
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3988 from ulysses-you/final-stage.
Closes#3988
d0f2b622 [ulysses-you] fix
e5351fd5 [ulysses-you] nit
39082b20 [ulysses-you] Final stage config isolation support write only
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
### _Why are the changes needed?_
Update outdated docs.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#4024 from QianyongY/features/update-outdated-docs.
Closes#4024
0ad7173f [yongqian] [DOCS] Update the rules documentation for Kyuubi Spark SQL extension
Authored-by: yongqian <yongqian@trip.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
### _Why are the changes needed?_
add two conditions to decide if we should add shuffle.
1. make sure AQE is enabled, otherwise it is no meaning to add a shuffle
2. try to reduce the performance regression if add a shuffle
for condition 2: we do not add shuffle if the original plan does not have shuffle
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3962 from ulysses-you/no-shuffle.
Closes#3962
a084cccc [ulysses-you] address comment
9d0aab1b [ulysses-you] address comment
09fc9b21 [ulysses-you] fix ut
06f249a2 [ulysses-you] Reduce the performance regression
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
### _Why are the changes needed?_
Improve the rebalance before writing rule.
The rebalance before writing rule adds a rebalance at the top of query for data writing command, however the default partitioning of rebalance uses RoundRobinPartitioning which would break the original partitioning of data. It may cause the the output data size bigger than before.
This pr supports infer the columns from join and aggregate for rebalance and sort to improve the compression ratio.
Note that, this improvement only works for static partition writing.
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3601 from ulysses-you/smart-order.
Closes#3601
c190dc1a [ulysses-you] docs
995969b5 [ulysses-you] view
ea23c417 [ulysses-you] Support infer columns for rebalance and sort
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
…and register to JdbcDialects
### _Why are the changes needed?_
close#3487 .
1. add kyuubi-extension-spark-client_2.12 module, and introduce KyuubiSparkClientExtension
2. implement HiveDialect and register to JdbcDialects
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3489 from bowenliang123/3487-hive-jdbc-dialect.
Closes#3487
3ed8be75 [Bowen Liang] nit
47be0ba6 [Bowen Liang] update docs for hive jdbc dialect
84623a35 [Bowen Liang] update pom in minor details
b7edc6cf [Bowen Liang] add ut
968bb722 [Bowen Liang] move to package org.apache.spark.sql.dialect
03eab323 [Bowen Liang] renamed to kyuubi-extension-spark-jdbc-dialect module and moved to extensions/spark
9a4eaf44 [Bowen Liang] add kyuubi-extension-spark-client_2.12 module, implement HiveDialect and register to JdbcDialects
Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
close#3312
### _Why are the changes needed?_
SQL supported like:
```sql
-- ScalarQuery
select (select a from table0) as aa, b as bb from table0
select (select count(*) from table0) as aa, b as bb from table0
-- Left Semi or Anti Join
select * from table0 where table0.a in (select a from table1)
select * from table0 where table0.a not in (select a from table1)
select * from table0 where exists (select * from table1 where table0.c = table1.c)
```
### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3384 from iodone/kyuubi-3312.
Closes#3312
e2af4e1c [odone] change lineage column __aggregate__ to __count__ if exist count(*)
d9c46c34 [odone] add aggregate expression lineage extracting
2fd63482 [odone] add subquery support
Authored-by: odone <odone.zhang@gmail.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
### _Why are the changes needed?_
Update outdated docs.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3210 from pan3793/spark-extension.
Closes#3210
5e5ebd35 [Cheng Pan] Mention Kyuubi Spark SQL extension supports Spark 3.3
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### _Why are the changes needed?_
Build the content for extension points documentation, pre-work for #3100
<img width="1767" alt="image" src="https://user-images.githubusercontent.com/8326978/179930987-1accbbb7-e804-4230-871f-6c4b1152f4a1.png">
1. the extensions are divided into 2: server side and engine side extensions. (Do we have client side extension support?)
2. the server side authentication page is cross-referenced by the security section, see 1 in the picture.
3. the engine side ones are grouped by different compute frameworks.
4. connector is one type of extension, so we cross-reference the connector pages directly, see 2 & 3 in the picture.
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [x] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request
Closes#3103 from yaooqinn/3101.
Closes#3101
a9ae3e32 [Kent Yao] [KYUUBI #3101] [Subtask][#3100] Build content for extension points documentation
3b7367e9 [Kent Yao] [KYUUBI #3101] [Subtask][#3100] Build content for extension points documentation
b5eda13e [Kent Yao] [KYUUBI #3101] [Subtask][#3100] Build content for extension points documentation
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>