kyuubi/dev
h cc3ca56ab2
[KYUUBI #1085] Add forcedMaxOutputRows rule for limitation to avoid huge output unexpectedly
Add MaxOutputRows rule for output rows limitation to avoid huge output unexpectedly

### _Why are the changes needed?_
This PR adds a rule that limits the number of output rows, so that a user's ad-hoc query does not unexpectedly return a huge result. An ad-hoc query is usually like looking for a needle in a haystack: the user picks a few computed results out of a huge amount of warehouse data. The rule is mainly used in the cases below:

- CASE 1:
```
SELECT [c1, c2, ...]
```
- CASE 2:
```
WITH CTE AS (...)
SELECT [c1, c2, ...] FROM Express(CTE) ...
```
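
To illustrate the idea behind the rule, here is a minimal sketch in Python. It is not the code added by this PR: `Plan` and `force_max_output_rows` are hypothetical names, the plan tree is a toy stand-in for a Spark logical plan, and the handling of a query that already has a limit (keep the smaller value) is an assumption. The cap would correspond to the value of `spark.sql.watchdog.forcedMaxOutputRows`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Toy stand-in for a logical plan node (hypothetical, not Spark's API)."""
    kind: str                      # e.g. "Project", "Aggregate", "Limit"
    limit: Optional[int] = None    # only meaningful when kind == "Limit"
    child: Optional["Plan"] = None


def force_max_output_rows(plan: Plan, max_rows: Optional[int]) -> Plan:
    """Cap the number of rows a query may return at max_rows.

    - No cap configured: return the plan unchanged.
    - Plan already ends in a Limit: keep the smaller of the two limits
      (assumed behavior for this sketch).
    - Otherwise: wrap the plan in a new Limit node.
    """
    if max_rows is None:
        return plan
    if plan.kind == "Limit" and plan.limit is not None:
        if plan.limit <= max_rows:
            return plan
        return Plan("Limit", limit=max_rows, child=plan.child)
    return Plan("Limit", limit=max_rows, child=plan)


# A CASE 1 style query with no limit of its own gets a Limit on top.
select = Plan("Project")
capped = force_max_output_rows(select, 1000)
```

In this sketch a query that already asks for fewer rows than the cap is left untouched, so only unexpectedly large outputs are affected.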

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [x] Add screenshots for manual tests if appropriate
<img width="1440" alt="Screenshot 2021-09-12 12.19.28 PM" src="https://user-images.githubusercontent.com/635169/132972078-c4821135-0520-420d-9ab8-24e124f6c6c9.png">

- [x] [Run test](https://kyuubi.readthedocs.io/en/latest/develop_tools/testing.html#running-tests) locally before making a pull request

Closes #1085 from i7xh/watchdogMaxOutputRows.

77aa6ff3 [h] Resolve issue: unify maxOutputRows and using unittest setupData
939d2556 [h] Resolve issue: well-format code
5b0383dd [h] Fix issue: Final Limit according to spark.sql.watchdog.forcedMaxOutputRows
7dcbb0a4 [h] Resolve issue: remove origin match from rule
f4dff4cf [h] Resolve issue: update forcedMaxOutputRows config doc
ae21c1ac [h] Resolved issue: Support Aggregate force limitation and Remove InsertIntoDataSourceDirCommand process
a9d3640b [h] Resolved code review issue
01c87fd2 [h] Add MaxOutputRows rule for output rows limitation to avoid huge output rows of query unexpectedly

Authored-by: h <h@zhihu.com>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>
2021-09-18 12:31:37 +08:00
kyuubi-codecov [BUILD] Bump 1.4.0-SNAPSHOT 2021-08-17 01:39:06 +08:00
kyuubi-extension-spark-3-1 [KYUUBI #1085] Add forcedMaxOutputRows rule for limitation to avoid huge output unexpectedly 2021-09-18 12:31:37 +08:00
kyuubi-tpcds [BUILD] Bump 1.4.0-SNAPSHOT 2021-08-17 01:39:06 +08:00
dependencyList [KYUUBI #1112] Upgrade scala to 2.12.15 2021-09-17 21:12:56 +08:00
merge_kyuubi_pr.py [KYUUBI #809] [INFRA] Support reopened PR in pr merge tool 2021-07-15 18:05:07 +08:00