kyuubi/docs
王龙 e1e7772a9f
[KYUUBI #5402] Introduce Spark JVM quake plugin
# 🔍 Description
## Issue References 🔗

This pull request fixes #5402

## Describe Your Solution 🔧

When memory management in the Spark engine gets out of control, we typically use jvmkill as a remedy: it kills the process and generates a heap dump for post-mortem analysis. However, even with jvmkill protection, we may still run into problems caused by the JVM running out of memory, such as repeated Full GC cycles that pause the application without performing any useful work. Because the JVM has not exhausted 100% of its resources in that state, jvmkill is not triggered.

Introducing JVMQuake provides more granular monitoring of GC behavior, which enables early detection of memory management issues and fast failure.
You can enable the JVMQuake plugin with the following configuration:
```
spark.plugins=org.apache.spark.kyuubi.jvm.quake.KyuubiJVMQuakePlugin
```
| Configuration | Default | Comment |
| ---- | ---- | ---- |
| spark.driver.jvmQuake.enabled | false | When true, enable JVMQuake on the driver |
| spark.executor.jvmQuake.enabled | false | When true, enable JVMQuake on the executors |
| spark.driver.jvmQuake.heapDump.enabled | false | When true, take a JVM heap dump on the driver when JVMQuake reaches the threshold |
| spark.executor.jvmQuake.heapDump.enabled | false | When true, take a JVM heap dump on the executor when JVMQuake reaches the threshold |
| spark.jvmQuake.dumpThreshold | 100 | Threshold, in seconds, at which memory is dumped |
| spark.jvmQuake.killThreshold | 200 | Threshold, in seconds, at which the process is killed |
| spark.jvmQuake.exitCode | 502 | Exit code used when the process is killed |
| spark.jvmQuake.heapDumpPath | /tmp/kyuubi_jvm_quake/apps | Path where heap dumps are written |
| spark.jvmQuake.checkInterval | 3 | Interval, in seconds, between JVMQuake checks |
| spark.jvmQuake.runTimeWeight | 1.0 | Weight applied to run time |
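
To give a feel for how the two thresholds and the run-time weight might interact, here is a minimal Python sketch. It assumes the plugin follows a "GC debt" (leaky bucket) model like the original jvmquake tool: time spent in GC adds to a counter, and time spent running application code, scaled by `runTimeWeight`, pays it down. The commit message does not describe the algorithm, so that model is an assumption, and the names `QuakeSettings`, `update_debt`, `decide`, and `simulate` are hypothetical. Only the default values come from the table above.

```python
from dataclasses import dataclass


@dataclass
class QuakeSettings:
    # Defaults copied from the configuration table; field names are illustrative.
    dump_threshold: float = 100.0   # spark.jvmQuake.dumpThreshold (seconds)
    kill_threshold: float = 200.0   # spark.jvmQuake.killThreshold (seconds)
    run_time_weight: float = 1.0    # spark.jvmQuake.runTimeWeight


def update_debt(debt, gc_seconds, run_seconds, settings):
    """Assumed model: GC time adds debt, weighted run time repays it, floor at 0."""
    return max(0.0, debt + gc_seconds - run_seconds * settings.run_time_weight)


def decide(debt, settings):
    """Map the current debt to an action: 'kill', 'dump', or 'ok'."""
    if debt > settings.kill_threshold:
        return "kill"
    if debt > settings.dump_threshold:
        return "dump"
    return "ok"


def simulate(samples, settings=None):
    """Replay (gc_seconds, run_seconds) pairs, one per check interval.

    Returns the action chosen after each interval and stops at the first 'kill'.
    """
    settings = settings or QuakeSettings()
    debt = 0.0
    actions = []
    for gc_seconds, run_seconds in samples:
        debt = update_debt(debt, gc_seconds, run_seconds, settings)
        action = decide(debt, settings)
        actions.append(action)
        if action == "kill":
            break
    return actions


if __name__ == "__main__":
    # A JVM that spends 2.9 s of every 3 s check interval in GC.
    thrashing = simulate([(2.9, 0.1)] * 100)
    print(thrashing.index("dump") + 1, "intervals until the dump threshold")
    print(len(thrashing), "intervals until the kill threshold")
```

Under this assumed model, a healthy JVM that spends most of each interval running keeps the debt at zero, while a JVM stuck in back-to-back Full GCs accumulates debt steadily and crosses the dump threshold well before the kill threshold.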

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6572 from yoock/features/kyuubi-jvm-quake.

Closes #5402

84361ce8f [王龙] add jvm quake

Authored-by: 王龙 <wanglong16@xiaomi.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-09-02 12:29:41 +08:00
..
_static/css Revert "[KYUUBI #5908] [DOCS] Remove workaround for malformed table" 2023-12-24 01:53:05 +08:00
appendix [KYUUBI #4655] [DOCS] Enrich docs for Kyuubi Hive JDBC driver 2023-04-03 18:51:27 +08:00
client [KYUUBI #6652] Support to list batches in descending order 2024-08-31 18:43:36 -07:00
configuration [KYUUBI #6587] Periodically expire temp files and operation logs on server to avoid memeory leak by Files.deleteOnExit 2024-08-28 17:13:27 +08:00
connector [KYUUBI #6558] Bump Iceberg 1.6.0 2024-07-25 22:37:21 +08:00
contributing [KYUUBI #6545] Deprecate and remove building support for Spark 3.2 2024-07-22 11:59:34 +08:00
deployment [KYUUBI #6628] [DOCS] Improve docs for GROUP Share Level 2024-08-21 14:34:15 +08:00
extensions [KYUUBI #5402] Introduce Spark JVM quake plugin 2024-09-02 12:29:41 +08:00
imgs [KYUUBI #5914] Update layer diagram on welcome page 2023-12-25 16:13:48 +08:00
monitor [KYUUBI #6239] Rename beeline to kyuubi-beeline 2024-04-03 18:35:38 +08:00
overview [KYUUBI #6596] Fix typos in architecture page 2024-08-08 12:12:30 +00:00
quick_start [KYUUBI #6557] Support Flink 1.20 2024-08-05 22:57:39 +08:00
security [KYUUBI #6512] Improve docs for KSHC 2024-07-01 10:51:12 +08:00
tools [KYUUBI #6242] Remove block cleaner docs 2024-04-03 13:39:34 +08:00
conf.py Revert "[KYUUBI #5908] [DOCS] Remove workaround for malformed table" 2023-12-24 01:53:05 +08:00
index.rst [KYUUBI #6068] Remove community section from user docs 2024-02-21 05:20:42 +00:00
make.bat [KYUUBI #4235] [DOCS] Prefer https:// URLs in docs 2023-02-03 14:01:11 +08:00
Makefile
requirements.txt [KYUUBI #5902] Bump doc build dependencies 2023-12-21 18:37:43 -08:00