celeborn/docs/developers
SteNicholas 73cf1562f7 [CELEBORN-1299] Introduce JVM profiling in Celeborn Worker using async-profiler
### What changes were proposed in this pull request?

Introduce JVM profiling (`JVMProfiler`) in the Celeborn Worker, using async-profiler to capture CPU and memory profiles.

### Why are the changes needed?

[async-profiler](https://github.com/async-profiler) is a sampling profiler for any JDK based on the HotSpot JVM that does not suffer from the safepoint bias problem. It has low overhead and doesn’t rely on JVMTI. It avoids safepoint bias by using the `AsyncGetCallTrace` API provided by the HotSpot JVM to profile Java code paths, and Linux’s perf_events to profile native code paths. It also uses HotSpot-specific APIs to collect stack traces and to track memory allocations.
The feature introduces a profiler plugin that adds no overhead unless enabled and accepts profiler arguments as a configuration parameter. It supports turning profiling on and off, and it includes the JAR and binaries needed for profiling.
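
The "no overhead unless enabled" behavior and the argument string passed through configuration could look roughly like the sketch below. It is only an illustration under stated assumptions: the class `ProfilerPluginSketch`, its methods, and the comma-separated `key=value` argument format are hypothetical and are not taken from this patch, and the sketch never calls async-profiler itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an opt-in profiler plugin; not the actual Celeborn JVMProfiler. */
final class ProfilerPluginSketch {
    private final boolean enabled;      // assumed to come from a worker configuration flag
    private final String profilerArgs;  // assumed to come from a worker configuration string
    private boolean running = false;

    ProfilerPluginSketch(boolean enabled, String profilerArgs) {
        this.enabled = enabled;
        this.profilerArgs = profilerArgs == null ? "" : profilerArgs;
    }

    /** Splits "k1=v1,k2=v2" into an ordered map; a bare flag without '=' maps to "". */
    static Map<String, String> parseArgs(String args) {
        Map<String, String> parsed = new LinkedHashMap<>();
        for (String token : args.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue;
            }
            int eq = t.indexOf('=');
            if (eq < 0) {
                parsed.put(t, "");
            } else {
                parsed.put(t.substring(0, eq), t.substring(eq + 1));
            }
        }
        return parsed;
    }

    /** Starts profiling only when enabled; a disabled plugin does no work at all. */
    boolean start() {
        if (!enabled || running) {
            return false;
        }
        // A real plugin would hand profilerArgs to the profiler at this point.
        running = true;
        return true;
    }

    /** Stops profiling; returns false if nothing was running. */
    boolean stop() {
        if (!running) {
            return false;
        }
        running = false;
        return true;
    }

    boolean isRunning() {
        return running;
    }

    Map<String, String> options() {
        return parseArgs(profilerArgs);
    }

    public static void main(String[] args) {
        ProfilerPluginSketch plugin = new ProfilerPluginSketch(true, "event=cpu,interval=10ms");
        System.out.println("started: " + plugin.start() + ", options: " + plugin.options());
        plugin.stop();
    }
}
```

Keeping the enabled check at the top of `start()` means a disabled worker never touches the profiler library, which is one simple way to get the zero-overhead default described above.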

Backport [[SPARK-46094] Support Executor JVM Profiling](https://github.com/apache/spark/pull/44021).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Worker cluster test.

Closes #2409 from SteNicholas/CELEBORN-1299.

Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
2024-03-25 14:05:50 +08:00

| File | Last commit | Date |
| --- | --- | --- |
| client.md | [CELEBORN-853][DOC] Document on LifecycleManager | 2023-07-31 17:36:42 +08:00 |
| configuration.md | [CELEBORN-1286] Introduce configuration.md to document dynamic config and config service | 2024-02-28 11:49:28 +08:00 |
| faulttolerant.md | [CELEBORN-860][DOC] Document on ShuffleClient | 2023-07-31 20:07:20 +08:00 |
| glutensupport.md | [CELEBORN-1341] Improve Celeborn document | 2024-03-20 15:02:05 +08:00 |
| integrate.md | [CELEBORN-1341][FOLLOWUP] Improve Celeborn document | 2024-03-22 16:34:25 +08:00 |
| jvmprofiler.md | [CELEBORN-1299] Introduce JVM profiling in Celeborn Worker using async-profiler | 2024-03-25 14:05:50 +08:00 |
| lifecyclemanager.md | [MINOR] Fix some typos | 2023-10-12 20:34:07 +08:00 |
| master.md | [CELEBORN-853][DOC] Document on LifecycleManager | 2023-07-31 17:36:42 +08:00 |
| overview.md | [CELEBORN-1341][FOLLOWUP] Improve Celeborn document | 2024-03-22 16:34:25 +08:00 |
| sbt.md | [CELEBORN-1341] Improve Celeborn document | 2024-03-20 15:02:05 +08:00 |
| shuffleclient.md | [CELEBORN-1341][FOLLOWUP] Improve Celeborn document | 2024-03-22 16:34:25 +08:00 |
| slotsallocation.md | [MINOR] Fix style and Gluten link in Developers Doc | 2024-03-11 12:07:01 +08:00 |
| storage.md | [CELEBORN-1341] Improve Celeborn document | 2024-03-20 15:02:05 +08:00 |
| trafficcontrol.md | [MINOR] Fix incorrect default resume ratio in trafficcontrol doc | 2023-09-21 11:18:48 +08:00 |
| worker.md | [CELEBORN-877][DOC] Document on SBT | 2023-08-11 12:17:55 +08:00 |
| workerexclusion.md | [CELEBORN-869][DOC] Document on Integrating Celeborn | 2023-08-02 17:22:41 +08:00 |