### What changes were proposed in this pull request?
Add documentation for the Worker Tags feature.
### Why are the changes needed?
https://cwiki.apache.org/confluence/display/CELEBORN/CIP-11+Supporting+Tags+in+Celeborn
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
NA
Closes #2981 from s0nskar/tags_docu.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Deprecate the identity configs that are tied to quota:
```
"celeborn.quota.identity.provider"
"celeborn.quota.identity.user-specific.tenant"
"celeborn.quota.identity.user-specific.userName"
```
in favour of identity configs that are independent of quota:
```
"celeborn.identity.provider"
"celeborn.identity.user-specific.tenant"
"celeborn.identity.user-specific.userName"
```
### Why are the changes needed?
The current identity configs are tied to quota, but identity should be independent of quota because other components, such as tags, also use it. In the future, new components can make use of identity as well.
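A minimal sketch of how a lookup might prefer a new key and fall back to its deprecated counterpart. Only the key names come from this PR; the helper `resolve_identity_conf` and the dictionary-based config are hypothetical and do not describe Celeborn's actual implementation.

```python
# Hypothetical illustration: the key names come from this PR, the lookup logic does not.
DEPRECATED_KEYS = {
    "celeborn.identity.provider": "celeborn.quota.identity.provider",
    "celeborn.identity.user-specific.tenant": "celeborn.quota.identity.user-specific.tenant",
    "celeborn.identity.user-specific.userName": "celeborn.quota.identity.user-specific.userName",
}


def resolve_identity_conf(conf, key, default=None):
    """Return the value for `key`, falling back to its deprecated quota-scoped name."""
    if key in conf:
        return conf[key]
    old_key = DEPRECATED_KEYS.get(key)
    if old_key is not None and old_key in conf:
        return conf[old_key]
    return default


# A config that still uses the deprecated key resolves through the new name.
legacy_conf = {"celeborn.quota.identity.user-specific.tenant": "tenant-a"}
tenant = resolve_identity_conf(legacy_conf, "celeborn.identity.user-specific.tenant")
```

With this shape, existing deployments keep working while new deployments set only the quota-independent keys.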
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
Existing UTs
Closes #2952 from s0nskar/fix_identity.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
As title.
### Why are the changes needed?
The doc fails to mention S3 as one of the storage layers.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Closes #2963 from zhaohehuhu/dev-1128.
Authored-by: zhaohehuhu <luoyedeyi@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Add a migration doc for the RESTful API change in Celeborn 0.5.0.
### Why are the changes needed?
There was a typo in https://github.com/apache/celeborn/pull/2371: the `/shuffles` API was renamed to `/shuffle`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
GA.
Closes #2960 from turboFei/shuffles_api.
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
- Add support for enabling/disabling the worker tags feature through a master config flag.
- Fix a bug: after #2936, admins can also define the tagsExpr for users. If a user passes an empty tagsExpr, the current code ignores the admin-defined tagsExpr and allows the job to use all workers.
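The intended behaviour can be sketched roughly as follows. This is an assumption-laden illustration, not Celeborn's code: the function name `effective_tags_expr`, its parameters, and the choice to let a non-empty user expression take precedence are all hypothetical.

```python
def effective_tags_expr(tags_enabled, admin_expr, user_expr):
    """Pick the tags expression used to filter workers (assumed semantics)."""
    if not tags_enabled:
        # Feature switched off on the master: no tag-based filtering at all.
        return ""
    if user_expr:
        # Assumption: the user is permitted to supply a custom expression.
        return user_expr
    # An empty user expression must not bypass the admin-defined one.
    return admin_expr


resolved = effective_tags_expr(True, "env=production", "")
```

The buggy behaviour described above corresponds to returning the empty user expression in the last case, which lets the job use every worker.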
### Why are the changes needed?
https://cwiki.apache.org/confluence/display/CELEBORN/CIP-11+Supporting+Tags+in+Celeborn
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
Existing UTs
Closes #2953 from s0nskar/tags-enabled.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Remove the code for app top disk usage on both the master and the worker side.
Prefer the Prometheus expression below to find the applications with the highest disk usage.
```
topk(50, sum by (applicationId) (metrics_diskBytesWritten_Value{role="worker", applicationId!=""}))
```
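For readers less familiar with PromQL, the expression above sums the metric per `applicationId` across workers and keeps the largest totals. A small sketch that mimics this on in-memory samples; the function name and the sample data are made up for illustration.

```python
from collections import defaultdict


def top_apps_by_disk_bytes(samples, k):
    """Sum per applicationId over worker samples and keep the k largest totals."""
    totals = defaultdict(int)
    for labels, value in samples:
        app_id = labels.get("applicationId", "")
        # Mirrors the label matchers role="worker" and applicationId!="".
        if labels.get("role") == "worker" and app_id != "":
            totals[app_id] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up samples: (labels, value) pairs as they might be scraped from workers.
samples = [
    ({"role": "worker", "applicationId": "app-1", "instance": "w1"}, 100),
    ({"role": "worker", "applicationId": "app-1", "instance": "w2"}, 50),
    ({"role": "worker", "applicationId": "app-2", "instance": "w1"}, 120),
    ({"role": "worker", "applicationId": "", "instance": "w1"}, 999),
]
top = top_apps_by_disk_bytes(samples, 1)
```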
### Why are the changes needed?
To address comments: https://github.com/apache/celeborn/pull/2947#issuecomment-2499564978
> Due to the application dimension resource consumption, this feature should be included in the deprecated features. Maybe you can remove the codes for application top disk usage.
### Does this PR introduce _any_ user-facing change?
Yes, the app top disk usage API is removed.
### How was this patch tested?
GA.
Closes #2949 from turboFei/remove_app_top_usage.
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Flink supports fallback to the vanilla Flink built-in shuffle implementation.
### Why are the changes needed?
When quota is insufficient or workers are unavailable, `RemoteShuffleMaster` currently cannot fall back to `NettyShuffleMaster`, and `RemoteShuffleEnvironment` cannot fall back to `NettyShuffleEnvironment`. Flink should support falling back to the vanilla Flink built-in shuffle implementation in both cases.

### Does this PR introduce _any_ user-facing change?
- Introduce the `ShuffleFallbackPolicy` interface to determine whether to fall back to the vanilla Flink built-in shuffle implementation.
```
/**
 * The shuffle fallback policy determines whether fallback to vanilla Flink built-in shuffle
 * implementation.
 */
public interface ShuffleFallbackPolicy {
  /**
   * Returns whether fallback to vanilla flink built-in shuffle implementation.
   *
   * @param shuffleContext The job shuffle context of Flink.
   * @param celebornConf The configuration of Celeborn.
   * @param lifecycleManager The {@link LifecycleManager} of Celeborn.
   * @return Whether fallback to vanilla flink built-in shuffle implementation.
   */
  boolean needFallback(
      JobShuffleContext shuffleContext,
      CelebornConf celebornConf,
      LifecycleManager lifecycleManager);
}
```
- Introduce `celeborn.client.flink.shuffle.fallback.policy` config to support shuffle fallback policy configuration.
### How was this patch tested?
- `RemoteShuffleMasterSuiteJ#testRegisterJobWithForceFallbackPolicy`
- `WordCountTestBase#celeborn flink integration test with fallback - word count`
Closes #2932 from SteNicholas/CELEBORN-1700.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Support predefined tags expressions for tenants and users via dynamic config. With this, admins can configure tags for users/tenants and give special users permission to provide a custom tags expression.
### Why are the changes needed?
https://cwiki.apache.org/confluence/display/CELEBORN/CIP-11+Supporting+Tags+in+Celeborn
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
UTs
Closes #2936 from s0nskar/admin_tags.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
as title
### Why are the changes needed?
AWS S3 doesn't support append, so Celeborn had to copy the historical data from S3 to the worker and write it to S3 again, which heavily amplifies writes. This PR implements a better solution based on multipart upload (MPU) to avoid copy-and-write.
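To make the write amplification concrete, here is a rough arithmetic sketch. It counts uploaded bytes only and assumes that every flush under the old scheme re-uploads the whole object; the function names and flush sizes are illustrative and not taken from Celeborn.

```python
def bytes_written_copy_and_rewrite(flush_sizes):
    """Each flush re-uploads everything written so far plus the new data."""
    total, object_size = 0, 0
    for size in flush_sizes:
        object_size += size
        total += object_size
    return total


def bytes_written_multipart(flush_sizes):
    """Each flush is uploaded exactly once as its own part."""
    return sum(flush_sizes)


flushes = [64] * 4  # four flushes of 64 units each (illustrative numbers)
rewrite_total = bytes_written_copy_and_rewrite(flushes)
mpu_total = bytes_written_multipart(flushes)
```

Under these assumptions the copy-and-rewrite total grows quadratically with the number of flushes, while the multipart total grows linearly.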
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?

I conducted an experiment with a 1GB input dataset to compare the performance of Celeborn using only S3 storage versus using SSD storage. The results showed that Celeborn with SSD storage was approximately three times faster than with only S3 storage.
<img width="1728" alt="Screenshot 2024-11-16 at 13 02 10" src="https://github.com/user-attachments/assets/8f879c47-c01a-4004-9eae-1c266c1f3ef2">
The screenshot above is from the second test I ran, with 5000 mappers and reducers.
Closes #2830 from zhaohehuhu/dev-1021.
Lead-authored-by: zhaohehuhu <luoyedeyi@163.com>
Co-authored-by: He Zhao <luoyedeyi459@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
1. Introduce `celeborn.client.spark.stageRerun.enabled`, with `celeborn.client.spark.fetch.throwsFetchFailure` as its alternative, to enable Spark stage rerun.
2. Change the default value of `celeborn.client.spark.fetch.throwsFetchFailure` from `false` to `true`, which enables Spark stage rerun by default.
### Why are the changes needed?
Users cannot easily tell that `celeborn.client.spark.fetch.throwsFetchFailure` controls whether stage rerun is enabled; the name only says that the client throws `FetchFailedException` instead of `CelebornIOException`. It's recommended to introduce `celeborn.client.spark.stageRerun.enabled`, with `celeborn.client.spark.fetch.throwsFetchFailure` as its alternative, to enable Spark stage rerun.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
CI.
Closes #2920 from SteNicholas/CELEBORN-1719.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Implement queue time and processing time metrics for the RPC framework.
### Why are the changes needed?
To identify RPC processing bottlenecks.
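The distinction between the two metrics can be sketched as follows: queue time is the delay between enqueueing a message and starting to handle it, and processing time is how long the handler runs. The `RpcTimings` class and `handle` function are hypothetical and only illustrate the measurement, not Celeborn's RPC framework.

```python
import time
from dataclasses import dataclass, field


@dataclass
class RpcTimings:
    queue_times: list = field(default_factory=list)
    processing_times: list = field(default_factory=list)


def handle(message, enqueued_at, handler, timings, clock=time.monotonic):
    """Record how long `message` waited in the queue and how long `handler` took."""
    dequeued_at = clock()
    timings.queue_times.append(dequeued_at - enqueued_at)
    result = handler(message)
    timings.processing_times.append(clock() - dequeued_at)
    return result


# Deterministic demo with a fake clock: dequeued at t=5, handler finishes at t=8.
ticks = iter([5.0, 8.0])
timings = RpcTimings()
reply = handle("ping", 2.0, str.upper, timings, clock=lambda: next(ticks))
```

A large queue time with a small processing time points at a saturated dispatcher, whereas the reverse points at a slow handler.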
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
local
Closes #2784 from ErikFang/main-rpc-metrics.
Lead-authored-by: Erik.fang <fmerik@gmail.com>
Co-authored-by: 仲甫 <fangming@antgroup.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
1. Cache the available workers.
2. Only count the device free capacity of the available workers.
3. Place `metrics_AvailableWorkerCount_Value` in the overall part and `metrics_WorkerCount_Value` in the `Master` part.
### Why are the changes needed?
Cache the available workers to avoid frequently looping over all workers to compute them.
This also gives an accurate device capacity overview that does not include excluded workers, decommissioning workers, etc.
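A minimal sketch of the idea, assuming a registry that invalidates its cached list whenever a worker's state changes. The `Worker` and `WorkerRegistry` classes and their fields are hypothetical and much simpler than the master's real bookkeeping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    host: str
    free_bytes: int
    excluded: bool = False
    decommissioning: bool = False


class WorkerRegistry:
    """Keeps a cached view of available workers, rebuilt only after a state change."""

    def __init__(self):
        self._workers = {}
        self._available = None  # None means the cache is stale

    def update(self, worker):
        self._workers[worker.host] = worker
        self._available = None  # invalidate instead of recomputing on every read

    def available_workers(self):
        if self._available is None:
            self._available = [
                w for w in self._workers.values()
                if not (w.excluded or w.decommissioning)
            ]
        return self._available

    def device_free_capacity(self):
        # Only available workers contribute to the capacity overview.
        return sum(w.free_bytes for w in self.available_workers())


registry = WorkerRegistry()
registry.update(Worker("w1", 100))
registry.update(Worker("w2", 200, excluded=True))
registry.update(Worker("w3", 50, decommissioning=True))
free = registry.device_free_capacity()
```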
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
UT.
<img width="1705" alt="image" src="https://github.com/user-attachments/assets/bee17b4e-785d-4112-8410-dbb684270ec0">
Closes #2827 from turboFei/device_free.
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
When looking into the source code, I found that some space delimiters are missing in the ConfigEntry docs.
In this PR, I go through all the ConfigEntry docs and add the missing spaces to the descriptions.
### Why are the changes needed?
Fix typo.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
GA.
Closes #2917 from turboFei/nit_docs.
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
1. Document Flink 1.16 support in `README.md` and `deploy.md`.
2. Update the description of `celeborn.client.shuffle.compression.codec` to change the supported Flink version for ZSTD.
### Why are the changes needed?
#2619 added support for Flink 1.16, so the documents should be updated accordingly. Meanwhile, ZSTD is supported for the Flink shuffle client since Flink 1.16.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2904 from SteNicholas/CELEBORN-1504.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
1. `ShuffleFallbackPolicy` supports the `ShuffleFallbackCount` metric, which provides the shuffle fallback count of each fallback policy.
2. Introduce the `ShuffleTotalCount` metric to record the total count of shuffles.
3. Fix the issue that Spark 2 does not increment the shuffle count via `LifecycleManager`.
### Why are the changes needed?
The implementations of `ShuffleFallbackPolicy` do not support the `ShuffleFallbackCount` metric at present. Meanwhile, Bilibili's production practice needs the `ShuffleFallbackCount` of each `ShuffleFallbackPolicy`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Cluster test.
Closes #2891 from SteNicholas/CELEBORN-1685.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Fix docs typo.
### Why are the changes needed?
Fix docs typo.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
GA.
Closes #2890 from turboFei/nit.
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
As title, introduce metrics_ShuffleFallbackCount_Value.
### Why are the changes needed?
To provide insight into how many shuffles fall back to the Spark built-in shuffle service. This helps us deprecate the ESS progressively.
Currently, we plan to set `celeborn.client.spark.shuffle.fallback.numPartitionsThreshold` so that shuffles with too many partitions, for example 50k, fall back.
In the future, we plan to limit the maximum acceptable number of shuffle partitions so that bad jobs are rejected and do not impact the health of the Celeborn master.
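A small sketch of a partition-count-based fallback decision with a fallback counter. The function `choose_shuffle`, the counter keys, and the strict `>` comparison are assumptions made for illustration; only the threshold idea comes from the description above.

```python
from collections import Counter


def choose_shuffle(num_partitions, threshold, counters):
    """Count every shuffle and fall back when the partition count exceeds the threshold."""
    counters["total"] += 1
    if num_partitions > threshold:  # assumption: strictly greater than the threshold
        counters["fallback"] += 1
        return "spark-builtin"
    return "celeborn"


counters = Counter()
decisions = [choose_shuffle(n, 50_000, counters) for n in (200, 60_000, 50_000)]
```

Tracking the fallback count next to the total makes it possible to watch the fallback ratio shrink while the ESS is being phased out.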
### Does this PR introduce _any_ user-facing change?
Yes, new metrics.
### How was this patch tested?
UT.
<img width="1188" alt="image" src="https://github.com/user-attachments/assets/8193c12c-5dc9-4783-b64b-6a8449a1bea4">
Closes #2866 from turboFei/record_fallback.
Lead-authored-by: Wang, Fei <fwang12@ebay.com>
Co-authored-by: Fei Wang <cn.feiwang@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
If we use the Celeborn shuffle service, we can't submit both batch and streaming jobs to the same Flink session cluster. This should be highlighted in the doc.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
No need.
Closes #2879 from reswqa/session-doc.
Authored-by: Weijie Guo <reswqa@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Follow-up of https://github.com/apache/celeborn/pull/2835.
Only use dynamic resources when there are not enough candidates.
Also change the way of getting availableWorkers from the heartbeat to the requestSlots RPC, to reduce the burden on the heartbeat.
### Why are the changes needed?
No
### Does this PR introduce _any_ user-facing change?
Add another configuration.
### How was this patch tested?
UT
Closes #2852 from zaynt4606/clb1636-flu2.
Authored-by: szt <zaynt4606@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Improve Celeborn CLI user guide including:
- Add license of Celeborn CLI user guide.
- Optimize the introduction of setup and usage for Celeborn CLI.
- Optimize the navigation of Celeborn CLI to combine Celeborn Ratis Shell.
### Why are the changes needed?
There is no license in the Celeborn CLI user guide. Meanwhile, the user guide can be improved in its navigation and in the introduction of setup and usage.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2875 from SteNicholas/CELEBORN-1678.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
When users deploy using the release binary as outlined in the documentation, the instructions for copying the client JAR can be unclear.
### Why are the changes needed?
No
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?

Closes #2877 from zaynt4606/md.
Authored-by: szt <zaynt4606@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Add Flink hybrid shuffle doc
### Why are the changes needed?
We need the doc for the new hybrid shuffle mode.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
No need.
Closes #2867 from reswqa/add-hs-doc.
Authored-by: Weijie Guo <reswqa@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
As title
### Why are the changes needed?
Currently, only Flink retries establishing a client when a connection problem occurs. This would be beneficial for all other engines to implement as well.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
UT
Closes #2855 from RexXiong/CELEBORN-1673.
Lead-authored-by: Shuang <lvshuang.xjs@alibaba-inc.com>
Co-authored-by: lvshuang.xjs <lvshuang.xjs@taobao.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Add a user guide for the CLI to the README.
### Why are the changes needed?
Better user experience when using the CLI.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
N/A
Closes #2862 from akpatnam25/CELEBORN-1678.
Authored-by: Aravind Patnam <akpatnam25@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Support interrupting shuffles on the client side.
I will develop the following functions in order:
1. Client supports interrupting shuffles.
2. Master supports calculating app-level shuffle usage.
### Why are the changes needed?
The current storage quota logic can only limit new shuffles; it cannot limit writes to existing shuffles. In our production environment we have the following scenario: the cluster is small, but a single shuffle of a user's app is large and occupies disk resources, so we want to interrupt such shuffles.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
This part cannot be tested independently. Additional tests will be added after the second part is completed.
Closes #2801 from leixm/CELEBORN-1577-1.
Authored-by: Xianming Lei <31424839+leixm@users.noreply.github.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Currently, the ChangePartitionManager retrieves workers from the LifeCycleManager's workerSnapshot. However, during the revival process in reallocateChangePartitionRequestSlotsFromCandidates, it does not account for newly added available workers resulting from elastic contraction and expansion. This PR addresses this issue by updating the candidate workers in the ChangePartitionManager to use the available workers reported in the heartbeat from the master.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
UT
Closes #2835 from zaynt4606/clbdev.
Authored-by: szt <zaynt4606@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Support a ratio threshold of unhealthy disks for excluding a worker, configured with `celeborn.master.excludeWorker.unhealthyDiskRatioThreshold`.
### Why are the changes needed?
We often encounter issues such as disk input/output errors in production. When a disk goes bad, the worker is decommissioned so that the machine's disk can be repaired, since a fault is generally repaired soon after it is discovered. It is possible that the machine will not trigger all disk failures if it is out of warranty. A disk can be replaced directly when it is under warranty. If the disk fails after it is out of warranty, you need to purchase the replacement disk yourself. At the same time, submitting the disk for repair at one time will affect the failure rate judgment of the system group and scenario. In addition, bad disks bring management problems, such as continuous alarms, and the handling of disk failures is relatively customized.
Therefore, it's recommended to configure a ratio threshold of unhealthy disks for excluding a worker, so that the worker can be excluded without waiting for all of its disks to become unhealthy.
### Does this PR introduce _any_ user-facing change?
Introduce `celeborn.master.excludeWorker.unhealthyDiskRatioThreshold` to configure the maximum ratio of unhealthy disks for excluding a worker.
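A sketch of how such a ratio check could look. The function `should_exclude_worker`, its input shape, and the `>=` comparison are assumptions for illustration; the source does not state the exact comparison the master uses.

```python
def should_exclude_worker(disk_healthy_flags, unhealthy_ratio_threshold):
    """Exclude the worker once the share of unhealthy disks reaches the threshold."""
    if not disk_healthy_flags:
        return False  # assumption: a worker reporting no disks is not excluded here
    unhealthy = sum(1 for healthy in disk_healthy_flags if not healthy)
    # Assumption: the comparison is inclusive.
    return unhealthy / len(disk_healthy_flags) >= unhealthy_ratio_threshold


disks = [True, False, False, True]  # two of four disks are unhealthy
excluded_at_half = should_exclude_worker(disks, 0.5)
excluded_at_all = should_exclude_worker(disks, 1.0)
```

With a threshold of `1.0` this sketch reduces to waiting for every disk to become unhealthy, which is the behaviour the threshold is meant to relax.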
### How was this patch tested?
Cluster test.
Closes #2812 from SteNicholas/CELEBORN-1651.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Support revising lost shuffle IDs in long-running jobs such as Flink batch jobs.
### Why are the changes needed?
1. To support revising lost shuffles.
2. To add an HTTP endpoint to revise lost shuffles manually.
### Does this PR introduce _any_ user-facing change?
NO.
### How was this patch tested?
Cluster tests.
Closes #2746 from FMX/b1600.
Lead-authored-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
Co-authored-by: Ethan Feng <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
CongestionController supports dynamic config.
### Why are the changes needed?
Currently, Celeborn only supports quota management based on disk file bytes/count, and this quota management cannot cope with sudden increases in traffic, which can damage the cluster.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
UT.
Closes #2817 from leixm/CELEBORN-1487-2.
Authored-by: Xianming Lei <31424839+leixm@users.noreply.github.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
`NettyMemoryMetrics` supports `numHeapArenas`, `numDirectArenas`, `tinyCacheSize`, `smallCacheSize`, `normalCacheSize`, `numThreadLocalCaches` and `chunkSize` from `PooledByteBufAllocatorMetric`. Meanwhile, remove `server_` prefix from metric name of netty memory metric in `monitoring.md`.
### Why are the changes needed?
`PooledByteBufAllocatorMetric` provides the following API to support netty memory metrics:
```
public int numHeapArenas() {
  return this.allocator.numHeapArenas();
}
public int numDirectArenas() {
  return this.allocator.numDirectArenas();
}
public List<PoolArenaMetric> heapArenas() {
  return this.allocator.heapArenas();
}
public List<PoolArenaMetric> directArenas() {
  return this.allocator.directArenas();
}
public int numThreadLocalCaches() {
  return this.allocator.numThreadLocalCaches();
}
public int tinyCacheSize() {
  return this.allocator.tinyCacheSize();
}
public int smallCacheSize() {
  return this.allocator.smallCacheSize();
}
public int normalCacheSize() {
  return this.allocator.normalCacheSize();
}
public int chunkSize() {
  return this.allocator.chunkSize();
}
public long usedHeapMemory() {
  return this.allocator.usedHeapMemory();
}
public long usedDirectMemory() {
  return this.allocator.usedDirectMemory();
}
```
`NettyMemoryMetrics` only supports `usedHeapMemory` and `usedDirectMemory`, but it could also support `numHeapArenas`, `numDirectArenas`, `tinyCacheSize`, `smallCacheSize`, `normalCacheSize`, `numThreadLocalCaches` and `chunkSize` from `PooledByteBufAllocatorMetric`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
[Celeborn Grafana Dashboard](https://stenicholas.grafana.net/public-dashboards/a520ca36a33843a38bbde28387023f97)
Closes #2802 from SteNicholas/CELEBORN-1640.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Add a REST API and CLI command for container info. Users can configure this API based on whichever cluster manager they are using.
### Why are the changes needed?
see above
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
added UTs
Closes #2758 from akpatnam25/CELEBORN-1599.
Authored-by: Aravind Patnam <akpatnam25@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Add a random UUID as a suffix to the application ID to make it unique.
### Why are the changes needed?
Currently, we cannot guarantee that the application ID is really unique. This may lead to data issues.
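A minimal sketch of the idea using Python's standard `uuid` module. The helper name `unique_app_id`, the `-` separator, and the sample application ID are assumptions; the source does not specify the exact format.

```python
import uuid


def unique_app_id(app_id):
    """Append a random UUID so that two runs reusing `app_id` get distinct identifiers."""
    return f"{app_id}-{uuid.uuid4()}"


base = "application_1700000000000_0001"  # made-up application ID
first = unique_app_id(base)
second = unique_app_id(base)
```

Even when the engine hands out the same application ID twice, the two suffixed identifiers differ, so their shuffle data cannot be confused.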
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
test locally
Closes #2810 from chenkovsky/feature/uuid_appid.
Authored-by: Chongchen Chen <chenkovsky@qq.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Support passing a tags expression in the RequestSlots request. Clients can pass the tags using CelebornConf. Default tag configs for system/tenant/user will be supported in follow-up PRs.
### Why are the changes needed?
https://cwiki.apache.org/confluence/display/CELEBORN/CIP-11+Supporting+Tags+in+Celeborn
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing UTs passed, will add more UTs while integrating TagsManager with ConfigService.
Closes #2770 from s0nskar/request-slots.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Support controlling traffic based on user/worker traffic speed.
### Why are the changes needed?
Currently, Celeborn only supports quota management based on disk file bytes/count, and this quota management cannot cope with sudden increases in traffic, which can damage the cluster.
### Does this PR introduce _any_ user-facing change?
Yes.
### How was this patch tested?
UTs.
Closes #2797 from leixm/issue_1487_1.
Authored-by: Xianming Lei <31424839+leixm@users.noreply.github.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Add a config to bypass memory check when sorting shuffle files.
### Why are the changes needed?
This config should be enabled if a Celeborn worker has a large amount of memory and serves both the Spark and Flink engines.
### Does this PR introduce _any_ user-facing change?
NO.
### How was this patch tested?
Cluster test.
Closes #2798 from FMX/b1637.
Authored-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Introduce Blaze support document.
### Why are the changes needed?
[Blaze](https://github.com/kwai/blaze) supports Celeborn as a remote shuffle service. It's recommended to add a Blaze support document that introduces Blaze usage.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2787 from SteNicholas/CELEBORN-1635.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Add navigation for `REST API` document.
### Why are the changes needed?
The `REST API` document does not have any navigation, so it is better to add navigation to guide readers through the REST API.
<img width="1438" alt="image" src="https://github.com/user-attachments/assets/b5b3a14a-38d4-4769-bffb-3acd571d5dbb">
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2775 from SteNicholas/navigate-rest-api.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Add a worker metric that publishes the unreleased shuffle count when a worker is decommissioned.
<img width="885" alt="Screenshot 2024-09-16 at 11 12 33 AM" src="https://github.com/user-attachments/assets/c81f36c1-cbed-44fe-814b-88f3ff29875d">
### Why are the changes needed?
Currently Celeborn doesn't publish the count of unreleased shuffle keys that get lost when a worker is decommissioned. This can be useful for monitoring and for configuring `forceExitTimeout`.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
NA
Closes #2711 from s0nskar/unrelease_shuffle_metric.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
To speed up resource release, this PR unregisters shuffles in batches.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
UT & Local cluster testing
Closes #2701 from zaynt4606/batchUnregister.
Lead-authored-by: szt <zaynt4606@163.com>
Co-authored-by: Zaynt <shuaizhentao.szt@alibaba-inc.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Update `advertised endpoint` of master service log in startup document.
### Why are the changes needed?
#2713 changed the startup log of the master service in `NettyRpcEnv`, so the log in the startup document should be updated accordingly.
```
logInfo(s"Starting RPC Server [${config.name}] on ${config.bindAddress}:$actualPort " +
s"with advertised endpoint ${config.advertiseAddress}:$actualPort")
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2773 from SteNicholas/CELEBORN-1513.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Add `emptyFilePrimaryIds` and `emptyFileReplicaIds` of worker service log in startup document.
### Why are the changes needed?
#2300 added `emptyFilePrimaryIds` and `emptyFileReplicaIds` to the startup log of the worker service in `Controller`, so they should also be added to the log in the startup document.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2774 from SteNicholas/CELEBORN-914.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Update name of master service from `MasterSys` to `Master` in startup document to follow up https://github.com/apache/celeborn/pull/2003/files#r1365454256.
### Why are the changes needed?
#2003 already changed the names of the master and worker services, so the names in the startup logs of the document should be updated as well.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2772 from SteNicholas/CELEBORN-1058.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Update document link of `Get Started With Velox` and `Get Started With ClickHouse` in `glutensupport.md`. Meanwhile, replace `gluten-celeborn-package-xx-SNAPSHOT.jar` with `(The bundled Gluten Jar. Make sure -Pceleborn is specified when it is built.)`, which refers to https://github.com/apache/incubator-gluten/pull/6692.
### Why are the changes needed?
The document links of `Get Started With Velox` and `Get Started With ClickHouse` are no longer accessible because their URLs have changed.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2762 from SteNicholas/CELEBORN-1486.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Support SSL for the Celeborn RESTful service.
### Why are the changes needed?
For HTTP SSL connection requirements.
### Does this PR introduce _any_ user-facing change?
No, SSL is disabled by default.
### How was this patch tested?
Integration testing.
```
celeborn.master.http.ssl.enabled=true
celeborn.master.http.ssl.keystore.path=/hadoop/keystore.jks
celeborn.master.http.ssl.keystore.password=xxxxxxx
```
<img width="1143" alt="image" src="https://github.com/user-attachments/assets/2334561d-1de3-4b38-bc80-5d5d86d3b8ff">
<img width="695" alt="image" src="https://github.com/user-attachments/assets/e3877468-cc3b-4a4a-bf75-2994f557a104">
Closes #2756 from turboFei/HADP_1609_ssl2.
Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Implement the worker write process for Flink Hybrid Shuffle.
### Why are the changes needed?
We already support the tiered producer writing data from Flink to the worker. In this PR, we enable the worker to write this kind of data to storage.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
no need.
Closes #2741 from reswqa/cip6-6-pr.
Authored-by: Weijie Guo <reswqa@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Fix `RaftPeerId` generated by command of `raftMetaConf` to use real PeerId for `local` command in `celeborn_ratis_shell.md` to sync document [cli.md](https://github.com/apache/ratis/blob/ratis-3.0.1/ratis-docs/src/site/markdown/cli.md).
### Why are the changes needed?
Celeborn has already bumped Ratis version from 3.0.1 to 3.1.0. Ratis v3.1.0 has already fixed RaftPeerId generated by command of "raftMetaConf" to use real PeerId in `raft-meta.conf` and store back to generated `new-raft-meta.conf`.
Backport: https://github.com/apache/ratis/pull/1060
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes #2748 from SteNicholas/CELEBORN-1466.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Fix the unique key to reference the correct column names.
### Why are the changes needed?
Running the current DB scripts gives the error below, because the `user` column was renamed to `name` (https://github.com/apache/celeborn/pull/2340) but the unique key was not updated accordingly.
```
mysql> CREATE TABLE IF NOT EXISTS celeborn_cluster_tenant_config
-> (
-> id int NOT NULL AUTO_INCREMENT,
-> cluster_id int NOT NULL,
-> tenant_id varchar(255) NOT NULL,
-> level varchar(255) NOT NULL COMMENT 'config level, valid level is TENANT,USER',
-> name varchar(255) DEFAULT NULL COMMENT 'tenant sub user',
-> config_key varchar(255) NOT NULL,
-> config_value varchar(255) NOT NULL,
-> type varchar(255) DEFAULT NULL COMMENT 'conf categories, such as quota',
-> gmt_create timestamp NOT NULL,
-> gmt_modify timestamp NOT NULL,
-> PRIMARY KEY (id),
-> UNIQUE KEY `index_unique_tenant_config_key` (`cluster_id`, `tenant_id`, `user`, `config_key`)
-> );
ERROR 1072 (42000): Key column 'user' doesn't exist in table
```
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
Tested in a local DB:
```
mysql> CREATE TABLE IF NOT EXISTS celeborn_cluster_tenant_config
-> (
-> id int NOT NULL AUTO_INCREMENT,
-> cluster_id int NOT NULL,
-> tenant_id varchar(255) NOT NULL,
-> level varchar(255) NOT NULL COMMENT 'config level, valid level is TENANT,USER',
-> name varchar(255) DEFAULT NULL COMMENT 'tenant sub user',
-> config_key varchar(255) NOT NULL,
-> config_value varchar(255) NOT NULL,
-> type varchar(255) DEFAULT NULL COMMENT 'conf categories, such as quota',
-> gmt_create timestamp NOT NULL,
-> gmt_modify timestamp NOT NULL,
-> PRIMARY KEY (id),
-> UNIQUE KEY `index_unique_tenant_config_key` (`cluster_id`, `tenant_id`, `name`, `config_key`)
-> );
Query OK, 0 rows affected (0.01 sec)
```
Closes #2740 from s0nskar/fix-db-script.
Authored-by: Sanskar Modi <sanskarmodi97@gmail.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Enrich the docs for the wildcard address bind support added in this [PR](https://github.com/apache/celeborn/pull/2713).
### Why are the changes needed?
better docs
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
N/A - just docs change
Closes #2736 from akpatnam25/CELEBORN-1513-doc-followup.
Authored-by: Aravind Patnam <akpatnam25@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>