### What changes were proposed in this pull request?
Update infrastructure for SSL support.
Please see #2416 for the consolidated PR with all the changes for reference.
### Why are the changes needed?
At a high level, the changes are:
* `ManagedBuffer.convertToNettyForSsl`, to support SSL encryption.
* Add `EncryptedMessageWithHeader`, which is used to wrap the message and body, for use with SSL.
* `SslMessageEncoder` is an encoder for SSL
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
The overall PR #2416 (and this PR as well) passes all tests, and this PR includes relevant subset of tests.
Closes#2427 from mridulm/update-infra-for-ssl.
Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Remove unnecessary configuration `celeborn.client.flink.inputGate.minMemory` and `celeborn.client.flink.resultPartition.minMemory`.
### Why are the changes needed?
`celeborn.client.flink.inputGate.minMemory` and `celeborn.client.flink.resultPartition.minMemory` are configured as min memory reserved at present. Meanwhile, `celeborn.client.flink.inputGate.memory` should be at least `networkBufferSize * MIN_BUFFERS_PER_GATE` bytes, and `celeborn.client.flink.resultPartition.memory` should be at least `networkBufferSize * MIN_BUFFERS_PER_PARTITION` bytes. Therefore, `celeborn.client.flink.inputGate.minMemory` and `celeborn.client.flink.resultPartition.minMemory` are unnecessary configuration for `celeborn.client.flink.inputGate.memory` and `celeborn.client.flink.resultPartition.memory`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
`PluginSideConfSuiteJ#testCoalesce`
Closes#2433 from SteNicholas/CELEBORN-1362.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Support Netty level logging at the network layer for Celeborn. To configure Netty level logging a LogHandler must be added to the channel pipeline. `NettyLogger` is introduced as a new class which is able to construct a log handler depending on the log level:
- In case of `<Logger name="org.apache.celeborn.common.network.util.NettyLogger" level="DEBUG" additivity="false">`: a custom log handler is created which does not dump the message contents. This way the log is a bit more compact. Moreover when network level encryption is switched on this level might be sufficient.
- In case of `<Logger name="org.apache.celeborn.common.network.util.NettyLogger" level="TRACE" additivity="false">`: Netty's own log handler is used which dumps the message contents.
- Otherwise (when the logger is not `TRACE` or `DEBUG`) the pipeline does not contain a log handler (there is no runtime penalty for the default setting but a long running service must be restarted along with the new log level to have an effect).
Backport:
- [[SPARK-36719][CORE] Supporting Netty Logging at the network layer](https://github.com/apache/spark/pull/33962)
- [[SPARK-45377][CORE] Handle InputStream in NettyLogger](https://github.com/apache/spark/pull/43165)
### Why are the changes needed?
This level of logging proved to be sufficient during debugging some external shuffle related problem. Compared with the tcpdump this log lines can be more easily correlated with the Celeborn internal calls. Moreover the log layout can be configured to contain the thread names that way for a timeout a busy thread could be identified.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Local manually test.
Closes#2423 from SteNicholas/CELEBORN-1359.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Before, there is no http request spec likes query param, http method and response mediaType.
And for each api, a HttpEndpoint class is needed.
In this PR, we refine the code for http service and provide swagger ui.
Note that: This pr does not change the orignal api request and response behavior, including metrics APIs.
TODO:
1. define DTO
2. http request authentication
<img width="1900" alt="image" src="https://github.com/apache/incubator-celeborn/assets/6757692/7f8c2363-170d-4bdf-b2c9-74260e31d3e5">
<img width="1138" alt="image" src="https://github.com/apache/incubator-celeborn/assets/6757692/3ae6ec8e-00a8-475b-bb37-0329536185f6">
### Why are the changes needed?
To close CELEBORN-1317
### Does this PR introduce _any_ user-facing change?
The api is align with before.
### How was this patch tested?
UT.
Closes#2371 from turboFei/jetty.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
Add SSL related configs and support for `ReloadingX509TrustManager`, required for enabling SSL support.
Please see #2416 for the consolidated PR with all the changes for reference.
Introduces SSL related configs for enabling and configuring use of TLS.
Yes, introduces configs to control behavior of SSL
The overall PR #2411 (and this PR as well) passes all tests, this is specifically pulling out the `ReloadingX509TrustManager` and config related changes
Closes#2419 from mridulm/config-for-ssl.
Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Build changes and test resources for enabling SSL support.
Please see #2416 for the consolidate PR with all the changes for reference.
Note: I closed the older PR #2413 and reopened this one give the repo changes.
### Why are the changes needed?
Build dependency updates and addition of test resources for use with tests.
The specific tests leveraging these will be added in subsequent jiras linked off of CELEBORN-1343
Splitting it up into multiple PR's to reduce the review load.
### Does this PR introduce _any_ user-facing change?
io.netty:netty-tcnative-boringssl-static is an additional dependency.
org.bouncycastle:* are test dependencies which should have no user facing changes.
### How was this patch tested?
The overall PR #2411 passes all tests, this is specifically pulling out the dependency changes and resources.
Closes#2417 from mridulm/build-and-test-for-tls.
Lead-authored-by: Mridul Muralidharan <mridul@gmail.com>
Co-authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Batch OpenStream RPCs by Worker to avoid too many RPCs.
### Why are the changes needed?
ditto
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Passes GA and Manual tests.
Closes#2362 from waitinfuture/1144.
Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
`AbstractRemoteShuffleResultPartitionFactory` removes the check of shuffle compression codec.
### Why are the changes needed?
`AbstractRemoteShuffleResultPartitionFactory` checks whether shuffle compression codec is LZ4 for Flink 1.14 and 1.15 version at present. Meanwhile, since Flink 1.17 version, ZSTD has already been supported. Therefore `AbstractRemoteShuffleResultPartitionFactory` should remove the check of shuffle compression codec for Flink 1.17 version and above, which is checked via the constructor of `BufferCompressor`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
- `RemoteShuffleResultPartitionFactorySuiteJ`
Closes#2414 from SteNicholas/CELEBORN-1357.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Currently, the Celeborn master calculates the estimatedPartitionSize based on the fileInfo committed by the application. This estimate is then used to allocate slots across all workers. However, this partition size may be too large or too small for Celeborn. For example, if an application commits a single file of 1TB to only one worker, using that partition size could result in all other workers having no available slots or only very small slots. To improve this, it would be better to implement a cap on the master's estimated partition size to prevent such imbalances.
### Why are the changes needed?
As title
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
UT
Closes#2412 from RexXiong/CELEBORN-1345.
Lead-authored-by: lvshuang.xjs <lvshuang.xjs@taobao.com>
Co-authored-by: Shuang <lvshuang.xjs@alibaba-inc.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Introduce JVM profiling `JVMProfier` in Celeborn Worker using async-profiler to capture CPU and memory profiles.
### Why are the changes needed?
[async-profiler](https://github.com/async-profiler) is a sampling profiler for any JDK based on the HotSpot JVM that does not suffer from Safepoint bias problem. It has low overhead and doesn’t rely on JVMTI. It avoids the safepoint bias problem by using the `AsyncGetCallTrace` API provided by HotSpot JVM to profile the Java code paths, and Linux’s perf_events to profile the native code paths. It features HotSpot-specific APIs to collect stack traces and to track memory allocations.
The feature introduces a profier plugin that does not add any overhead unless enabled and can be configured to accept profiler arguments as a configuration parameter. It should support to turn profiling on/off, includes the jar/binaries needed for profiling.
Backport [[SPARK-46094] Support Executor JVM Profiling](https://github.com/apache/spark/pull/44021).
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Worker cluster test.
Closes#2409 from SteNicholas/CELEBORN-1299.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
add a new parameter to cap the max memory can be used for sort writer buffer
### Why are the changes needed?
with a huge number of partitions, the threshold based on buffer size * number of partitions without this cap can be too large, e.g. 64K * 100000 = 6G
### Does this PR introduce _any_ user-facing change?
a new parameter
### How was this patch tested?
ut
Closes#2388 from CodingCat/adaptive_followup.
Lead-authored-by: CodingCat <zhunansjtu@gmail.com>
Co-authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Co-authored-by: Keyong Zhou <zhouky@apache.org>
Co-authored-by: Keyong Zhou <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Importing details from https://github.com/apache/spark/pull/43162:
--
This PR avoids a race condition where a connection which is in the process of being closed could be returned by the TransportClientFactory only to be immediately closed and cause errors upon use.
This race condition is rare and not easily triggered, but with the upcoming changes to introduce SSL connection support, connection closing can take just a slight bit longer and it's much easier to trigger this issue.
Looking at the history of the code I believe this was an oversight in https://github.com/apache/spark/pull/9853.
--
### Why are the changes needed?
We are working towards adding TLS support, which is essentially based on Spark 4.0 TLS support, and this is one of the fixes from there.
(I am yet to file the overall TLS support jira yet, but this is enabling work).
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Unit tests
Closes#2400 from mridulm/add-SPARK-45375.
Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes#2406 from cxzl25/TransportClient_typo.
Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
### Why are the changes needed?
`CELEBORN-1320` uses `ReviveManager` to batch processing SOFT_SPLIT event RPC, so `partitionSplitPool` is no longer used, and the configuration item `celeborn.client.push.splitPartition.threads` is meaningless.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes#2396 from cxzl25/CELEBORN-1336.
Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Fix typo of `celeborn.network.bind.preferIpAddress` doc from `ture` to `true`.
### Why are the changes needed?
`celeborn.network.bind.preferIpAddress` doc has typo for `ture`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes#2392 from SteNicholas/prefer-ip-address.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
To fix a bug that might cause persisted committed file info lost.
### Why are the changes needed?
A worker starts will clean its persisted committed file info and won't put back if this worker restart again, the committed file infos will lost.
### Does this PR introduce _any_ user-facing change?
NO.
### How was this patch tested?
GA.
Closes#2390 from FMX/b863-1.
Authored-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
Updates Celeborn to account for [JDK-8303083](https://bugs.openjdk.org/browse/JDK-8303083), which affects JDK21. (similar to the Apache Spark change here:
https://github.com/apache/spark/pull/39909)
### Why are the changes needed?
Without this change, Spark users of Celeborn will encounter a runtime error similar to the following:
`Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.IllegalStateException: java.lang.NoSuchMethodException: java.nio.DirectByteBuffer.<init>(long, int) [in thread "Executor task launch worker for task 0.0 in stage 0.0 (TID 0)"]
at org.apache.celeborn.common.unsafe.Platform.<clinit>(Platform.java:135)
... 16 more`
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Tested using standalone Spark 3.5.1 (Hadoop 3.2) and the Celeborn `main` branch with JDK21 build changes in https://github.com/apache/incubator-celeborn/pull/2385. Reproduced the runtime error above and confirmed the patch resolves it.
Closes#2387 from curtishoward/CELEBORN-1327.
Authored-by: Curtis Howard <curtis@curtiss-mbp.lan>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
while SortBasedWriter has less memory footprint than HashBasedWriter, it suffers from performance issue when we have many partitions and the write buffer is filled with small chunks of data quickly
for example, if sort buffer size is 32K, you have 4 partitions and 128K data in total, the data distribution is like partition A, B, C, D, each time it comes with 8K per partition.... in this case, you need to compress and send small 8K chunk 4 times per partition , the cost would become very high. If you use hashbasedwriter, it doesn't have this problem since the push only happens when the per-partition buffer is full. Of course , larger sort buffer size can mitigate the issue, but tuning sort buffer size per job is a tedious work
this PR introduces a new feature that we measure total size of pushed bytes and pushed count as well as the "should-pushed" bytes and counts (should-push means that , the data we pushed is larger than CLIENT_PUSH_BUFFER_MAX_SIZE (in another word, we will trigger a push even with hashbasedwriter in this case))
when actualPushedBytes/actualPushedCounts > (1 + Threshold) * (ShouldPushBytes/ShouldPushCounts), we will enlarge the sort buffer size by 1X to try to buffer more data before pushing (the max size of sortBuffer would be capped at # of partitions * CLIENT_PUSH_BUFFER_MAX_SIZE)
### Why are the changes needed?
to reduce perf cost in sortbased writer
### Does this PR introduce _any_ user-facing change?
no, but have 2 extra configurations
### How was this patch tested?
in prod of our company and also unit test
Closes#2358 from CodingCat/adaptive_memory_threshold.
Authored-by: CodingCat <zhunansjtu@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Enable custom network location aware replication, based on a custom impl of `DNSToSwitchMapping`.
### Why are the changes needed?
Resolution of network location of multiple workers at master can be expensive at times. This way, each worker resolves its own network location and sends to master via the RegisterWorker transport message. If worker cannot resolve, fallback to attempting to resolve at master (during update meta or reload of snapshot). Proposal: [Celeborn Custom Network Location Aware Replication](https://docs.google.com/document/d/11M_MKKnIXCTExJHMX-OMTq7SBpkl8fJMlpy8hLgmev0/edit#heading=h.s3vnydz589z5)
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Updated the unit tests.
Closes#2367 from akpatnam25/CELEBORN-1313.
Authored-by: Aravind Patnam <apatnam@linkedin.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
PrometheusSink is not used.
### Why are the changes needed?
Close CELEBORN-1324
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Not needed.
Closes#2381 from turboFei/remove_unused.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
This enables a Celeborn Worker to retrieve the application meta from the Master if it hasn't received the secret from the Master before the application attempts to connect to it. Additionally, the Celeborn Worker's SecretRegistry has been converted into an LRU cache to prevent unbounded growth of the registry.
### Why are the changes needed?
This is last change needed for Auth support in Celeborn (https://issues.apache.org/jira/browse/CELEBORN-1011)
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added UTs and part of a bigger change which will be tested end-to-end.
Closes#2363 from otterc/CELEBORN-1179.
Authored-by: Chandni Singh <singh.chandni@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
When we shut down the master or worker, we can output the signal as a record.
### Why are the changes needed?
Conveniently track the status of master and workers.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
local test
```bash
./sbin/stop-all.sh
```
```
12:20:59.932 [SIGTERM handler] ERROR org.apache.celeborn.service.deploy.master.Master - RECEIVED SIGNAL TERM
```
```
12:20:59.563 [SIGTERM handler] ERROR org.apache.celeborn.service.deploy.worker.Worker - RECEIVED SIGNAL TERM
```
Closes#2334 from cxzl25/CELEBORN-1293.
Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
This enables client to push and fetch shuffle data securely to Celeborn Workers.
### Why are the changes needed?
This change is required for adding authentication. ([CELEBORN-1011](https://issues.apache.org/jira/browse/CELEBORN-1011)).
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
It is part of bigger change which will be tested end to end.
Closes#2360 from otterc/CELEBORN-1261.
Authored-by: Chandni Singh <singh.chandni@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
This enables Celeborn Master to persist application meta in Ratis and also push it to Celeborn Workers when it receives the requests for slots from the LifecycleManager.
### Why are the changes needed?
This change is required for adding authentication. ([CELEBORN-1011](https://issues.apache.org/jira/browse/CELEBORN-1011)).
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added some UTs.
Closes#2346 from otterc/CELEBORN-1234.
Authored-by: Chandni Singh <singh.chandni@gmail.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Catch and throw FetchFailedException in CelebornInputStream#fillBuffer to enable spark's stage rerun
when fillBuffer encounters fetch chunk exception
### Why are the changes needed?
ditto
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
GA
Closes#2349 from waitinfuture/1301.
Lead-authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Co-authored-by: Keyong Zhou <waitinfuture@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
This adds an internal port and auth support to Celeborn Wokers.
1. Internal port is used by a worker to receive messages from Celeborn Master.
2. Authentication support for secure communication with clients. This change doesn't add the support in clients to communicate to the Workers securely. That will be in a future change.
This change targets just adding the port and auth support to Worker. The following items from the proposal are still pending:
- Persisting the app secrets in Ratis.
- Forwarding secrets to Workers and having ability for the workers to pull registration info from the Master.
- Secured communication between workers and clients.
### Why are the changes needed?
It is needed for adding authentication support to Celeborn ([CELEBORN-1011](https://issues.apache.org/jira/browse/CELEBORN-1011))
### Does this PR introduce _any_ user-facing change?
Yes
### How was this patch tested?
Part of a bigger change. For this change, only modified existing UTs.
Closes#2292 from otterc/CELEBORN-1256.
Authored-by: Chandni Singh <singh.chandni@gmail.com>
Signed-off-by: waitinfuture <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Deprecate `celeborn.quota.configuration.path` config. User `celeborn.dynamicConfig.store.fs.path` instead.
### Why are the changes needed?
`DefaultQuotaManager` is removed in #2298, which causes that `celeborn.quota.configuration.path` is useless. `celeborn.quota.configuration.path` could be deprecated that uses `celeborn.dynamicConfig.store.fs.path` to config quota.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes#2339 from SteNicholas/CELEBORN-1239.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Introduce `celeborn.dynamicConfig.store.fs.path` config to configure the path of dynamic config file for fs store backend.
### Why are the changes needed?
`FsConfigServiceImpl` uses `celeborn.quota.configuration.path` to configure the path of dynamic config file for fs store backend at present. The path of dynamic config file should be introduced with `celeborn.dynamicConfig.store.fs.path` instead of quota configuration path.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes#2337 from SteNicholas/CELEBORN-1296.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
When an `IdleStateEvent ` is received, the configuration items of the corresponding module are output.
### Why are the changes needed?
Now that the `IdleStateEvent` event is received, only the timeout time is output, but the corresponding configuration items are not output.
```
24/02/26 04:12:08,062 [data-client-5-8] ERROR TransportChannelHandler: Connection to /XXX:YYY has been quiet for 240000 ms while there are outstanding requests.
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
GA
Closes#2329 from cxzl25/CELEBORN-1288.
Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Optimize the handling of exceptions during the push of replica data, now only throwing PUSH_DATA_CONNECTION_EXCEPTION_REPLICA in specific scenarios.
### Why are the changes needed?
When handling exceptions related to pushing replica data in the worker, unmatched exceptions, such as 'file already closed,' are uniformly transformed into REPLICATE_DATA_CONNECTION_EXCEPTION_COUNT and returned to the client. The client then excludes the peer node based on this count, which may not be appropriate in certain scenarios. For instance, in the case of an exception like 'file already closed,' it typically occurs during multiple splits and commitFile operations. Excluding a large number of nodes under such circumstances is clearly not in line with expectations.

### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
through exist uts
Closes#2323 from lyy-pineapple/CELEBORN-1282.
Authored-by: liangyongyuan <liangyongyuan@xiaomi.com>
Signed-off-by: waitinfuture <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Add `celeborn.quota.enabled` at Master and Client side to enable checking quota
### Why are the changes needed?
`celeborn.quota.enabled` should be added in Master and Client side to enable quota check for Celeborn Master and Client.
### Does this PR introduce _any_ user-facing change?
Add categories of `celeborn.quota,enabled` with `master` and `client`.
### How was this patch tested?
No.
Closes#2318 from AngersZhuuuu/CELEBORN-1277.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
### What changes were proposed in this pull request?
https://github.com/apache/incubator-celeborn/pull/2292#discussion_r1497160753
Based on the above discussion, removing the additional secured port. The existing port will be used for secured communication when auth is enabled.
### Why are the changes needed?
These changes are for enabling authentication
### Does this PR introduce _any_ user-facing change?
Yes.
### How was this patch tested?
This removed additional secured port.
Closes#2327 from otterc/CELEBORN-1257.
Authored-by: Chandni Singh <singh.chandni@gmail.com>
Signed-off-by: waitinfuture <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
`TransportClientFactory` avoid contention and get or create clientPools quickly.
### Why are the changes needed?
Avoid contention for getting or creating clientPools, and clean up the code.
Backport: [[SPARK-38555][NETWORK][SHUFFLE] Avoid contention and get or create clientPools quickly in the TransportClientFactory](https://github.com/apache/spark/pull/35860)
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes#2322 from SteNicholas/CELEBORN-1283.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Change the default value of `celeborn.worker.graceful.shutdown.recoverDbBackend` from `LEVELDB` to `ROCKSDB`.
### Why are the changes needed?
Because the LevelDB support will be removed, the default value of `celeborn.worker.graceful.shutdown.recoverDbBackend` could be changed to ROCKSDB instead of LEVELDB for preparation of LevelDB deprecation.
Backport:
[[SPARK-45351][CORE] Change spark.shuffle.service.db.backend default value to ROCKSDB](https://github.com/apache/spark/pull/43142)
[[SPARK-45413][CORE] Add warning for prepare drop LevelDB support](https://github.com/apache/spark/pull/43217)
### Does this PR introduce _any_ user-facing change?
The default value of `celeborn.worker.graceful.shutdown.recoverDbBackend` is changed from `LEVELDB` to `ROCKSDB`.
### How was this patch tested?
No.
Closes#2320 from SteNicholas/CELEBORN-1280.
Lead-authored-by: SteNicholas <programgeek@163.com>
Co-authored-by: Nicholas Jiang <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Refer [SPARK-37894](https://issues.apache.org/jira/browse/SPARK-37984)/ https://github.com/apache/spark/pull/35276
Avoid calculating all outstanding requests to improve performance
### Why are the changes needed?
Ditto.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Not needed.
Closes#2319 from turboFei/SPARK-37984_backport.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Refer: [SPARK-28160](https://issues.apache.org/jira/browse/SPARK-28160) / https://github.com/apache/spark/pull/24964
ByteBuffer.allocate may throw OutOfMemoryError when the response is large but no enough memory is available. However, when this happens, TransportClient.sendRpcSync will just hang forever if the timeout set to unlimited.
### Why are the changes needed?
To catch the exception of `ByteBuffer.allocate` in corner case.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Quote the local test in https://github.com/apache/spark/pull/24964
```
I tested in my IDE by setting the value of size to -1 to verify the result. Without this patch, it won't be finished until timeout (May hang forever if timeout set to MAX_INT), or the expected IllegalArgumentException will be caught.
Override
public void onSuccess(ByteBuffer response) {
try {
int size = response.remaining();
ByteBuffer copy = ByteBuffer.allocate(size); // set size to -1 in runtime when debug
copy.put(response);
// flip "copy" to make it readable
copy.flip();
result.set(copy);
} catch (Throwable t) {
result.setException(t);
}
}
```
Closes#2316 from turboFei/fix_transport_client_onsucess.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: chenfu <chenfu@xiaohongshu.com>
### What changes were proposed in this pull request?
Move checkQuotaSpaceAvailable from Quota to QuotaManager
### Why are the changes needed?
Put method in correct place
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added UT
Closes#2317 from AngersZhuuuu/CELEBORN-1276.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
### What changes were proposed in this pull request?
This pr does 2 things:
1. Remove unnecessary conf QUOTA_MANAGER since we implement it with ConfigService and ConfigService already have a conf to indicate the implement method.
2. Move the quota manager to Master side since only master use this
3. Support quota manager use FsConfigService and support default system level
### Why are the changes needed?
1. Many times, for users who do not have a quota configured, we hope to have a default quota that applies to them.
2. Quota manager should support refresh
3. QuotaManager should support integrate with ConfigService
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
added ut
Closes#2298 from AngersZhuuuu/CELEBORN-1239.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
### What changes were proposed in this pull request?
Fix some typos.
### Why are the changes needed?
Ditto.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Not needed.
Closes#2314 from turboFei/fix_typo.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Rename `celeborn.worker.sortPartition.reservedMemory.enabled` to `celeborn.worker.sortPartition.prefetch.enabled`. Address [r1469066327](https://github.com/apache/incubator-celeborn/pull/2264/files#r1469066327) of pan3793.
### Why are the changes needed?
`celeborn.worker.sortPartition.reservedMemory.enabled` is misleading, which should represent that prefetch the original partition files during the first sequential reading path to leverage the Linux PageCache mechanism to speed up the subsequent random reading of them. The config name could use `celeborn.worker.sortPartition.prefetch.enabled` which is is more accurate.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes#2312 from SteNicholas/CELEBORN-1254.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Just fix minor typo:
```
ls **/scala/**/*.java
common/src/main/scala/org/apache/celeborn/common/meta/WorkerEventInfo.java common/src/main/scala/org/apache/celeborn/common/meta/WorkerStatus.java
```
After this:
```
ls **/scala/**/*.java
zsh: no matches found: **/scala/**/*.java
ls **/java/**/*.scala
zsh: no matches found: **/java/**/*.scala
```
### Why are the changes needed?
Fix code format.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing UT.
Closes#2310 from turboFei/scala_java.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
Since we support ConfigService, many configuration can be dynamic, add `isDynamic` property for CelebornConf in this pr.
### Why are the changes needed?
Make configuration doc more cleear
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existed UT
Closes#2308 from AngersZhuuuu/CELEBORN-1051.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
### What changes were proposed in this pull request?
Improve the implementation of `ConfigService` including:
- Removes `celeborn.dynamicConfig.enabled`.
- Changes `celeborn.dynamicConfig.store.backend` to optional.
- Renames `refreshAllCache` to `refreshCache` in `ConfigService`.
- Checks whether the dynamic config file exists and is file in `FsConfigServiceImpl`.
### Why are the changes needed?
Whether to enable dynamic config could check via whether `celeborn.dynamicConfig.store.backend` is provided, instead of `celeborn.dynamicConfig.enabled`. The `refreshAllCache` interface could rename to `refreshCache` and throw Exception simply. Meanwhile, `FsConfigServiceImpl` should check whether the dynamic config file exists and is file.
### Does this PR introduce _any_ user-facing change?
- Renames `refreshAllCache` to `refreshCache` in `ConfigService`.
### How was this patch tested?
- `ConfigServiceSuiteJ`
Closes#2304 from SteNicholas/CELEBORN-1052.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: SteNicholas <programgeek@163.com>
### What changes were proposed in this pull request?
To close CELEBORN-1016, fix the issue when parse IPv6 host address.
### Why are the changes needed?
Fix CELEBORN-1016
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
UT.
Closes#2293 from turboFei/CELEBORN-1016_ipv6.
Authored-by: Fei Wang <fwang12@ebay.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
As title.
### Why are the changes needed?
For some scenarios, if Celeborn cannot be used, users want to report an error directly instead of fallback.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
CI
Closes#2291 from kerwin-zk/add-config.
Authored-by: xiyu.zk <xiyu.zk@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
This adds a secured port to Celeborn Master which is used for secure communication with LifecycleManager.
This is part of adding authentication support in Celeborn (see CELEBORN-1011).
This change targets just adding the secured port to Master. The following items from the proposal are still pending:
1. Persisting the app secrets in Ratis.
2. Forwarding secrets to Workers and having ability for the workers to pull registration info from the Master.
3. Secured and internal port in Workers.
4. Secured communication between workers and clients.
In addition, since we are supporting both secured and unsecured communication for backward compatibility and seamless rolling upgrades, there is an additional change needed. An app which registers with the Master can try to talk to the workers on unsecured ports which is a security breach. So, the workers need to know whether an app registered with Master or not and for that Master has to propagate list of un-secured apps to Celeborn workers as well. We can discuss this more with https://issues.apache.org/jira/browse/CELEBORN-1261
### Why are the changes needed?
It is needed for adding authentication support to Celeborn (CELEBORN-1011)
### Does this PR introduce _any_ user-facing change?
Yes
### How was this patch tested?
Added a simple UT.
Closes#2281 from otterc/CELEBORN-1257.
Authored-by: Chandni Singh <singh.chandni@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Support database based store backend implementation for dynamic configuration management
### Why are the changes needed?
Currently celeborn provides `FsConfigServiceImpl` implementation for dynamic config service which is based on file system, We cloud Support database based store backend implementation.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- `ConfigServiceSuiteJ#testDbConfig`
Closes#2273 from RexXiong/CELEBORN-1054.
Authored-by: Shuang <lvshuang.xjs@alibaba-inc.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request?
`WorkerSource` supports application dimension `ActiveConnectionCount` metric to record the number of registered connections for each application.
### Why are the changes needed?
`ActiveConnectionCount` metric records the number of registered connections at present. It's recommended to support dimension ActiveConnectionCount metric to record the number of registered connections for each application in Worker. Application dimension `ActiveConnectionCount` metric could provide users with the actual number of registered connections for each application.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Internal tests.
Closes#2167 from SteNicholas/CELEBORN-1182.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request?
Introduce `ThreadUtils#shutdown(executor)` method to improve the default gracePeriod of `ThreadUtils#shutdown`.
### Why are the changes needed?
The default value of `gracePeriod` for `ThreadUtils#shutdown` is 30 seconds at present. Meanwhile, the `gracePeriod` of most invoker for `ThreadUtils#shutdown` is 800 milliseconds. Therefore, the default `gracePeriod` of `ThreadUtils#shutdown` could be improved as 800 milliseconds.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
Closes#2276 from SteNicholas/CELEBORN-1259.
Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>