### What changes were proposed in this pull request?
Cleans up the pooled send buffers and push tasks if the SendBufferPool has been idle for more than
`celeborn.client.push.sendbufferpool.expireTimeout`.
### Why are the changes needed?
Before this PR, the SendBufferPool cached the send buffers and push tasks forever. If they are large
and are never reused, they waste memory and add GC pressure.
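The idle-expiry idea can be sketched roughly as follows. This is only an illustration, not Celeborn's actual `SendBufferPool`: the class `IdleExpiringPool`, its method names, and the explicit `nowMs` clock argument are assumptions made so the example is self-contained and deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a buffer pool; not the real SendBufferPool API.
class IdleExpiringPool {
  private final Deque<byte[]> buffers = new ArrayDeque<>();
  private final int bufferSize;
  private final long expireTimeoutMs;
  private long lastAccessMs;

  IdleExpiringPool(int bufferSize, long expireTimeoutMs, long nowMs) {
    this.bufferSize = bufferSize;
    this.expireTimeoutMs = expireTimeoutMs;
    this.lastAccessMs = nowMs;
  }

  // Reuse a pooled buffer if one exists, otherwise allocate a new one.
  synchronized byte[] acquire(long nowMs) {
    lastAccessMs = nowMs;
    byte[] pooled = buffers.pollFirst();
    return pooled != null ? pooled : new byte[bufferSize];
  }

  // Return a buffer to the pool for later reuse.
  synchronized void release(byte[] buffer, long nowMs) {
    lastAccessMs = nowMs;
    buffers.addFirst(buffer);
  }

  // Meant to be called periodically. Drops every pooled buffer once the pool
  // has been idle for longer than the timeout, so the memory can be collected.
  synchronized boolean cleanupIfIdle(long nowMs) {
    if (nowMs - lastAccessMs > expireTimeoutMs && !buffers.isEmpty()) {
      buffers.clear();
      return true;
    }
    return false;
  }

  synchronized int pooledCount() {
    return buffers.size();
  }
}
```

In a real client the clock would come from `System.currentTimeMillis()` and the cleanup would likely run on a scheduled thread; both are left out here to keep the sketch small.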
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passes GA and manual tests.
Closes #1735 from waitinfuture/812-1.
Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
After discussion, we confirmed that `shuffleManager.unregisterShuffle()` is triggered by Spark on both the driver and the executors. In this PR:
1. Add a shuffle client on both the driver side and the executor side in ShuffleManager.
2. ShuffleClient calls cleanupShuffle() when `unregisterShuffle` is triggered.
This replaces https://github.com/apache/incubator-celeborn/pull/1719
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1726 from AngersZhuuuu/CELEBORN-804.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Fix some typos and grammar
### Why are the changes needed?
Ditto
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually tested.
Closes #1733 from onebox-li/fix-typo.
Authored-by: onebox-li <lyh-36@163.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
As title
<img width="1051" alt="Screenshot 2023-07-19 1.01.25 PM" src="https://github.com/apache/incubator-celeborn/assets/46485123/26d506b2-bab9-43f5-9bbe-58d22a761bab">
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1732 from AngersZhuuuu/CELEBORN-809.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
…up client
### What changes were proposed in this pull request?
Add a heartbeat from the client to the lifecycle manager. In this PR, the heartbeat request contains the client's local
shuffle ids; the lifecycle manager checks them against its local set and returns the ids it doesn't know. Upon receiving
the response, the client calls ```unregisterShuffle``` for cleanup.
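The exchange boils down to a set difference, sketched below. The class `HeartbeatSketch` and both method names are hypothetical and only illustrate the idea; the real RPC messages and the real `unregisterShuffle` call are not modeled.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the heartbeat exchange; not Celeborn's actual classes.
class HeartbeatSketch {
  // Lifecycle manager side: of the ids reported by the client,
  // return those the manager does not know about.
  static Set<Integer> unknownShuffleIds(Set<Integer> clientIds, Set<Integer> managerIds) {
    Set<Integer> unknown = new HashSet<>(clientIds);
    unknown.removeAll(managerIds);
    return unknown;
  }

  // Client side: clean up local state for every id reported as unknown.
  static void handleResponse(Set<Integer> clientIds, Set<Integer> unknownIds) {
    for (Integer id : unknownIds) {
      clientIds.remove(id); // stands in for unregisterShuffle(id)
    }
  }
}
```

For example, if the client holds shuffle ids {1, 2, 3, 4} and the manager only knows {2, 4, 5}, the response carries {1, 3} and the client is left with {2, 4}.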
### Why are the changes needed?
Before this PR, the client-side ```unregisterShuffle``` was never called. When running TPCDS 3T with Spark Thrift Server
without DRA, I found that the Executor's heap contained 1.6 million PartitionLocation objects (and StorageInfo):

After this PR, the number of PartitionLocation objects decreases to 275 thousand.

This heartbeat can be extended in the future for other purposes, e.g. reporting the client's metrics.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passes GA and manual test.
Closes #1719 from waitinfuture/798.
Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Warn when local shuffle reader is enabled.
```
Detected spark.sql.adaptive.localShuffleReader.enabled (default is true) is enabled,
it's highly recommended to disable it when use Celeborn as Remote Shuffle Service to
avoid performance degradation.
```
### Why are the changes needed?
When the local shuffle reader is enabled, a reduce task may read shuffle data by map id, which does not match Celeborn's shuffle data clustering model and causes extremely poor shuffle read performance.
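A check of this kind might look like the sketch below. It reads the setting from a plain `Map` rather than from a real `SparkConf`, and the class `LocalShuffleReaderCheck` and method `warningFor` are hypothetical names used only for illustration.

```java
import java.util.Map;

// Hypothetical sketch; the real check reads the setting from Spark's configuration.
class LocalShuffleReaderCheck {
  static final String KEY = "spark.sql.adaptive.localShuffleReader.enabled";

  // Returns a warning message when the setting is enabled, or null when it is disabled.
  // An unset key is treated as enabled because Spark's default is true.
  static String warningFor(Map<String, String> conf) {
    boolean enabled = Boolean.parseBoolean(conf.getOrDefault(KEY, "true"));
    if (!enabled) {
      return null;
    }
    return "Detected " + KEY + " (default is true) is enabled";
  }
}
```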
### Does this PR introduce _any_ user-facing change?
Yes, users will see a warning message in the Driver log when `spark.sql.adaptive.localShuffleReader.enabled` is true.
### How was this patch tested?
Review.
Closes #1721 from pan3793/CELEBORN-801.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Reuse ```DataPusher#idleQueue``` by pooling in ```SendBufferPool``` to avoid too many ```byte[]```
objects in ```PushTask```.
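The reuse pattern can be sketched as follows, assuming each idle queue is pre-filled with tasks that own a `byte[]`. `PushTaskSketch` and `IdleQueuePoolSketch` are hypothetical names and do not mirror the real `DataPusher` or `SendBufferPool` code.

```java
import java.util.ArrayDeque;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a push task that owns a reusable byte[] buffer.
class PushTaskSketch {
  final byte[] buffer;

  PushTaskSketch(int bufferSize) {
    this.buffer = new byte[bufferSize];
  }
}

// Hypothetical pool that hands out pre-filled idle queues and takes them back for reuse.
class IdleQueuePoolSketch {
  private final ArrayDeque<LinkedBlockingQueue<PushTaskSketch>> pooled = new ArrayDeque<>();
  private final int queueCapacity;
  private final int bufferSize;
  private int allocatedTasks = 0;

  IdleQueuePoolSketch(int queueCapacity, int bufferSize) {
    this.queueCapacity = queueCapacity;
    this.bufferSize = bufferSize;
  }

  // Reuse a pooled queue when available; otherwise build a new one and fill it with tasks.
  synchronized LinkedBlockingQueue<PushTaskSketch> acquire() {
    LinkedBlockingQueue<PushTaskSketch> queue = pooled.pollFirst();
    if (queue != null) {
      return queue;
    }
    queue = new LinkedBlockingQueue<>(queueCapacity);
    for (int i = 0; i < queueCapacity; i++) {
      queue.add(new PushTaskSketch(bufferSize));
      allocatedTasks++;
    }
    return queue;
  }

  // Called when a pusher finishes, so the next pusher can reuse the same tasks and buffers.
  synchronized void release(LinkedBlockingQueue<PushTaskSketch> queue) {
    pooled.addLast(queue);
  }

  synchronized int allocatedTasks() {
    return allocatedTasks;
  }
}
```

With this shape, a second pusher that starts after the first one finishes allocates no new `byte[]` objects, which is the effect the memory dumps in this PR measure.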
### Why are the changes needed?
I tested with 3T TPCDS. Before this PR, containers were killed because of OOM, and GC time was about 9.6h. For the Executors still alive, I dumped the memory and saw about 20,000 PushTask objects and 23356 ```64K``` byte[] objects, totaling around 1.7G:

After this PR, no container was killed because of OOM, and GC time was about 8.6h. I also dumped an Executor and found
3584 PushTask objects and 5783 ```64K``` byte[] objects, totaling around 361M:

Also, before this PR, total execution time is ```3313.8s```, after this PR, total execution time is ```3229.5s```.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passes GA and Manual test.
Closes #1722 from waitinfuture/802.
Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
1. Decrease writeTime metric sampling frequency to improve perf
2. Set default value of ```celeborn.<module>.push.timeoutCheck.threads``` and ```celeborn.<module>.fetch.timeoutCheck.threads``` to 4
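One common way to lower the sampling frequency of a timing metric is to measure only every N-th call, as in the sketch below. This is an assumption about the general technique, not a description of how Celeborn samples writeTime; `SampledWriteTimer` and its members are hypothetical.

```java
// Hypothetical sketch of sampled timing; not Celeborn's actual metric code.
class SampledWriteTimer {
  private final int sampleInterval;
  private long calls = 0;
  private long samples = 0;
  private long sampledNanos = 0;

  SampledWriteTimer(int sampleInterval) {
    this.sampleInterval = sampleInterval;
  }

  // Always runs the write, but only pays for System.nanoTime() on every
  // sampleInterval-th call (the 1st, the (N+1)-th, and so on).
  void timedWrite(Runnable write) {
    boolean sample = (calls++ % sampleInterval) == 0;
    if (!sample) {
      write.run();
      return;
    }
    long start = System.nanoTime();
    write.run();
    sampledNanos += System.nanoTime() - start;
    samples++;
  }

  long samples() {
    return samples;
  }

  // Rough estimate of the total write time, extrapolated from the sampled calls.
  long estimatedTotalNanos() {
    return samples == 0 ? 0 : (sampledNanos / samples) * calls;
  }
}
```

The trade-off is accuracy for overhead: the estimate assumes the sampled calls are representative of the rest.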
### Why are the changes needed?
The test cases are as follows:
case 1: ```spark.sparkContext.parallelize(1 to 8000, 8000).flatMap( _ => (1 to 15000000).iterator.map(num => num)).repartition(8000).count``` // shuffle 1.1T data
case 2: ```spark.sparkContext.parallelize(1 to 8000, 8000).flatMap( _ => (1 to 30000000).iterator.map(num => num)).repartition(8000).count``` // shuffle 2.2T data
The e2e times of the shuffle write stage are as follows:
| |Sort pusher before|Sort pusher after|Hash pusher before|Hash pusher after|
|----|----|----|----|----|
|case 1|4.4min|4.1min|4.4min|3.9min|
|case 2|9.1min|8.4min|9.7min|8.5min|
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Passes GA and manual test.
Closes #1718 from waitinfuture/797.
Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
…r Spark2
### What changes were proposed in this pull request?
As title
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Pass GA and manual test.
Closes #1717 from shujiewu/CELEBORN-792.
Authored-by: 无迹 <peter.wsj@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
As title
### Why are the changes needed?
https://github.com/apache/incubator-celeborn/pull/1699#discussion_r1259137323
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Pass GA
Closes #1704 from cfmcgrady/insert-record-followup.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
As title
### Why are the changes needed?
[comment](7adf1fca41 (r121138008))
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
New UT
Closes #1699 from cfmcgrady/insert-record.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
As title
### Why are the changes needed?
```shell
$ bin/spark-shell --master local[2]
23/07/06 16:11:30 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/07/06 16:11:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context available as 'sc' (master = local[2], app id = local-1688631101733).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.1
      /_/
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sparkContext.getConf.get("spark.executor.cores")
java.util.NoSuchElementException: spark.executor.cores
  at org.apache.spark.SparkConf.$anonfun$get$1(SparkConf.scala:245)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.SparkConf.get(SparkConf.scala:245)
  ... 47 elided
scala>
```
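The session above shows that `spark.executor.cores` may be unset in local mode, so reading it without a fallback throws. One possible fallback is sketched below, using a plain `Map` in place of `SparkConf`. The class `ExecutorCoresSketch`, the choice to derive the value from a `local[N]` master string, and the final default of 1 are all assumptions for illustration, not necessarily what this PR does.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; a plain Map stands in for SparkConf.
class ExecutorCoresSketch {
  private static final Pattern LOCAL_N = Pattern.compile("local\\[(\\d+|\\*)\\]");

  static int executorCores(Map<String, String> conf) {
    // Prefer the explicit setting when it is present.
    String explicit = conf.get("spark.executor.cores");
    if (explicit != null) {
      return Integer.parseInt(explicit);
    }
    // Otherwise try to derive the core count from a local[N] or local[*] master.
    String master = conf.getOrDefault("spark.master", "");
    Matcher m = LOCAL_N.matcher(master);
    if (m.matches()) {
      String n = m.group(1);
      return n.equals("*") ? Runtime.getRuntime().availableProcessors() : Integer.parseInt(n);
    }
    return 1; // assumed default when nothing else is known
  }
}
```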
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CelebornPipelineSortSuite should cover this change
Closes #1685 from cfmcgrady/local-core-number.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Rename the remaining RSS-related class names, file names, etc.
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1664 from AngersZhuuuu/CELEBORN-751.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
### What changes were proposed in this pull request?
Provide a new SparkShuffleManager to replace RssShuffleManager in the future
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1667 from AngersZhuuuu/CELEBORN-754.
Lead-authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Currently, SortBasedShuffleWriter does not update peakMemoryUsedBytes; this PR adds support for it.
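Tracking a peak value amounts to keeping a running maximum, as in the sketch below. `PeakMemoryTracker` is a hypothetical helper for illustration; how the writer actually obtains its current memory usage is not shown.

```java
// Hypothetical helper; the real writer reports the peak through Spark's task metrics.
class PeakMemoryTracker {
  private long peakBytes = 0;

  // Call whenever the writer's current memory usage changes.
  void update(long currentBytes) {
    if (currentBytes > peakBytes) {
      peakBytes = currentBytes;
    }
  }

  long peakBytes() {
    return peakBytes;
  }
}
```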
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1632 from AngersZhuuuu/CELEBORN-720.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
As title
### Why are the changes needed?
Before this PR, `liborg_apache_celeborn_shaded_netty_transport_native_epoll_aarch_64.so` could not be loaded correctly.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually tested
```shell
> tar zxf celeborn-client-spark-3-shaded_2.12-0.4.0-SNAPSHOT.jar
> find * -name "*.so"
META-INF/native/liborg_apache_celeborn_shaded_netty_transport_native_epoll_aarch_64.so
META-INF/native/liborg_apache_celeborn_shaded_netty_transport_native_epoll_x86_64.so
```
Closes #1625 from cfmcgrady/typo.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
1. Separated push data timeout tests and push merge data timeout tests in `PushDataTimeoutTest`
2. Updated the test result assertions
3. Reworked `pushdata timeout will add to blacklist`
### Why are the changes needed?
ensure that the timeout behavior is correctly implemented
https://github.com/apache/incubator-celeborn/pull/1613#discussion_r1236423721
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
existing tests
Closes #1620 from cfmcgrady/push-timeout-test.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Make appUniqueId a member of ShuffleClientImpl and remove applicationId from RPC messages within the client side, so the change does not cause compatibility issues.
### Why are the changes needed?
Currently, the Celeborn Client is bound to a single application id, so there's no need to pass applicationId around in many RPC messages on the client side.
### Does this PR introduce _any_ user-facing change?
In some logs the application id will not be printed, which should not be a problem.
### How was this patch tested?
UTs.
Closes #1621 from waitinfuture/appid.
Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
As title.
### Why are the changes needed?
https://github.com/apache/incubator-celeborn/pull/1585#issuecomment-1589164128
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
tested locally.
Closes #1604 from cfmcgrady/hash-based-writer-metrics.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
1. Use `<arg>-Ywarn-unused-import</arg>` to find and remove some unused imports.
The flag cannot be kept enabled at this stage because we have the following code:
```
// Can Remove this if celeborn don't support scala211 in future
import org.apache.celeborn.common.util.FunctionConverter._
```
2. Fix non-exhaustive Scala pattern matches to avoid `scala.MatchError`
3. Fix some Scala compilation warnings
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1600 from cxzl25/cleanup_code.
Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
1. Renamed `RssShuffleWriterSuiteJ` to `CelebornShuffleWriterSuiteBase`, which now serves as an abstract base class.
2. Added two new classes, `HashBasedShuffleWriterSuiteJ` and `SortBasedShuffleWriterSuiteJ`. These classes extend `CelebornShuffleWriterSuiteBase` and provide suites for testing hash-based and sort-based shuffle writers.
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1570 from cfmcgrady/sort-based-writer-suite.
Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
This PR upgrades
- `mockito` from 1.10.19 and 3.6.0 to 4.11.0
- `scalatest` from 3.2.3 to 3.2.16
- `mockito-scalatest` from 1.16.37 to 1.17.14
### Why are the changes needed?
Housekeeping, making test dependencies up-to-date and unified.
### Does this PR introduce _any_ user-facing change?
No, it only affects test.
### How was this patch tested?
Pass GA.
Closes #1562 from pan3793/CELEBORN-650.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
- Replace index-based item access with an iterator for LinkedList.
- Always try to remove a buffer if SendBufferPool does not have a matching candidate. This reduces the total number of buffers in the worst case from `capacity+N-1` to `capacity`.
- Some logs and code polish.
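The first two points can be illustrated with the sketch below. `BufferPoolSketch` is a hypothetical simplification, not the real `SendBufferPool`: it walks the `LinkedList` once with an iterator (an indexed `get(i)` in a loop would rescan the list on each call) and evicts one pooled buffer when no candidate matches.

```java
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical simplification of a size-matched buffer pool.
class BufferPoolSketch {
  private final int capacity;
  private final LinkedList<byte[]> buffers = new LinkedList<>();

  BufferPoolSketch(int capacity) {
    this.capacity = capacity;
  }

  synchronized byte[] acquire(int size) {
    // Single pass with an iterator; it.remove() unlinks the match in place.
    Iterator<byte[]> it = buffers.iterator();
    while (it.hasNext()) {
      byte[] candidate = it.next();
      if (candidate.length == size) {
        it.remove();
        return candidate;
      }
    }
    // No match: evict one pooled buffer so mismatched sizes do not pile up
    // on top of the newly allocated buffer.
    if (!buffers.isEmpty()) {
      buffers.removeFirst();
    }
    return new byte[size];
  }

  // Keep the buffer only while the pool is below its capacity.
  synchronized void release(byte[] buffer) {
    if (buffers.size() < capacity) {
      buffers.addLast(buffer);
    }
  }

  synchronized int pooledCount() {
    return buffers.size();
  }
}
```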
### Why are the changes needed?
Improve performance and logs, reduce memory consumption.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass GA.
Closes #1560 from pan3793/CELEBORN-648.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
`mapStatusRecords` is required in Spark 2 for constructing `MapStatus` when AQE is enabled, but not in Spark 3, so remove it to save memory and compute resources.
This PR also simplifies the `for loop` code.
### Why are the changes needed?
Remove unnecessary variables to save resources and clean up code.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass GA.
Closes #1564 from pan3793/CELEBORN-654.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Rename variable `newAppId` to `appUniqueId` in Spark client.
### Why are the changes needed?
Make the variable name intuitive.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass GA.
Closes #1565 from pan3793/CELEBORN-655.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
### What changes were proposed in this pull request?
Remove support for `rss.*` configuration alias
### Why are the changes needed?
The legacy `rss.*` configuration alias was added when Celeborn entered the Apache Incubator, to simplify users' migration from RSS to Celeborn.
Many configuration changes have happened since Celeborn 0.2, and the `rss.*` configuration alias has become less helpful, so remove it to clean up the code.
### Does this PR introduce _any_ user-facing change?
Yes, but this is expected; the `rss.*` compatibility has never been documented.
### How was this patch tested?
Pass GA.
Closes #1547 from pan3793/CELEBORN-637.
Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
### What changes were proposed in this pull request?
Fix columnar shuffle codegen exception. This is a refactoring of #1523.
Closes #1543 from kerwin-zk/issue-620.
Authored-by: xiyu.zk <xiyu.zk@alibaba-inc.com>
Signed-off-by: xiyu.zk <xiyu.zk@alibaba-inc.com>