celeborn/master
Wang, Fei 680b072b5b [CELEBORN-1753] Optimize the code for exists and find method
### What changes were proposed in this pull request?

Optimize the code for `exists` and `find`.

1. Speed up looking up `WorkerInfo` by `workerUniqueId` by using a direct map lookup instead of looping over the collection:
 74c1ec0a7f/client/src/main/scala/org/apache/celeborn/client/LifecycleManager.scala (L65-L66)

Change the type to:
```
type ShuffleAllocatedWorkers =
  ConcurrentHashMap[Int, ConcurrentHashMap[String, ShufflePartitionLocationInfo]]
```
And store the `WorkerInfo` in `ShufflePartitionLocationInfo`:
```
class ShufflePartitionLocationInfo(val workerInfo: WorkerInfo) {
...
}
```

This way, the `WorkerInfo` can be retrieved quickly by the worker's unique ID.

2. Reduce the loop cost of the code below: 33ba0e02f5/worker/src/main/scala/org/apache/celeborn/service/deploy/worker/Controller.scala (L455-L466)
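
As a minimal sketch of the lookup described in item 1, the snippet below shows how keying the inner map by worker unique ID turns the search into two hash lookups. The `WorkerInfo` case class, its `toUniqueId` format, and the `findWorker` helper are simplified, hypothetical stand-ins for illustration, not the actual Celeborn definitions.

```scala
import java.util.concurrent.ConcurrentHashMap

// Simplified stand-in for Celeborn's WorkerInfo (assumption: the real class has more fields).
final case class WorkerInfo(host: String, rpcPort: Int) {
  // Hypothetical unique-id format, used only for this sketch.
  def toUniqueId: String = s"$host:$rpcPort"
}

// Mirrors the shape shown above: the location info keeps a reference to its WorkerInfo.
class ShufflePartitionLocationInfo(val workerInfo: WorkerInfo)

object LookupSketch {
  // shuffleId -> (workerUniqueId -> location info)
  type ShuffleAllocatedWorkers =
    ConcurrentHashMap[Int, ConcurrentHashMap[String, ShufflePartitionLocationInfo]]

  // Hypothetical helper: resolves a WorkerInfo with two hash lookups
  // instead of scanning every worker allocated to the shuffle.
  def findWorker(
      allocated: ShuffleAllocatedWorkers,
      shuffleId: Int,
      workerUniqueId: String): Option[WorkerInfo] =
    for {
      workers <- Option(allocated.get(shuffleId)) // None if the shuffle is unknown
      info <- Option(workers.get(workerUniqueId)) // None if the worker is unknown
    } yield info.workerInfo

  def main(args: Array[String]): Unit = {
    val worker = WorkerInfo("host-a", 9097)
    val workers = new ConcurrentHashMap[String, ShufflePartitionLocationInfo]()
    workers.put(worker.toUniqueId, new ShufflePartitionLocationInfo(worker))

    val allocated = new ShuffleAllocatedWorkers()
    allocated.put(0, workers)

    println(findWorker(allocated, 0, worker.toUniqueId))
    println(findWorker(allocated, 1, worker.toUniqueId))
  }
}
```

Wrapping each `get` in `Option` handles the `null` that `ConcurrentHashMap` returns for a missing key, so callers do not need separate `containsKey` checks.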

### Why are the changes needed?

To improve performance and to address the following review comments:
https://github.com/apache/celeborn/pull/2959#pullrequestreview-2466200199
https://github.com/apache/celeborn/pull/2959#issuecomment-2505137166

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

GitHub Actions (GA).

Closes #2962 from turboFei/CELEBORN_1753_exists.

Lead-authored-by: Wang, Fei <fwang12@ebay.com>
Co-authored-by: Fei Wang <cn.feiwang@gmail.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
2024-12-23 17:56:20 +08:00
src [CELEBORN-1753] Optimize the code for exists and find method 2024-12-23 17:56:20 +08:00
pom.xml [CELEBORN-1746] Reduce the size of aws dependencies 2024-11-28 19:45:01 +08:00