celeborn/tests
sychen 362865f2ce [CELEBORN-1571] Fix flaky test - pushdata timeout will add to pushExcludedWorker
### What changes were proposed in this pull request?

### Why are the changes needed?

Because the worker port is already in use, the worker status tracked by the driver may change from shutdown to unknown, which causes the test to fail.

https://github.com/apache/celeborn/actions/runs/10465286274/job/28980278764
```
- celeborn spark integration test - pushdata timeout will add to pushExcludedWorkers *** FAILED ***
  WORKER_UNKNOWN did not equal PUSH_DATA_TIMEOUT_PRIMARY, and WORKER_UNKNOWN did not equal PUSH_DATA_TIMEOUT_REPLICA (PushDataTimeoutTest.scala:150)
```

unit-tests.log
```
24/08/20 05:28:30,400 INFO [celeborn-dispatcher-7] Master: Receive ReportNodeFailure [
Host: localhost
RpcPort: 41487
PushPort: 34259
FetchPort: 45713
ReplicatePort: 35107
InternalPort: 41487

24/08/20 05:29:29,414 WARN [celeborn-client-lifecycle-manager-change-partition-executor-3] WorkerStatusTracker:
Reporting failed workers:
Host:localhost:RpcPort:42267:PushPort:43741:FetchPort:46483:ReplicatePort:43587   PUSH_DATA_TIMEOUT_PRIMARY   2024-08-19T22:29:29.414-0700
Current unknown workers:
Host:localhost:RpcPort:41487:PushPort:34259:FetchPort:45713:ReplicatePort:35107:InternalPort:41487   2024-08-19T22:29:29.108-0700
Current shutdown workers:
Host:localhost:RpcPort:41487:PushPort:34259:FetchPort:45713:ReplicatePort:35107:InternalPort:41487
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
GitHub Actions (GA)

Closes #2697 from cxzl25/CELEBORN-1571.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
2024-10-11 14:13:49 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| flink-it | [CELEBORN-1504] Support for Apache Flink 1.16 | 2024-07-15 10:44:16 +08:00 |
| kubernetes-it | [CELEBORN-1565] Introduce warn-unused-import in Scala | 2024-08-29 13:43:17 +08:00 |
| mr-it | [CELEBORN-1434] Support MRAppMasterWithCeleborn to disable job recovery and job reduce slow start by default | 2024-05-22 15:32:41 +08:00 |
| spark-it | [CELEBORN-1571] Fix flaky test - pushdata timeout will add to pushExcludedWorker | 2024-10-11 14:13:49 +08:00 |