1099 Commits

Author SHA1 Message Date
Angerszhuuuu
7572c5b261
[CELEBORN-609] Refactor master's worker info HTTP request (#1514) 2023-05-24 18:15:39 +08:00
Angerszhuuuu
4f85d80687
[CELEBORN-606] Refine CommitHandler's noisy log (#1511) 2023-05-24 15:25:10 +08:00
zhongqiangchen
e6978c380b
[CELEBORN-603] Update version to 0.4.0-SNAPSHOT (#1507) 2023-05-24 14:31:10 +08:00
Leo Li
de97ad26ce
[CELEBORN-599] Consolidate calculation of mount point (#1505)
* [CELEBORN-599] Fix worker dirs get mount point

* update

* update

---------

Co-authored-by: liyihe <liyihe@bigo.sg>
2023-05-23 14:06:02 +08:00
Angerszhuuuu
6619015a63
[CELEBORN-596] Worker don't need to update disk max slots (#1502) 2023-05-23 10:30:35 +08:00
minseok
6e166662f1
[CELEBORN-598] Fix Typos in README 2023-05-21 19:36:38 +08:00
Angerszhuuuu
d244f44518
[CELEBORN-593] Refine some RPC related default configurations (#1498) 2023-05-19 18:23:12 +08:00
Angerszhuuuu
615d9a111f
[CELEBORN-487] Remove wrong space of config SHUFFLE_CLIENT_PUSH_BLACK (#1500) 2023-05-19 14:27:57 +08:00
Ethan Feng
ac78afdc4e
[CELEBORN-594] Eliminate Ratis noisy logs. (#1499) 2023-05-19 14:05:52 +08:00
Angerszhuuuu
aa817bdbeb
[CELEBORN-446][FOLLOWUP] Check rack should use nextMasterIndex.(#1496) 2023-05-18 16:25:14 +08:00
Angerszhuuuu
42219aeb2a
[CELEBORN-592][REFACTOR] Refactor PbSerDeUtils's some foreach code format (#1497) 2023-05-18 16:22:14 +08:00
Shuang
6eabc519b3
[CELEBORN-591] RatisSystem need decrease no leader timeout configuration. (#1495) 2023-05-18 14:49:06 +08:00
Angerszhuuuu
811e192bbd
[CELEBORN-446] Support rack aware during assign slots for ROUNDROBIN (#1370) 2023-05-18 13:58:51 +08:00
Kaijie Chen
67bc420801
[CELEBORN-558] Bump Ratis to 2.5.1 and fix API changes (#1464) 2023-05-18 11:08:37 +08:00
Ethan Feng
7015d2463a
[CELEBORN-583] Merge pooled memory allocators. (#1490) 2023-05-18 10:37:30 +08:00
Angerszhuuuu
a22c61e479
[CELEBORN-582] Celeborn should handle InterruptedException during kill task properly (#1486) 2023-05-17 18:17:41 +08:00
Angerszhuuuu
791d72d45f
[CELEBORN-590] Remove hadoop prefix of WORKER_WORKING_DIR (#1494) 2023-05-17 17:57:27 +08:00
Angerszhuuuu
7c6cb2f3bb
[CELEBORN-588] Remove test conf's category (#1491) 2023-05-17 17:37:28 +08:00
Cheng Pan
3cc296ef4f
[CELEBORN-589][INFRA] Using Apache CDN to download maven (#1492) 2023-05-17 15:46:38 +08:00
Leo Li
65cdb3eba4
[CELEBORN-585] Create if not exists worker recoverPath when graceful shutdown is enabled (#1487) 2023-05-17 11:29:09 +08:00
Angerszhuuuu
64a3534f71
[CELEBORN-584] Worker side should expose push/replicate/fetch Netty allocator metrics (#1489) 2023-05-16 17:51:33 +08:00
Shuang
f83304c337
[CELEBORN-581][Flink] Support JobManager failover. (#1485) 2023-05-16 14:51:53 +08:00
Angerszhuuuu
d657f8268a
[CELEBORN-586] Add SystemMiscSource to indicate system running status (#1488) 2023-05-16 14:03:07 +08:00
zhongqiangchen
5769c3fdc7
[CELEBORN-552] Add HeartBeat between the client and worker to keep alive (#1457) 2023-05-10 19:35:51 +08:00
Shuang
fb753fd48e
[CELEBORN-573] Guarantee resource/app/worker change persistent to raft in Ha Mode. (#1477) 2023-05-10 14:28:52 +08:00
Angerszhuuuu
778b5440bc
[CELEBORN-556][BUG] ReserveSlot should not use default RPC time out since register shuffle max timeout is network timeout (#1461) 2023-05-10 12:29:06 +08:00
Shuang
2fea818fa8
[CELEBORN-579] revert Destroy Message rename for compatibility. (#1482) 2023-05-09 15:24:02 +08:00
Angerszhuuuu
5f7e1ce8e2
[CELEBORN-578][REFACTOR] Refine commit file's log to indicate more clear about empty partitions (#1481) 2023-05-08 18:21:46 +08:00
Ethan Feng
3e0d779962
[CELEBORN-576] Add static identity provider and manually settable identity provider for non-hadoop environment. (#1480) 2023-05-08 17:29:01 +08:00
Ethan Feng
91b757555e
[CELEBORN-570] Update docs about monitor and deployment. (#1478) 2023-05-08 17:07:42 +08:00
Angerszhuuuu
a315a2eb41
[CELEBORN-575] PartitionLocationInfo change cause quick upgrade impacted (#1479) 2023-05-08 16:56:42 +08:00
Shuang
78a32fe90f
[CELEBORN-567] Timeout workers/app need consider long leader election period (#1474) 2023-05-06 17:43:16 +08:00
Angerszhuuuu
c0a9578d9f
[CELEBORN-563] Remove unnecessary code (#1469) 2023-05-06 11:25:31 +08:00
Ethan Feng
114b1b4d62
[CELEBORN-548][FLINK] Support flink 1.17. (#1472) 2023-05-05 23:00:49 +08:00
Ethan Feng
e24569cbb7
[CELEBORN-569] Update netty version to 4.1.92. (#1476) 2023-05-05 20:01:37 +08:00
Ethan Feng
596d276323
Revert "[CELEBORN-569] Update netty version to 4.1.92."
This reverts commit a95936906b.
2023-05-05 12:34:37 +08:00
Ethan Feng
a95936906b
[CELEBORN-569] Update netty version to 4.1.92. 2023-05-05 12:30:01 +08:00
Angerszhuuuu
783d4e5dc5
[CELEBORN-551] Remove unnecessary ShuffleClient.get() (#1456) 2023-05-04 20:47:45 +08:00
Ethan Feng
93d2f106e0
[CELEBORN-548][FLINK] Support flink 1.15. (#1463) 2023-05-04 15:23:59 +08:00
Angerszhuuuu
a108d6f837
[CELEBORN-559][IMPROVEMENT] createReader should also wait for retry when change to same peer (#1465) 2023-05-04 10:51:15 +08:00
Ethan Feng
58aa0ba48f
[CELEBORN-566] Refine docs to eliminate misleading configs. (#1473) 2023-05-03 17:25:59 +08:00
Angerszhuuuu
ef4c12e0fe
[CELEBORN-565] FETCH_MAX_RETRIES should double when enable replicates (#1471) 2023-04-28 14:27:35 +08:00
Angerszhuuuu
8d933691ae
[CELEBORN-479][FOLLOWUP] Add push task should check if loc is null (#1404) 2023-04-28 11:19:35 +08:00
Angerszhuuuu
bfce6052d7
[CELEBORN-560][FOLLOWUP] Follow the original design for handling rerun & speculative task after handleStageEnd (#1468) 2023-04-28 11:18:42 +08:00
Aaron Wang
6dad856fec
[CELEBORN-564] Correct stop-all.sh comments (#1470) 2023-04-28 09:38:59 +08:00
Angerszhuuuu
7a4f2ebd8a
[CELEBORN-547] Refactor request related API (#1452) 2023-04-27 16:25:41 +08:00
Angerszhuuuu
ce21a738a9
[CELEBORN-560][BUG] Rerun task in spark later then RSS stageEnd cause NPE then job failed (#1466) 2023-04-27 14:16:32 +08:00
Angerszhuuuu
be84e8ba0d
[CELEBORN-562][REFACTOR] Rename Destroy and DestroyResponse to make it more clear (#1467) 2023-04-27 12:31:32 +08:00
Shuang
64a4f7274c
[CELEBORN-554][Tuning] Improve For LM to avoid reserve/commit empty worker resources (#1459) 2023-04-26 18:04:50 +08:00
Angerszhuuuu
4bbc8aec4f
[CELEBORN-555][REFACTOR] Avoid prin noisy blacklist info when record blacklist (#1460)
* [CELEBORN-555][REFACTOR] Avoid prin noisy blacklist info when record blacklist
2023-04-26 16:45:44 +08:00