Angerszhuuuu
01dc9d4259
[CELEBORN-79][REFACTOR] Remove unused responseCheckerThread from LifecycleManager ( #1022 )
2022-11-29 15:25:37 +08:00
Angerszhuuuu
d26e73209b
[CELEBORN-76] Support batch commit hard split partition before stage end
2022-11-29 13:09:01 +08:00
Angerszhuuuu
13f4ce2be6
[CELEBORN-68][FOLLOWUP] Retry on same partition location should have a retry wait interval ( #1017 )
2022-11-28 20:17:08 +08:00
Keyong Zhou
d381df71f8
[CELEBORN-70] Add epoch for each commitFiles request ( #1012 )
2022-11-27 21:05:14 +08:00
nafiy
817eee969f
[CELEBORN-58][REFACTOR] Aggregate reserve failed logs together ( #1005 )
2022-11-26 20:56:39 +08:00
Keyong Zhou
f8bb2cd47d
[CELEBORN-12]Retry on CommitFile request ( #1011 )
2022-11-26 20:56:24 +08:00
Keyong Zhou
9214b82181
[CELEBORN-68] Client might fetch incorrect data chunk ( #1010 )
2022-11-26 18:06:06 +08:00
Ethan Feng
93dbf3f8b1
[CELEBORN-67] Revert "Fix fetch incorrect data chunk" related commits ( #1006 )
* Revert "[CELEBORN-50][FOLLOWUP] Channel inactive may cause new client use old stream id to fetch data (#999)"
This reverts commit 1e8f6dc5e8.
* Revert "[CELEBORN-50] Channel inActive may cause new client use old stream id to fetch data cause IllegalStateException. (#1000)"
This reverts commit f1c4d675d6.
* Revert "[CELEBORN-49] Deadlock when kill worker in shuffle read (#998)"
This reverts commit 0be4b3399c.
* Revert "[CELEBORN-47][IMPROVEMENT] Refine logs about tracking fetch chunk (#995)"
This reverts commit 2b05228871.
* Revert "[BUG] Fix fetch incorrect data chunk (#926)"
This reverts commit 6f043f8a
* Revert "[ISSUE-925][FOLLOWUP] Refactor class name of RetryingChunkReceiveCallback (#954)"
This reverts commit 64e8ebf1
2022-11-25 20:57:47 +08:00
nafiy
fe13e9e261
[CELEBORN-59][REFACTOR] Support send destroy slots request in parallel ( #1004 )
2022-11-25 18:26:05 +08:00
Angerszhuuuu
1e8f6dc5e8
[CELEBORN-50][FOLLOWUP] Channel inactive may cause new client use old stream id to fetch data ( #999 )
* [CELEBORN-48][BUG] Channel inactive may cause new client use old stream id to fetch data
2022-11-23 18:22:06 +08:00
Ethan Feng
f1c4d675d6
[CELEBORN-50] Channel inActive may cause new client use old stream id to fetch data cause IllegalStateException. ( #1000 )
2022-11-23 18:07:57 +08:00
Keyong Zhou
0be4b3399c
[CELEBORN-49] Deadlock when kill worker in shuffle read ( #998 )
2022-11-23 17:31:05 +08:00
Angerszhuuuu
2b05228871
[CELEBORN-47][IMPROVEMENT] Refine logs about tracking fetch chunk ( #995 )
2022-11-23 11:56:10 +08:00
Keyong Zhou
cfc1fa15bd
[CELEBORN-46] Refine log for RssInputStream.close() ( #994 )
2022-11-22 22:01:08 +08:00
Shuang
1656458788
[CELEBORN-14] [ISSUE-955] support register attempt map task ( #984 )
2022-11-22 15:23:20 +08:00
Angerszhuuuu
5ec278f99a
[ISSUE-987][FEATURE] During worker shutdown, return HARD_SPLIT for all existed partition ( #988 )
2022-11-22 14:29:55 +08:00
Shuang
fb6d1de108
[CELEBORN-8] [ISSUE-952][FEATURE] support register shuffle task in map partition mode ( #973 )
2022-11-16 21:46:19 +08:00
Angerszhuuuu
64e8ebf158
[ISSUE-925][FOLLOWUP] Refactor class name of RetryingChunkReceiveCallback ( #954 )
2022-11-11 14:00:47 +08:00
leesf
0b8376e2c7
Cleanup some code ( #943 )
2022-11-11 13:58:39 +08:00
Ethan Feng
6f043f8ae9
[BUG] Fix fetch incorrect data chunk ( #926 )
2022-11-09 22:31:39 +08:00
leesf
3699683a3b
Fix and migrate some configs ( #927 )
2022-11-07 09:41:38 +08:00
Angerszhuuuu
38e15d89e6
[ISSUE-902][IMPROVEMENT][FOLLOWUP] LifecycleManager should reserve blacklist with irrecoverable status ( #914 )
2022-11-04 15:54:45 +08:00
Angerszhuuuu
e68ca75a9e
[ISSUE-902][BUG] LifecycleManager should not reallocate slots in failed worker during retry ( #906 )
2022-11-02 21:07:28 +08:00
leesf
f1694f3d20
[MINOR][CLEANUP] clean up some code in LifecycleManager and ShuffleClientImpl ( #896 )
2022-11-01 11:40:19 +08:00
Angerszhuuuu
87fcfa767f
[ISSUE-887][REFACTOR] Configuration type convert to Enum ( #888 )
* [ISSUE-332][FOLLOWUP] Add deps in worker's pom
* [Refactor] Modify package name of utils to keep consistence
* [Refactor] Modify package name of utils to keep consistence
* [REFACTOR] Remove unused isRegistered in controller
* [ISSUE-887][REFACTOR] Configuration type convert to Enum
* update
* update
* Update RssShuffleManager.java
2022-10-29 13:41:06 +08:00
Cheng Pan
d7be6006e7
Migrate network related conf to structured conf system ( #875 )
* Migrate network related conf to structured conf system
* migrate
* fix
* fix
* worker
* fix
* nit
* review
* nit
2022-10-28 10:45:52 +08:00
Angerszhuuuu
f9ecde3b2b
[ISSUE-863][BUG]LifecycleManager should ignore change partition request when shuffle ended and not remove workersnapshot when commit success ( #864 )
2022-10-27 22:04:18 +08:00
Ethan Feng
8800fc4a8e
[Refactor] Refine rpc cache configs ( #853 )
* refine rpc cache configs.
* update.
* update.
* update.
2022-10-25 20:28:18 +08:00
Ethan Feng
45ef716737
[Feature] Cache GetReducerFileGroupResponse to avoid lifecycle manager oom. ( #792 )
2022-10-25 16:16:44 +08:00
AngersZhuuuu
2ebf873b3c
[ISSUE-845][REFACTOR] Migrate partition split related conf to Celeborn Configuration System ( #846 )
[ISSUE-845][REFACTOR] Migrate partition split related conf to Celeborn Configuration System
2022-10-25 10:54:45 +08:00
AngersZhuuuu
0bd0a3e9f4
[ISSUE-847][REFACTOR] Migrate codec conf to Celeborn Configuration System ( #848 )
* [ISSUE-847][REFACTOR] Migrate codec conf to Celeborn Configuration System
* Update CelebornConf.scala
* follow comments
* update
* update
* update
* Update client.md
2022-10-25 09:16:46 +08:00
AngersZhuuuu
0fdb19065a
[ISSUE-841][REFACTOR] Migrate shuffle client side conf to Celeborn Configuration System ( #842 )
2022-10-24 20:48:48 +08:00
Keyong Zhou
63752e7a37
[BUG] RegisterShuffle should not increase epoch ( #833 )
2022-10-23 23:40:32 +08:00
nafiy
d0058fb2c5
[ISSUE-780][REFACTOR] Refactor PartitionLocation's methods ( #791 )
2022-10-22 22:46:45 +08:00
AngersZhuuuu
f2610e3b6f
[ISSUE-829][REFACTOR] Unify name of PUSH_DATA_FAIL_MAIN ( #830 )
2022-10-21 19:06:33 +08:00
AngersZhuuuu
a773c8e6db
[ISSUE-820][Refactor] Rename RssConf to CelebornConf ( #826 )
2022-10-20 20:13:13 +08:00
AngersZhuuuu
8344479df1
[ISSUE-818][REFACTOR] Move existing RssConf.xxx conf method to RssConf class ( #822 )
* [ISSUE-818][REFACTOR] Move existing RssConf.xxx conf method to RssConf class
Co-authored-by: Ethan Feng <ethan.aquarius.fmx@gmail.com>
2022-10-20 18:10:59 +08:00
Ethan Feng
5c761a8df3
[ISSUE-813][Refactor] Refactor flusher configurations. ( #813 )
* Refactor flusher configurations.
* Refactor flusher configurations.
* Update.
* remove brackets.
* update docs.
* rename.
* update.
* update docs.
* update.
* update.
* update.
* update.
* update.
* update.
* update.
* format.
* update.
* update.
2022-10-20 15:23:17 +08:00
nafiy
a75bce905e
[ISSUE-805][REFACTOR] Remove UserIdentifier out of ControlMessage ( #808 )
2022-10-19 15:32:53 +08:00
AngersZhuuuu
7fedaaeca1
[ISSUE-795][BUG] Batch handle change partition throw NPE ( #796 )
2022-10-19 10:54:08 +08:00
Ethan Feng
bff2a7065b
Keep one copy of roaringbitmap to reduce memory usage. ( #790 )
2022-10-18 13:26:49 +08:00
Cheng Pan
efad4abb5d
Migrate a bunch of configurations ( #786 )
2022-10-18 10:44:01 +08:00
Cheng Pan
ea67f4e060
Introduce categories to ConfigEntry and migrate configurations ( #775 )
2022-10-17 16:56:54 +08:00
Cheng Pan
96e969f46e
[BUILD] Extract project.version to Maven Property ( #772 )
2022-10-16 19:01:40 +08:00
AngersZhuuuu
c9b462dc02
[ISSUE-770][Refactor] Batch handle change partition should ignore empty batch and avoid print log of empty result ( #771 )
2022-10-14 21:49:37 +08:00
AngersZhuuuu
3bad403c8b
[ISSUE-768][REFACTOR] Shuffle data lost should show more clear about lost data in which worker ( #769 )
2022-10-14 11:41:15 +08:00
Cheng Pan
f01a696313
Migrate and refactor configuration for master endpoints ( #752 )
2022-10-11 21:33:21 +08:00
AngersZhuuuu
bbb4f8e225
[ISSUE-306][IMPROVEMENT] Handle change partition request in batch ( #622 )
2022-10-10 18:31:37 +08:00
AngersZhuuuu
f2a234f870
[ISSUE-739][REFACTOR] Use object wrap pb message method ( #740 )
2022-10-09 11:53:48 +08:00
AngersZhuuuu
ae4bb12d5e
[ISSUE-630][REFACTOR] Minor change of storage resource quota, include code style, comment unused code etc.. ( #728 )
2022-10-08 20:15:25 +08:00