leesf
0b8376e2c7
Cleanup some code ( #943 )
2022-11-11 13:58:39 +08:00
Ethan Feng
6f043f8ae9
[BUG] Fix fetch incorrect data chunk ( #926 )
2022-11-09 22:31:39 +08:00
leesf
aac68c3571
Rename RssException to CelebornException ( #938 )
2022-11-08 10:08:21 +08:00
leesf
496f44eda4
Shutdown worker if initialized failed. ( #931 )
2022-11-07 19:33:35 +08:00
Angerszhuuuu
99a7b85708
[ISSUE-932][REFACTOR] Device check should not directly reportError ( #933 )
2022-11-07 15:15:08 +08:00
nafiy
11081eac6c
[ISSUE-879][BUG] When notifyError, should destroy corresponding file writers ( #912 )
2022-11-07 14:01:51 +08:00
Angerszhuuuu
100e0057e8
[ISSUE-921][BUG] Flush Error should report non critical error ( #928 )
2022-11-07 11:56:11 +08:00
leesf
3699683a3b
Fix and migrate some configs ( #927 )
2022-11-07 09:41:38 +08:00
Angerszhuuuu
38e15d89e6
[ISSUE-902][IMPROVEMENT][FOLLOWUP] LifecycleManager should reserve blacklist with irrecoverable status ( #914 )
2022-11-04 15:54:45 +08:00
Angerszhuuuu
ea4ed10e5c
[ISSUE-901][BUG] During worker graceful shutdown, worker should report itself as unavailable and avoid master allocate slots on it. ( #905 )
2022-11-02 16:09:58 +08:00
Zhen Wang
643eb84541
[MINOR] Fix typo ( #898 )
2022-11-01 10:03:15 +08:00
nafiy
ce3dc889fa
[ISSUE-867][BUG] Create writer failed should report non-critical error instead of critical error ( #883 )
2022-10-31 21:23:16 +08:00
nafiy
9b1c70f219
[ISSUE-880][BUG] onTrim when flushFileWriters() should catch each file writer's exception, avoid block flush all file writers ( #894 )
2022-10-31 14:31:22 +08:00
Angerszhuuuu
87fcfa767f
[ISSUE-887][REFACTOR] Configuration type convert to Enum ( #888 )
* [ISSUE-332][FOLLOWUP] Add deps in worker's pom
* [Refactor] Modify package name of utils to keep consistence
* [Refactor] Modify package name of utils to keep consistence
* [REFACTOR] Remove unused isRegistered in controller
* [ISSUE-887][REFACTOR] Configuration type convert to Enum
* update
* update
* Update RssShuffleManager.java
2022-10-29 13:41:06 +08:00
Cheng Pan
d7be6006e7
Migrate network related conf to structured conf system ( #875 )
* migrate
* fix
* fix
* worker
* fix
* nit
* review
* nit
2022-10-28 10:45:52 +08:00
Angerszhuuuu
d283cca4e1
[ISSUE-869][REFACTOR] Migrate partition size/sorter related conf to Celeborn ConfigEntity ( #870 )
2022-10-27 16:49:55 +08:00
Angerszhuuuu
26dcc118c6
[ISSUE-871][REFACTOR] Migrate Worker conf to Celeborn Configuration System ( #873 )
2022-10-27 15:35:29 +08:00
Angerszhuuuu
5333819cb0
[ISSUE-866][BUG] Create File twice should show clear log ( #876 )
2022-10-27 14:52:45 +08:00
nafiy
e44e8c9610
[ISSUE-828][REFACTOR] Migrate memory tracker related configs to ConfigEntry ( #831 )
* Fix based on review
* update doc
* resolve review feedback
* fix
* Fix based on review
* fix based on review
2022-10-25 21:16:53 +08:00
AngersZhuuuu
0bd0a3e9f4
[ISSUE-847][REFACTOR] Migrate codec conf to Celeborn Configuration System ( #848 )
* Update CelebornConf.scala
* follow comments
* update
* update
* update
* Update client.md
2022-10-25 09:16:46 +08:00
Ethan Feng
4df0d4a456
[TEST] Fix unstable LZ4 unit test ( #816 )
2022-10-24 15:36:06 +08:00
Cheng Pan
8d7d397e71
Fix Configuration page and polish naming ( #838 )
* nit
* nit
* comment
2022-10-24 12:46:25 +08:00
Ethan Feng
74843f20a9
[BUG] Fix worker lost caused by UnsupportedOperationException ( #837 )
2022-10-24 11:20:42 +08:00
Keyong Zhou
63752e7a37
[BUG] RegisterShuffle should not increase epoch ( #833 )
2022-10-23 23:40:32 +08:00
Ethan Feng
392a252baa
[FOLLOWUP][ISSUE-813]Update doc and fix typo. ( #825 )
2022-10-22 23:02:22 +08:00
AngersZhuuuu
f2610e3b6f
[ISSUE-829][REFACTOR] Unify name of PUSH_DATA_FAIL_MAIN ( #830 )
2022-10-21 19:06:33 +08:00
nafiy
1a8a36e8fe
[ISSUE-812][Refactor] Migrate metrics system related configs to ConfigEntry ( #821 )
2022-10-21 13:57:58 +08:00
AngersZhuuuu
a773c8e6db
[ISSUE-820][Refactor] Rename RssConf to CelebornConf ( #826 )
2022-10-20 20:13:13 +08:00
AngersZhuuuu
8344479df1
[ISSUE-818][REFACTOR] Move existing RssConf.xxx conf method to RssConf class ( #822 )
Co-authored-by: Ethan Feng <ethan.aquarius.fmx@gmail.com>
2022-10-20 18:10:59 +08:00
Ethan Feng
5c761a8df3
[ISSUE-813][Refactor] Refactor flusher configurations. ( #813 )
* Refactor flusher configurations.
* Refactor flusher configurations.
* Update.
* remove brackets.
* update docs.
* rename.
* update.
* update docs.
* update.
* update.
* update.
* update.
* update.
* update.
* update.
* format.
* update.
* update.
2022-10-20 15:23:17 +08:00
nafiy
1e5bed2da7
[ISSUE-806][REFACTOR] Remove ResourceConsumption out of ControlMessage ( #810 )
* add line before method
* reformat
2022-10-19 17:14:51 +08:00
AngersZhuuuu
23c65a27a9
[ISSUE-798][REFACTOR] Migrate worker-recover related conf to ConfigEntry ( #799 )
2022-10-19 16:42:00 +08:00
nafiy
a75bce905e
[ISSUE-805][REFACTOR] Remove UserIdentifier out of ControlMessage ( #808 )
2022-10-19 15:32:53 +08:00
Cheng Pan
efad4abb5d
Migrate a bunch of configurations ( #786 )
2022-10-18 10:44:01 +08:00
nafiy
0e5beb9562
[ISSUE-774][REFACTOR] Add cache to avoid redundant UserIdentifier object when recover fileinfo ( #781 )
2022-10-17 21:27:54 +08:00
nafiy
0dcf946c9b
[ISSUE-751][REFACTOR] Move userResourceConsumption to WorkerInfo's parameter and format WorkerInfo's toString() ( #767 )
2022-10-17 17:58:39 +08:00
Cheng Pan
ea67f4e060
Introduce categories to ConfigEntry and migrate configurations ( #775 )
2022-10-17 16:56:54 +08:00
Ethan Feng
0959894155
[BUG]Fix rss worker register failure problem. ( #777 )
2022-10-17 09:50:04 +08:00
nafiy
373b4a744a
[ISSUE-750][Refactor] Add UserIdentifier as a field of file info ( #759 )
2022-10-13 23:15:44 +08:00
Cheng Pan
5829bda21a
Rework and migrate HA configuration system ( #763 )
2022-10-13 22:35:01 +08:00
Cheng Pan
f01a696313
Migrate and refactor configuration for master endpoints ( #752 )
2022-10-11 21:33:21 +08:00
nafiy
3ed38f1e72
[ISSUE-642][FEATURE] Worker storage manager stores user to shuffle key relation and recovers from LevelDB ( #706 )
2022-10-10 18:18:34 +08:00
AngersZhuuuu
13aeb4b644
[ISSUE-736][BUG] Heartbeat worker should update disk info into WorkInfo too to keep consistence with master ( #737 )
2022-10-09 15:41:01 +08:00
AngersZhuuuu
f2a234f870
[ISSUE-739][REFACTOR] Use object wrap pb message method ( #740 )
2022-10-09 11:53:48 +08:00
Ethan Feng
6deda248ac
[REFACTOR]move lifecycle manager to correct package. ( #730 )
2022-10-08 18:14:08 +08:00
Cheng Pan
ab16b4f101
[INFRA] Rename modules w/ celeborn prefix ( #723 )
2022-10-08 08:05:57 +08:00