106 Commits

Author SHA1 Message Date
Fei Wang
c609c0ebaa
[MINOR] Fix typo and remove unused code (#1381)
* fix typo

* remove unused
2023-03-25 23:20:33 +08:00
Angerszhuuuu
56d796638f
[CELEBORN-438] Move ServletPath to MetricsSytsem (#1364) 2023-03-20 18:22:40 +08:00
Angerszhuuuu
e61130d397
[CELEBORN-423][FOLLOWUP] Format http request (#1353)
* [CELEBORN-423][FOLLOWUP] Format http request
2023-03-15 16:30:23 +08:00
Angerszhuuuu
1f56a5e5d1
[CELEBORN-423] Format http request result (#1349) 2023-03-15 10:32:01 +08:00
Angerszhuuuu
3907d70212
[CELEBORN-421] Add shutdown and registered to http request (#1346)
* [CELEBORN-421] Add shutdown and registered to http request
2023-03-14 18:23:21 +08:00
Angerszhuuuu
7d7279a9bc
[CELEBORN-420] Add unavailablePeers to http request (#1345)
* [CELEBORN-420] Add unavailablePeers to http request
2023-03-14 17:23:45 +08:00
Angerszhuuuu
3600ccc4e3
[CELEBORN-409] Add PartitionLocationInfo to worker's http request (#1335) 2023-03-13 17:02:28 +08:00
Angerszhuuuu
6f1ab70403
[CELEBORN-406] Add blacklist to http request to indicate blacklisted worker (#1334) 2023-03-13 16:44:46 +08:00
Angerszhuuuu
144a8cdb3f
[CELEBORN-408] Add lost worker infos to http request (#1333) 2023-03-13 15:27:41 +08:00
Angerszhuuuu
b56624d3c1
[CELEBORN-405] Add metrics about lost workers (#1330)
* [CELEBORN-405] Add metrics about lost workers
2023-03-13 14:49:49 +08:00
jiaoqingbo
92ad56c47d
[CELEBORN-393] responseBuilder.setCmdType should be called only once in MetaHandler's handleReadRequest method (#1322) 2023-03-10 21:17:22 +08:00
Keyong Zhou
9608a11819
[CELEBORN-348][FOLLOWUP] Refine comparator to support nanoseconds whi… (#1305) 2023-03-03 17:55:59 +08:00
Keyong Zhou
dcedf7b0a9
[CELEBORN-348] Support fetchTime in load-aware slots assignment strategy (#1287) 2023-03-02 18:31:50 +08:00
Angerszhuuuu
4c90e0b02a
[CELEBORN-359] Add ratis-shell to celeborn (#1292) 2023-03-01 17:04:57 +08:00
Ethan Feng
7e9ba19d58
[CELEBORN-302] Fix workers count out of sync in HA mode. (#1239) 2023-02-20 21:46:33 +08:00
Ethan Feng
d391e7d91d
[CELEBORN-300] fix a bug about non-leader master try to update partition size. (#1235) 2023-02-14 15:44:20 +08:00
Angerszhuuuu
ae32c702b6
[CELEBORN-247][FOLLOWUP] Fix NPE issue (#1207) 2023-02-06 17:28:05 +08:00
Angerszhuuuu
46240d59de
[CELEBORN-247][FOLLOWUP] Add metrics for each user's quota usage (#1206) 2023-02-06 14:02:57 +08:00
Angerszhuuuu
04427f2b16
[CELEBORN-247] Add metrics for each user's quota usage (#1182) 2023-02-02 18:31:08 +08:00
Angerszhuuuu
ced08a1d89
[CELEBORN-266] Fix wrong old version configurations (#1198) 2023-02-02 14:03:45 +08:00
Angerszhuuuu
0d5809ff0c
[CELEBORN-192][IMPROVEMENT] Change FAILED status to REQUEST_FAILED since it's all used when RPC request failed. (#1139) 2023-01-06 16:53:04 +08:00
Fu Chen
ab449ffdd7
[CELEBORN-198] Fix the wrong configuration path of plugin protobuf-maven-plugin and … (#1146) 2023-01-05 20:09:31 +08:00
Angerszhuuuu
b13ddac9d2
[CELEBORN-172][Refactor] Load/Make snapshot use Protobuf serde (#1118) 2022-12-29 11:51:14 +08:00
Angerszhuuuu
5603e62e95
[CELEBORN-174][REFACTOR] Move AppDiskUsage related to meta package (#1117) 2022-12-27 15:24:42 +08:00
nafiy
ddab27a1d7
[CELEBORN-145][REFACTOR] Add reason in CheckQuotaResponse (#1093)
* [CELEBORN-145][REFACTOR] Add reason in CheckQuotaResponse
2022-12-15 18:16:34 +08:00
William Song
ce86e11d50
[CELEBORN-133] Improve snapshot loading (#1078)
Co-authored-by: Cheng Pan <pan3793@gmail.com>
2022-12-14 14:26:31 +08:00
Binjie Yang
d6ee3c18bc
[CELEBORN-98][IMPROVEMENT] Remove unreachable code block in master/work arguments (#1042) 2022-12-02 22:53:28 +08:00
Angerszhuuuu
017b3d2b41
[CELEBORN-94][BUG] StateMachine should implement pause to change status (#1033) 2022-12-01 12:16:44 +08:00
Angerszhuuuu
5b9102d792
[CELEBORN-93][BUG] Rss Raft reject install snapshot (#1032) 2022-11-30 19:43:47 +08:00
William Song
735ba4ce0c
[CELEBORN-44][BUG] StateMachine not update currentSnapshot after takeSnapshot cause getLatestSnapshot return null (#996) 2022-11-23 16:00:14 +08:00
Ethan Feng
ee243f286d
[CELEBORN-4] Add metrics about top disk used apps. (#985) 2022-11-22 20:06:36 +08:00
Angerszhuuuu
827ba9e0f7
[ISSUE-939][REFACTOR] Bump up ratis to 2.4.0 (#940) 2022-11-08 15:12:00 +08:00
Cheng Pan
b1c1961e60
Fix MasterNode rpc endpoint info (#916) 2022-11-03 21:02:31 +08:00
Angerszhuuuu
ea4ed10e5c
[ISSUE-901][BUG] During worker graceful shutdown, worker should report itself as unavailable and avoid master allocate slots on it. (#905) 2022-11-02 16:09:58 +08:00
Angerszhuuuu
87fcfa767f
[ISSUE-887][REFACTOR] Configuration type convert to Enum (#888)
* [ISSUE-332][FOLLOWUP] Add deps in worker's pom

* [Refactor] Modify package name of utils to keep consistence

* [Refactor] Modify package name of utils to keep consistence

* [REFACTOR] Remove unused isRegistered in controller

* [ISSUE-887][REFACTOR] Configuration type convert to Enum

* update

* update

* Update RssShuffleManager.java
2022-10-29 13:41:06 +08:00
Cheng Pan
d7be6006e7
Migrate network related conf to structured conf system (#875)
* Migrate network related conf to structured conf system

* migrate

* fix

* fix

* worker

* fix

* nit

* review

* nit
2022-10-28 10:45:52 +08:00
Angerszhuuuu
d283cca4e1
[ISSUE-869][REFACTOR] Migrate partition size/sorter related conf to Celeborn ConfigEntity (#870) 2022-10-27 16:49:55 +08:00
Angerszhuuuu
26dcc118c6
[ISSUE-871][REFACTOR] Migrate Worker conf to Celeborn Configuration System (#873)
* [ISSUE-871][REFACTOR] Migrate Worker conf to Celeborn Configuration System
2022-10-27 15:35:29 +08:00
Angerszhuuuu
399236c880
[ISSUE-849][REFACTOR] Migrate master and common Celeborn Configuration System (#850) 2022-10-26 17:09:27 +08:00
AngersZhuuuu
a773c8e6db
[ISSUE-820][Refactor] Rename RssConf to CelebornConf (#826) 2022-10-20 20:13:13 +08:00
AngersZhuuuu
8344479df1
[ISSUE-818][REFACTOR] Move existing RssConf.xxx conf method to RssConf class (#822)
* [ISSUE-818][REFACTOR] Move existing RssConf.xxx conf method to RssConf class


Co-authored-by: Ethan Feng <ethan.aquarius.fmx@gmail.com>
2022-10-20 18:10:59 +08:00
Ethan Feng
5c761a8df3
[ISSUE-813][Refactor] Refactor flusher configurations. (#813)
* Refactor flusher configurations.

* Refactor flusher configurations.

* Update.

* remove brackets.

* update docs.

* rename.

* update.

* update docs.

* update.

* update.

* update.

* update.

* update.

* update.

* update.

* format.

* update.

* update.
2022-10-20 15:23:17 +08:00
nafiy
1e5bed2da7
[ISSUE-806][REFACTOR] Remove ResourceConsumption out of ControlMessage (#810)
* [ISSUE-806][REFACTOR] Remove ResourceConsumption out of ControlMessage

* add line before method

* reformat
2022-10-19 17:14:51 +08:00
nafiy
a75bce905e
[ISSUE-805][REFACTOR] Remove UserIdentifier out of ControlMessage (#808) 2022-10-19 15:32:53 +08:00
Cheng Pan
efad4abb5d
Migrate a bunch of configurations (#786) 2022-10-18 10:44:01 +08:00
nafiy
0dcf946c9b
[ISSUE-751][REFACTOR] Move userResourceConsumption to WorkerInfo's parameter and format WorkerInfo's toString() (#767) 2022-10-17 17:58:39 +08:00
Cheng Pan
ea67f4e060
Introduce categories to ConfigEntry and migrate configurations (#775) 2022-10-17 16:56:54 +08:00
Cheng Pan
96e969f46e
[BUILD] Extract project.version to Maven Property (#772) 2022-10-16 19:01:40 +08:00
Cheng Pan
5829bda21a
Rework and migrate HA configuration system (#763) 2022-10-13 22:35:01 +08:00
Cheng Pan
f01a696313
Migrate and refactor configuration for master endpoints (#752) 2022-10-11 21:33:21 +08:00
nafiy
3ed38f1e72
[ISSUE-642][FEATURE] worker storage manger store user to shuffke key relation and recover from level db (#706) 2022-10-10 18:18:34 +08:00
Cheng Pan
189b7a4fa8
Celeborn should respect CELEBORN_* env vars (#749) 2022-10-10 12:39:48 +08:00
AngersZhuuuu
13aeb4b644
[ISSUE-736][BUG] Heartbeat worker should update disk info into WorkInfo too to keep consistence with master (#737) 2022-10-09 15:41:01 +08:00
AngersZhuuuu
f2a234f870
[ISSUE-739][REFACTOR] Use object wrap pb message method (#740) 2022-10-09 11:53:48 +08:00
AngersZhuuuu
ae4bb12d5e
[ISSUE-630][REFACTOR] Minor change of storage resource quota, include code style, comment unused code etc.. (#728) 2022-10-08 20:15:25 +08:00
Cheng Pan
ab16b4f101
[INFRA] Rename modules w/ celeborn prefix (#723) 2022-10-08 08:05:57 +08:00