jiaoqingbo
|
84795bc63b
|
[CELEBORN-382] Call checkDiskFullAndSplit in the handlePushData method to avoid repeated definitions (#1313)
|
2023-03-07 18:55:46 +08:00 |
|
Ethan Feng
|
675a7da393
|
[CELEBORN-368][FLINK] Pass exceptions in buffer stream. (#1304)
|
2023-03-03 15:43:30 +08:00 |
|
Keyong Zhou
|
dcedf7b0a9
|
[CELEBORN-348] Support fetchTime in load-aware slots assignment strategy (#1287)
|
2023-03-02 18:31:50 +08:00 |
|
Angerszhuuuu
|
eda21ead24
|
[CELEBORN-344] Change PUSH_DATA_FAIL_MASTER/SLAVE to PUSH_DATA_WRITE_FAIL_MASTER/SLAVE (#1281)
|
2023-02-28 11:29:40 +08:00 |
|
Keyong Zhou
|
7adf1fca41
|
[CELEBORN-295] Optimize data push (#1232)
* [CELEBORN-295] Add double buffer for sort pusher
|
2023-02-28 10:35:55 +08:00 |
|
Angerszhuuuu
|
24f5478adc
|
[CELEBORN-338] Clean duplicated exception message of handling push data (#1274)
|
2023-02-28 10:35:18 +08:00 |
|
Rex(Hui) An
|
798ff90bb7
|
[CELEBORN-342] Fix the wrong avg produce bytes in Congestion control (#1279)
|
2023-02-27 16:29:37 +08:00 |
|
Keyong Zhou
|
3c8c58e09d
|
[CELEBORN-301] Refactor PartitionLocationInfo to use ConcurrentHashMap (#1278)
|
2023-02-26 16:46:30 +08:00 |
|
Angerszhuuuu
|
a7587c3fe7
|
[CELEBORN-337] Remove unnecessary StatusCode.message (#1272)
|
2023-02-24 15:11:07 +08:00 |
|
Shuang
|
9754616d79
|
[CELEBORN-330] fix deadlock when use the same netty channel to receive data while other thread wait the response (#1265)
|
2023-02-23 17:57:43 +08:00 |
|
Angerszhuuuu
|
fc8540a2e6
|
[CELEBORN-325] After worker restart, throw NPE when receive not found partition (#1259)
|
2023-02-22 15:19:34 +08:00 |
|
Ethan Feng
|
0df08fbdf3
|
[CELEBORN-320][FLINK] fix handle wrong message type in FetchHandler. (#1254)
|
2023-02-21 11:51:01 +08:00 |
|
Ethan Feng
|
26a3bb5e72
|
[CELEBORN-308] Fix flusher will exit unexpectedly if flush task write failed. (#1249)
|
2023-02-20 21:45:37 +08:00 |
|
Ethan Feng
|
0c8bb83114
|
[CELEBORN-234] Implement buffer stream. (#1221)
|
2023-02-17 17:38:36 +08:00 |
|
zhongqiangchen
|
5236df68af
|
[CELEBORN-292] optimize mappartitionfilewriter flushing index and reading data header (#1225)
|
2023-02-17 13:42:28 +08:00 |
|
zhongqiangchen
|
79096d60d0
|
[CELEBORN-293] WorkerSource registers timer for mappartition message metrics (#1226)
|
2023-02-17 11:29:54 +08:00 |
|
Ethan Feng
|
1dcfdb0c8f
|
[CELEBORN-281] Add metrics about buffer stream read buffer. (#1216)
|
2023-02-17 11:20:07 +08:00 |
|
Angerszhuuuu
|
57f775a7e9
|
[CELEBORN-273] Move push data timeout checker into TransportResponseHandler to keep callback status consistence (#1208)
|
2023-02-16 18:27:37 +08:00 |
|
Ethan Feng
|
534853bf8a
|
[CELEBORN-278] Add openStreamWithCredit RPC. (#1214)
|
2023-02-16 14:07:13 +08:00 |
|
zhongqiangchen
|
2c508dae0f
|
[CELEBORN-307] fix ArrayComparisonFailure while running lz4 ut (#1241)
|
2023-02-16 13:41:17 +08:00 |
|
Rex(Hui) An
|
2068e6ae37
|
[CELEBORN-279] Add user level push data speed metric (#1213)
|
2023-02-13 12:04:44 +08:00 |
|
Rex(Hui) An
|
adb6592d31
|
[CELEBORN-277] PushDataHandle callback could miss soft split status (#1212)
|
2023-02-09 14:57:18 +08:00 |
|
Rex(Hui) An
|
f88f5fcf55
|
[CELEBORN-207][FOLLOW_UP] Master could miss the congestion status if enable push.data.replicate
|
2023-02-07 22:57:39 +08:00 |
|
Rex(Hui) An
|
cfe81969c9
|
[CELEBORN-275] WrappedCallback should only handle response from replica (#1209)
|
2023-02-07 18:18:13 +08:00 |
|
Rex(Hui) An
|
bb113ec9be
|
[CELEBORN-207] Support network congestion control (#1066)
|
2023-02-07 12:06:18 +08:00 |
|
Angerszhuuuu
|
c4020100db
|
[CELEBORN-271][BUG] PushState in PushDataHandler should use peer's location
|
2023-02-06 11:31:57 +08:00 |
|
Angerszhuuuu
|
ecc3a0e52f
|
[CELEBORN-272][BUG] Don't do replication should directly use callback not wrappedCallback (#1205)
|
2023-02-06 11:28:12 +08:00 |
|
zhongqiangchen
|
8e903840af
|
[CELEBORN-243][REWORK]fix bug that os's disk usage is low but celeborn thinks that it's high_disk_usage (#1202)
|
2023-02-04 14:27:44 +08:00 |
|
Angerszhuuuu
|
2e68912812
|
[CELEBORN-269][BUG] Disable replication throw NPE when removeBatch in pushDataHandler (#1203)
|
2023-02-03 20:06:59 +08:00 |
|
Shuang
|
2634476758
|
[CELEBORN-267] reuse stream when client channel reconnected (#1200)
|
2023-02-03 15:12:45 +08:00 |
|
Angerszhuuuu
|
4b6f7e4593
|
[CELEBORN-239][IMPROVEMENT] Worker replicate should enable push data timeout too (#1185)
|
2023-02-03 11:53:15 +08:00 |
|
zhongqiangczq
|
ff17a61ec5
|
[CELEBORN-243] fix bug that os's disk usage is low but celeborn thinks that it's high_disk_usage (#1184)
|
2023-02-02 10:41:11 +08:00 |
|
Shuang
|
7162be2fae
|
[CELEBORN-201] Separate partitionLocationInfo in LifecycleManager and worker (#1149)
|
2023-01-31 18:53:36 +08:00 |
|
Angerszhuuuu
|
1311fb53d1
|
[CELEBORN-243][CELEBORN-245][IMPROVEMENT] Create push client failed and connection failed cause push failed should have their own ERROR type (#1181)
* [CELEBORN-243][IMPROVEMENT] Create push client failed should have a ERROR type
|
2023-01-30 17:47:22 +08:00 |
|
Angerszhuuuu
|
8611a64400
|
[CELEBORN-237][IMPROVEMENT] push failed error message should show partition info (#1178)
|
2023-01-28 18:41:54 +08:00 |
|
Ethan Feng
|
a239f9f284
|
[CELEBORN-228]Refactor PartitionFileSorter to avoid specific JDK dependency. (#1168)
|
2023-01-16 20:06:47 +08:00 |
|
zy.jordan
|
bb96700415
|
[CELEBORN-223] The default rpc thread num of pushServer/replicateServer/fetchServer should be the number of total of Flusher's thread (#1163)
|
2023-01-16 12:03:46 +08:00 |
|
zhongqiangczq
|
3661222d98
|
[CELEBORN-195] add implementation to MapPartitionFileWriter (#1141)
|
2023-01-13 16:41:11 +08:00 |
|
zy.jordan
|
19197b9190
|
[CELEBORN-214] Push/Replicate/Fetch io threads default value is 16 (#1158)
|
2023-01-10 17:46:56 +08:00 |
|
nafiy
|
9635725480
|
[CELEBORN-204][IMPROVEMENT]Collect disk usage metrics in byte unit by default (#1153)
|
2023-01-09 17:36:18 +08:00 |
|
Ethan Feng
|
5595f2f4b3
|
[CELEBORN-124]Add buffer stream. (#1069)
|
2023-01-06 15:54:52 +08:00 |
|
Shuang
|
3b2be25a50
|
[CELEBORN-173] refactor minicluster and fix ut (#1147)
|
2023-01-05 20:39:19 +08:00 |
|
Angerszhuuuu
|
5edb21d210
|
[CELEBORN-168][FOLLOWUP] Device metrics should use long value and add size unit in metric name (#1143)
|
2023-01-05 11:45:19 +08:00 |
|
nafiy
|
3e80cf2b87
|
[CELEBORN-168][FEATURE] Add disk usage related metrics for Worker (#1127)
|
2023-01-05 10:35:51 +08:00 |
|
Angerszhuuuu
|
425e31797c
|
[CELEBORN-182][BUG] StorageManager should not delete shuffle file when enable graceful shutdown (#1126)
|
2022-12-30 18:13:36 +08:00 |
|
Angerszhuuuu
|
7d7192af14
|
[CELEBORN-179][BUG] Repeat remove expired shuffle throw NPE (#1124)
|
2022-12-29 15:47:05 +08:00 |
|
Angerszhuuuu
|
6411fe71b1
|
[CELEBORN-178][BUG] Default registered flag should be false, not null (#1123)
|
2022-12-29 15:24:09 +08:00 |
|
nafiy
|
77cb7a0477
|
[CELEBORN-169][REFACTOR] Extract ObservedDevice out from LocalDeviceMonitor (#1113)
|
2022-12-28 14:29:00 +08:00 |
|
Ethan Feng
|
5aa959a335
|
[CELEBORN-157] Change prefix of configurations to celeborn. (#1104)
|
2022-12-21 15:17:28 +08:00 |
|
nafiy
|
f13dfb7421
|
[CELEBORN-113][FEATURE] Add metrics to monitor non-critical error number on local device (#1100)
|
2022-12-20 22:30:55 +08:00 |
|