Angerszhuuuu
|
786fcd6744
|
[CELEBORN-336] Revive Failed should use keep the corresponding StatusCode (#1283)
|
2023-03-01 18:57:51 +08:00 |
|
Shuang
|
bc7da3154f
|
[CELEBORN-354][Flink] fix succeedPartitionIds may contain new added partitionIds (#1289)
|
2023-03-01 15:45:24 +08:00 |
|
Angerszhuuuu
|
eda21ead24
|
[CELEBORN-344] Change PUSH_DATA_FAIL_MASTER/SALVE to PUSH_DATA_WRITE_FAIL_MASTER/SALVE (#1281)
|
2023-02-28 11:29:40 +08:00 |
|
Keyong Zhou
|
7adf1fca41
|
[CELEBORN-295] Optimize data push (#1232)
* [CELEBORN-295] Add double buffer for sort pusher
|
2023-02-28 10:35:55 +08:00 |
|
Angerszhuuuu
|
24f5478adc
|
[CELEBORN-338] Clean duplicated exception message of handling push data (#1274)
|
2023-02-28 10:35:18 +08:00 |
|
Shuang
|
935806f036
|
[CELEBORN-341][Flink] cache file group for map partition in Flink plugin (#1277)
|
2023-02-26 20:31:20 +08:00 |
|
Angerszhuuuu
|
a7587c3fe7
|
[CELEBORN-337] Remove unnecessary StatusCode.message (#1272)
|
2023-02-24 15:11:07 +08:00 |
|
Angerszhuuuu
|
81f7ffd767
|
[CELEBORN-332] Unify the log of ShuffleClientImpl (#1267)
|
2023-02-24 14:07:25 +08:00 |
|
Angerszhuuuu
|
3067efcfd3
|
[CELEBORN-331] submitRetryPushData should throw PUSH_DATA_CREATE_CONNECTION_FAIL_MASTER too (#1266)
|
2023-02-23 14:57:11 +08:00 |
|
Angerszhuuuu
|
f7948190cf
|
[CELEBORN-316][FOLLOWUP] Should not wrap CelebornIOException with CelebornIOException (#1264)
|
2023-02-23 11:48:46 +08:00 |
|
Angerszhuuuu
|
1132cc25ab
|
[CELEBORN-328][IMPROVEMENT] Too much noisy log when reserve slot failed (#1262)
|
2023-02-22 17:19:52 +08:00 |
|
Angerszhuuuu
|
322f0d2b41
|
[CELEBORN-316] Wrap Celeborn exception with CelebornIOException (#1253)
|
2023-02-22 16:10:11 +08:00 |
|
Shuang
|
3da615972e
|
[CELEBORN-326][Flink] lifecycleManager supports flink-yarn-session mode to handle multiple Flink jobs. (#1260)
|
2023-02-22 15:37:24 +08:00 |
|
Angerszhuuuu
|
251b923b5b
|
[CELEBORN-321] When register shuffle failed, DataPushQueue should directly take the task queue to avoid NPE (#1258)
|
2023-02-21 17:02:37 +08:00 |
|
Shuang
|
61065230bd
|
[CELEBORN-311] not retry when register for map partition occurs exception (#1246)
|
2023-02-21 10:16:10 +08:00 |
|
Ethan Feng
|
bfb39632d9
|
[CELEBORN-235] Implement flink plugin. (#1244)
|
2023-02-17 19:31:12 +08:00 |
|
zhongqiangchen
|
b5dc106af8
|
[CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition (#1224)
|
2023-02-17 19:07:19 +08:00 |
|
Shuang
|
b7ef9cf216
|
[CELEBORN-297] don't cache file groups for map partition shuffle type (#1237)
|
2023-02-17 11:28:47 +08:00 |
|
Angerszhuuuu
|
57f775a7e9
|
[CELEBORN-273] Move push data timeout checker into TransportResponseHandler to keep callback status consistence (#1208)
|
2023-02-16 18:27:37 +08:00 |
|
jiaoqingbo
|
318157e3e9
|
[CELEBORN-305] Change the parameter passed in the registerShuffle method to numPartitions instead of numMappers (#1240)
|
2023-02-15 17:35:43 +08:00 |
|
jiaoqingbo
|
bd9e0ddc1f
|
[CELEBORN-304] Missing setIfMissing celeborn.$module.io.serverThreads (#1238)
|
2023-02-15 15:49:08 +08:00 |
|
Shuang
|
75c83093f2
|
[CELEBORN-296] fix map partition commit using wrong partitionId and result (#1233)
|
2023-02-14 20:54:06 +08:00 |
|
Rex(Hui) An
|
bff6e91e0b
|
[CELEBORN-227] Support different push strategies to control the push speed (#1167)
|
2023-02-07 14:24:30 +08:00 |
|
Angerszhuuuu
|
ff683ffc91
|
[CELEBORN-238][IMPROVEMENT] Revive caused by PUSH_DATA_TIMEOUT_MASTER and PUSH_DATA_TIMEOUT_SLAVE should add corresponding worker into blacklist (#1180)
|
2023-02-03 17:47:24 +08:00 |
|
Angerszhuuuu
|
4b6f7e4593
|
[CELEBORN-239][IMPROVEMENT] Worker replicate should enable push data timeout too (#1185)
|
2023-02-03 11:53:15 +08:00 |
|
Rex(Hui) An
|
021004714b
|
[CELEBORN-264] InFlight requests should not be expired if it's not pushed yet (#1196)
|
2023-02-01 22:16:55 +08:00 |
|
Shuang
|
7162be2fae
|
[CELEBORN-201] Separate partitionLocationInfo in LifecycleManager and worker (#1149)
|
2023-01-31 18:53:36 +08:00 |
|
Angerszhuuuu
|
1311fb53d1
|
[CELEBORN-243][CELEBORN-245][IMPROVEMENT] Create push client failed and connection failed cause push failed should have their own ERROR type (#1181)
* [CELEBORN-243][IMPROVEMENT] Create push client failed should have a ERROR type
|
2023-01-30 17:47:22 +08:00 |
|
Angerszhuuuu
|
8611a64400
|
[CELEBORN-237][IMPROVEMENT] push failed error message should show partition info (#1178)
|
2023-01-28 18:41:54 +08:00 |
|
Keyong Zhou
|
e47f1e33b0
|
[CELEBORN-55][FOLLOWUP] Code refine (#1175)
|
2023-01-20 16:22:47 +08:00 |
|
zy.jordan
|
c5be79ee3d
|
[CELEBORN-55][FEATURE] Split maxReqsInFlight limitation into level of target worker (#1102)
|
2023-01-20 10:18:45 +08:00 |
|
zhongqiangczq
|
1836fe187b
|
[CELEBORN-197] in mappartition, check transportClient whether changed while sending messages (#1145)
|
2023-01-13 16:45:26 +08:00 |
|
Shuang
|
810a8d01e0
|
[CELEBORN-212] refresh client if current client is inactive. (#1159)
|
2023-01-11 11:54:50 +08:00 |
|
Shuang
|
1332362bff
|
[CELEBORN-213] Add configuration for whether to close idle connections in client side (#1157)
|
2023-01-10 19:13:33 +08:00 |
|
Angerszhuuuu
|
e155ec122a
|
[CELEBORN-190] doPushMergedData should also support revive multiple times, not only twice (#1136)
|
2023-01-10 11:39:40 +08:00 |
|
Shuang
|
2ec06472fe
|
[CELEBORN-203] fix NPE when removeExpiredShuffle in LifecycleManager. (#1151)
|
2023-01-06 18:32:17 +08:00 |
|
Angerszhuuuu
|
0d5809ff0c
|
[CELEBORN-192][IMPROVEMENT] Change FAILED status to REQUEST_FAILED since it's all used when RPC request failed. (#1139)
|
2023-01-06 16:53:04 +08:00 |
|
Shuang
|
3b2be25a50
|
[CELEBORN-173] refactor minicluster and fix ut (#1147)
|
2023-01-05 20:39:19 +08:00 |
|
Angerszhuuuu
|
415452d9c4
|
[CELEBORN-189][IMPROVEMENT] PushDataFailedSlave should add slave worker to blacklist (#1135)
|
2023-01-05 20:12:07 +08:00 |
|
Angerszhuuuu
|
fe8dfb05f3
|
[CELEBORN-196][REFACTOR] Rename batchHandleRequestPartitions to handleRequestPartitions (#1144)
|
2023-01-05 14:37:10 +08:00 |
|
Angerszhuuuu
|
2315f2f988
|
[CELEBORN-191][BUG] ShuffleClient registerShuffle return RESERVE_SLOTS_FAILED should also been print out (#1138)
|
2023-01-03 17:13:31 +08:00 |
|
Shuang
|
5cba307189
|
[CELEBORN-146] refactor ShuffleMapperAttempts & GetReducerFileGroup (#1116)
|
2022-12-30 18:15:23 +08:00 |
|
Cheng Pan
|
b8758a7cb6
|
[CELEBORN-181][TEST] Rename RssFunSuite to CelebornFunSuite (#1125)
|
2022-12-29 18:10:14 +08:00 |
|
RexAn
|
6432a129be
|
[CELEBORN-61][CELEBORN-62][FOLLOW_UP] Fix some issues for slow start (#1119)
|
2022-12-29 12:07:20 +08:00 |
|
Binjie Yang
|
63943cd5cc
|
[CELEBORN-147][IT]Extraction of common integration test cases (#1092)
|
2022-12-29 12:03:09 +08:00 |
|
Keyong Zhou
|
2f0682265e
|
[CELEBORN-119] Add timeout for pushdata (#1097)
|
2022-12-20 20:40:42 +08:00 |
|
Keyong Zhou
|
a2dd72f20c
|
[CELEBORN-155] Wrong TimeUnit for registerShuffleRetryWait in Shuffle… (#1099)
|
2022-12-19 17:32:18 +08:00 |
|
Shuang
|
13769f0f0a
|
[CELEBORN-121] Refactor batchHandleCommitPartition (#1089)
|
2022-12-19 12:39:39 +08:00 |
|
Ethan Feng
|
39394526a8
|
[CELEBORN-142]Keep committed partition locations semantic consistent when commit files on HDFS. (#1091)
|
2022-12-16 19:02:02 +08:00 |
|
nafiy
|
ddab27a1d7
|
[CELEBORN-145][REFACTOR] Add reason in CheckQuotaResponse (#1093)
|
2022-12-15 18:16:34 +08:00 |
|