Angerszhuuuu
791d72d45f
[CELEBORN-590] Remove hadoop prefix of WORKER_WORKING_DIR ( #1494 )
2023-05-17 17:57:27 +08:00
Angerszhuuuu
7c6cb2f3bb
[CELEBORN-588] Remove test conf's category ( #1491 )
2023-05-17 17:37:28 +08:00
Angerszhuuuu
64a3534f71
[CELEBORN-584] Worker side should expose push/replicate/fetch Netty allocator metrics ( #1489 )
2023-05-16 17:51:33 +08:00
Shuang
f83304c337
[CELEBORN-581][Flink] Support JobManager failover. ( #1485 )
2023-05-16 14:51:53 +08:00
Angerszhuuuu
d657f8268a
[CELEBORN-586] Add SystemMiscSource to indicate system running status ( #1488 )
2023-05-16 14:03:07 +08:00
zhongqiangchen
5769c3fdc7
[CELEBORN-552] Add HeartBeat between the client and worker to keep alive ( #1457 )
2023-05-10 19:35:51 +08:00
Shuang
fb753fd48e
[CELEBORN-573] Guarantee resource/app/worker change persistent to raft in Ha Mode. ( #1477 )
2023-05-10 14:28:52 +08:00
Angerszhuuuu
778b5440bc
[CELEBORN-556][BUG] ReserveSlot should not use default RPC time out since register shuffle max timeout is network timeout ( #1461 )
2023-05-10 12:29:06 +08:00
Shuang
2fea818fa8
[CELEBORN-579] revert Destroy Message rename for compatibility. ( #1482 )
2023-05-09 15:24:02 +08:00
Ethan Feng
3e0d779962
[CELEBORN-576] Add static identity provider and manually settable identity provider for non-hadoop environment. ( #1480 )
2023-05-08 17:29:01 +08:00
Angerszhuuuu
a315a2eb41
[CELEBORN-575] PartitionLocationInfo change cause quick upgrade impacted ( #1479 )
2023-05-08 16:56:42 +08:00
Angerszhuuuu
ef4c12e0fe
[CELEBORN-565] FETCH_MAX_RETRIES should double when enable replicates ( #1471 )
2023-04-28 14:27:35 +08:00
Angerszhuuuu
bfce6052d7
[CELEBORN-560][FOLLOWUP] Follow the original design for handling rerun & speculative task after handleStageEnd ( #1468 )
2023-04-28 11:18:42 +08:00
Angerszhuuuu
7a4f2ebd8a
[CELEBORN-547] Refactor request related API ( #1452 )
2023-04-27 16:25:41 +08:00
Angerszhuuuu
be84e8ba0d
[CELEBORN-562][REFACTOR] Rename Destroy and DestroyResponse to make it more clear ( #1467 )
2023-04-27 12:31:32 +08:00
Shuang
64a4f7274c
[CELEBORN-554][Tuning] Improve For LM to avoid reserve/commit empty worker resources ( #1459 )
2023-04-26 18:04:50 +08:00
Angerszhuuuu
13ce04f8a1
[CELEBORN-557] HA_CLIENT_RPC_ASK_TIMEOUT should fallback to RPC_ASK_TIMEOUT ( #1462 )
2023-04-26 15:19:34 +08:00
Shuang
0b2e4877bd
[CELEBORN-553] Improve IO ( #1458 )
2023-04-25 21:14:06 +08:00
Shuang
d68deecaaa
[CELEBORN-546][FLINK] Use autoIncrement partitionId replace encode(mapId, attemptId) for generating partitionId ( #1447 )
2023-04-22 16:33:22 +08:00
Angerszhuuuu
181c1bfcd6
[CELEBORN-524][PERF] CongestionControl call too much ChannelsLimiter onTrim cause CPU stuck or occupy too much CPU cause no cpu for handlePushData ( #1428 )
2023-04-21 15:44:56 +08:00
Angerszhuuuu
6830cb61ef
[CELEBORN-540][Refactor] Add config entity of celeborn.rpc.io.threads ( #1443 )
2023-04-21 11:21:41 +08:00
Shuang
62d60de8c5
[CELEBORN-537] Improve blacklist compute & minor fix for Flink ( #1441 )
2023-04-20 18:30:10 +08:00
Ethan Feng
6378a386d0
[CELEBORN-530][REFACTOR] Move stream manager and memory manager to worker module. ( #1439 )
2023-04-20 10:17:26 +08:00
Ethan Feng
8be82548e1
[CELEBORN-520][FLINK] Tune map partition reading performance. ( #1424 )
2023-04-17 16:47:09 +08:00
Shuang
412d10b7dc
[CELEBORN-479][FLINK] support stopTrackingAndReleasePartitions when worker is not available ( #1405 )
2023-04-17 14:44:24 +08:00
Angerszhuuuu
938aec0e9f
[CELEBORN-528][REFACTOR] limitZeroInFlight should show inflight target ( #1433 )
2023-04-17 11:53:34 +08:00
Angerszhuuuu
932ccd0841
[CELEBORN-523][REFACTOR] Remove unnecessary code in WorkerPartitionLocationInfo ( #1427 )
2023-04-15 22:36:48 +08:00
Shuang
a22c6ca749
[CELEBORN-521] correct exception and unify unRetryableException ( #1425 )
2023-04-15 22:27:28 +08:00
Angerszhuuuu
3a21362265
[CELEBORN-511][IMPROVE] Move onTrim tag to StorageManager to avoid frequent trim action ( #1415 )
2023-04-14 10:35:51 +08:00
Angerszhuuuu
480d7ac0d9
[CELEBORN-519][PERF] getMaster/SlaveLocation directly use uniqueId as key ( #1421 )
2023-04-13 21:53:33 +08:00
Ethan Feng
9cccfc9872
[CELEBORN-431][FLINK] Support dynamic buffer allocation in reading map partition. ( #1407 )
2023-04-13 10:37:47 +08:00
Angerszhuuuu
32b497973e
[CELEBORN-517][IMPROVEMENT] Optimize stopTimer/startTimer cpu cost ( #1419 )
2023-04-12 20:12:01 +08:00
Angerszhuuuu
da98ed9bea
[CELEBORN-516][PERF] Remove RPCSource since it cost too much CPU ( #1420 )
2023-04-12 18:47:06 +08:00
Angerszhuuuu
e5722126e9
[CELEBORN-502][REFACTOR] Merge GetBlacklistResponse to HeartbeatFromApplication ( #1408 )
2023-04-12 14:59:32 +08:00
Keyong Zhou
7dd2230a04
[CELEBORN-510][FLINK] DataPartitionReader.addBuffer should not call s… ( #1413 )
2023-04-07 18:17:55 +08:00
Shuang
9b2b8a01ec
[CELEBORN-507] don't set up worker endpoint when update meta and remove compare worker meta with workers ( #1412 )
2023-04-07 11:46:24 +08:00
Angerszhuuuu
cad2836e85
[CELEBORN-505] Fix typo of SHUFFLE_CHUCK_SIZE ( #1411 )
2023-04-04 19:15:30 +08:00
Keyong Zhou
2e1598c011
[CELEBORN-485] Make celeborn.push.replicate.enabled default to false ( #1394 )
2023-04-03 16:36:29 +08:00
Angerszhuuuu
bf46336d54
[CELEBORN-487][PERF] ShuffleClientSide support blacklist to avoid client side timeout in same worker multiple times ( #1399 )
2023-04-03 11:50:04 +08:00
Angerszhuuuu
b4f8ab19bd
[CELEBORN-484][PERF] Master trigger LifecycleManager commit shutdown worker's partition location. ( #1395 )
2023-04-02 09:18:12 +08:00
Keyong Zhou
61416a828d
[CELEBORN-497] Fix and enable JDK 11 for CI ( #1401 )
2023-03-31 13:39:02 +08:00
Shuang
45013b8bae
[CELEBORN-489][FLINK] fix retry client for open stream ( #1397 )
2023-03-30 11:44:19 +08:00
zhongqiangchen
cd92c423cd
[CELEBORN-475] Support extra tags for prometheus metrics ( #1385 )
2023-03-28 21:22:28 +08:00
Ethan Feng
6cee85748d
[CELEBORN-477][FLINK] Report failed partition to flink framework. ( #1391 )
2023-03-28 15:54:37 +08:00
Keyong Zhou
cb19ed1c66
[CELEBORN-479][PERF] Refactor DataPushQueue.takePushTask to avoid busy wait ( #1386 )
2023-03-27 16:18:55 +08:00
Fei Wang
b40c573069
[CELEBORN-474][FOLLOWUP] Using inner static ConcurrentHashMap class and only apply for JDK8 ( #1384 )
2023-03-27 16:16:23 +08:00
Fei Wang
7c444cb0c5
[CELEBORN-474] Speed up ConcurrentHashMap#computeIfAbsent ( #1383 )
2023-03-26 09:41:59 +08:00
Fei Wang
c609c0ebaa
[MINOR] Fix typo and remove unused code ( #1381 )
* fix typo
* remove unused
2023-03-25 23:20:33 +08:00
Angerszhuuuu
acf6fd3bd2
[CELEBORN-345] TransportResponseHandler create too much thread ( #1373 )
2023-03-24 17:16:26 +08:00
Shuang
89b3f3887d
[CELEBORN-356] [FLINK] Support release single partition resource ( #1314 )
2023-03-24 17:15:28 +08:00
Keyong Zhou
2bfa7e8965
[CELEBORN-466][FLINK] ReadBufferDispatcher.recycle should log error when refCnt != 1 ( #1377 )
2023-03-23 20:33:28 +08:00
Lianne Li
a071bdf6d7
[CELEBORN-449] Repair the hdfs path regex ( #1367 )
* All path protocols start with xxx://, so it is unnecessary to restrict the content after that. In fact, the restriction causes an error when writing shuffle files such as "xxx://abc/shuffle/hadoop/rss-worker/shuffle_data/spark-0de72e2ce2e24f6db69c2228dd12a514/0/0-0-0"
---------
Co-authored-by: ming.li2 <ming.li2@dmall.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Co-authored-by: Ethan Feng <ethanfeng@apache.org>
2023-03-23 12:40:41 +08:00
Keyong Zhou
885e0cef32
[CELEBORN-459] Remove chunkTracker from FileManagedBuffers to avoid conflict with stream reuse ( #1372 )
2023-03-22 11:43:38 +08:00
Keyong Zhou
3d6fba553b
[CELEBORN-454] Code refine for worker ( #1371 )
2023-03-22 10:39:14 +08:00
Angerszhuuuu
f16c7b414e
[CELEBORN-445] Add CelebornRackResolver to support rack resolve ( #1366 )
2023-03-21 16:40:46 +08:00
Angerszhuuuu
56d796638f
[CELEBORN-438] Move ServletPath to MetricsSystem ( #1364 )
2023-03-20 18:22:40 +08:00
Keyong Zhou
9401db2bc8
[CELEBORN-443] Code refine for client and common ( #1362 )
2023-03-20 10:37:43 +08:00
乐活优格
0b78c6d325
[CELEBORN-442] Support hdfs compatible file system ( #1360 )
2023-03-18 11:47:46 +08:00
Ethan Feng
0ebad677d7
[CELEBORN-434] Add constrain about memory manager's parameters. ( #1356 )
2023-03-17 15:14:03 +08:00
Ethan Feng
6f317c77ee
[CELEBORN-422][FLINK] Remove unused fields in ReadData. ( #1347 )
2023-03-14 19:49:00 +08:00
Shuang
1fa00c0317
[CELEBORN-391][FLINK][FOLLOW UP] fix clean stream twice & refine log & add ut ( #1344 )
2023-03-14 15:38:49 +08:00
Shuang
cd5241d399
[CELEBORN-381][FLINK] notify the task with the error message when channel inactive. ( #1341 )
2023-03-14 11:28:03 +08:00
Ethan Feng
971c93d4d9
[CELEBORN-419][FLINK] Fix memory leak when receive RPCs with body. ( #1343 )
2023-03-14 11:27:36 +08:00
Ethan Feng
2385215578
[CELEBORN-394] Refine memory manager's log. ( #1332 )
2023-03-13 15:13:33 +08:00
Angerszhuuuu
b56624d3c1
[CELEBORN-405] Add metrics about lost workers ( #1330 )
2023-03-13 14:49:49 +08:00
Ethan Feng
c78023824a
[CELEBORN-397][FLINK] Flink plugin support UnpooledByteBufAllocator. ( #1324 )
2023-03-13 11:36:13 +08:00
Ethan Feng
bb8401e401
[CELEBORN-403][FLINK] Add metrics about buffer dispatcher request queue length. ( #1329 )
2023-03-13 11:15:00 +08:00
Angerszhuuuu
a336f12cc8
[CELEBORN-400] Add RPC metrics for OpenStream ( #1326 )
2023-03-10 21:22:05 +08:00
Angerszhuuuu
4b334df7a6
[CELEBORN-399] Make fileSorterExecutors thread num can be customized ( #1325 )
2023-03-10 21:10:43 +08:00
Shuang
ec745e36d1
[CELEBORN-391][Flink] Refine register/release synchronization ( #1321 )
2023-03-09 20:00:50 +08:00
Ethan Feng
aebb870d08
[CELEBORN-386][FLINK] Async open DataPartitionReader to release Netty thread earlier. ( #1318 )
2023-03-09 12:31:01 +08:00
Ethan Feng
675a7da393
[CELEBORN-368][FLINK] Pass exceptions in buffer stream. ( #1304 )
2023-03-03 15:43:30 +08:00
Keyong Zhou
9aabb43699
[CELEBORN-372] Remove the standard Apache License header from the top of third-party source files ( #1301 )
2023-03-02 19:07:01 +08:00
Keyong Zhou
dcedf7b0a9
[CELEBORN-348] Support fetchTime in load-aware slots assignment strategy ( #1287 )
2023-03-02 18:31:50 +08:00
Ethan Feng
d4af8fd094
[CELEBORN-353][FLINK] Fix incorrect read buffer metric. ( #1288 )
2023-03-01 11:08:13 +08:00
zhongqiangchen
cb76c4de4c
[CELEBORN-350][FLINK] Add PluginConf to be compatible with old configurations
2023-02-28 20:36:11 +08:00
jiaoqingbo
7dc1ab13db
[CELEBORN-351] Add \n to the log to make log print clearer ( #1285 )
2023-02-28 17:55:17 +08:00
Shuang
5654c62f35
[CELEBORN-347][Flink] fix memory leak and refactor BufferStreamManager ( #1282 )
2023-02-28 15:18:59 +08:00
Angerszhuuuu
eda21ead24
[CELEBORN-344] Change PUSH_DATA_FAIL_MASTER/SLAVE to PUSH_DATA_WRITE_FAIL_MASTER/SLAVE ( #1281 )
2023-02-28 11:29:40 +08:00
Keyong Zhou
7adf1fca41
[CELEBORN-295] Optimize data push ( #1232 )
* [CELEBORN-295] Add double buffer for sort pusher
2023-02-28 10:35:55 +08:00
Angerszhuuuu
24f5478adc
[CELEBORN-338] Clean duplicated exception message of handling push data ( #1274 )
2023-02-28 10:35:18 +08:00
Shuang
935806f036
[CELEBORN-341][Flink] cache file group for map partition in Flink plugin ( #1277 )
2023-02-26 20:31:20 +08:00
Keyong Zhou
3c8c58e09d
[CELEBORN-301] Refactor PartitionLocationInfo to use ConcurrentHashMap ( #1278 )
2023-02-26 16:46:30 +08:00
Ethan Feng
f0b9236ff2
[CELEBORN-340][FLINK] Reuse file channels in map partition read. ( #1276 )
2023-02-24 19:26:51 +08:00
Angerszhuuuu
a7587c3fe7
[CELEBORN-337] Remove unnecessary StatusCode.message ( #1272 )
2023-02-24 15:11:07 +08:00
zhongqiangchen
af9e8366c9
[CELEBORN-329] Add rpc address to exception message when failed to sendrpc ( #1263 )
2023-02-23 19:32:21 +08:00
Shuang
9754616d79
[CELEBORN-330] fix deadlock when use the same netty channel to receive data while other thread wait the response ( #1265 )
2023-02-23 17:57:43 +08:00
Angerszhuuuu
322f0d2b41
[CELEBORN-316] Wrap Celeborn exception with CelebornIOException ( #1253 )
2023-02-22 16:10:11 +08:00
Ethan Feng
1704aff95c
[CELEBORN-327][Flink] BufferStreamManager should recycle buffer in reader thread. ( #1261 )
2023-02-22 16:02:58 +08:00
Ethan Feng
cb8df62ec5
[CELEBORN-324][FLINK] Flink plugin needs reuse connections. ( #1257 )
2023-02-21 18:32:00 +08:00
Shuang
1b1517c7b4
[CELEBORN-323] readBuffers need synchronized as recycle buffer will call readers in multiple threads ( #1256 )
2023-02-21 15:58:19 +08:00
Ethan Feng
5dd5e97225
[CELEBORN-322][Flink] Copy out message if it's readData only. ( #1255 )
2023-02-21 15:51:13 +08:00
Ethan Feng
c649655933
Revert "[CELEBORN-322][Flink] Copy out message if it's readData only."
This reverts commit 0aa37ed7d3.
2023-02-21 14:48:08 +08:00
Ethan Feng
0aa37ed7d3
[CELEBORN-322][Flink] Copy out message if it's readData only.
2023-02-21 14:45:39 +08:00
Ethan Feng
d7798127c9
[CELEBORN-319] FlinkTransportClient should not reuse connection. ( #1252 )
2023-02-21 11:16:30 +08:00
Shuang
cf833e568c
[CELEBORN-318] fix deadlock & bugs in bufferStreamManager ( #1251 )
2023-02-21 11:12:16 +08:00
Shuang
a6103e4bf8
[CELEBORN-317] add REGISTER_MAP_PARTITION_TASK message type ( #1250 )
2023-02-20 22:01:35 +08:00
Ethan Feng
7e9ba19d58
[CELEBORN-302] Fix workers count out of sync in HA mode. ( #1239 )
2023-02-20 21:46:33 +08:00
zhongqiangchen
b5dc106af8
[CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition ( #1224 )
2023-02-17 19:07:19 +08:00
Ethan Feng
0c8bb83114
[CELEBORN-234] Implement buffer stream. ( #1221 )
2023-02-17 17:38:36 +08:00
Ethan Feng
3aacede5f8
[CELEBORN-283] Derive network layer for flink plugin. ( #1222 )
2023-02-17 14:12:54 +08:00
zhongqiangchen
5236df68af
[CELEBORN-292] optimize mappartitionfilewriter flushing index and reading data header ( #1225 )
2023-02-17 13:42:28 +08:00
Ethan Feng
1dcfdb0c8f
[CELEBORN-281] Add metrics about buffer stream read buffer. ( #1216 )
2023-02-17 11:20:07 +08:00
Keyong Zhou
89b4eab3b6
[CELEBORN-309] Fix some potential concurrent issues in InFlightRequestTracker ( #1243 )
2023-02-17 10:01:19 +08:00
Angerszhuuuu
57f775a7e9
[CELEBORN-273] Move push data timeout checker into TransportResponseHandler to keep callback status consistence ( #1208 )
2023-02-16 18:27:37 +08:00
Ethan Feng
a364fb27b2
[CELEBORN-282] Add BacklogAnnouncement RPC. ( #1217 )
2023-02-16 14:58:39 +08:00
Ethan Feng
534853bf8a
[CELEBORN-278] Add openStreamWithCredit RPC. ( #1214 )
2023-02-16 14:07:13 +08:00
jiaoqingbo
bd9e0ddc1f
[CELEBORN-304] Missing setIfMissing celeborn.$module.io.serverThreads ( #1238 )
2023-02-15 15:49:08 +08:00
Rex(Hui) An
2068e6ae37
[CELEBORN-279] Add user level push data speed metric ( #1213 )
2023-02-13 12:04:44 +08:00
jiaoqingbo
3a92b0d911
[CELEBORN-284] fix typo in CelebornConf ( #1218 )
Co-authored-by: jiaoqb <jiaoqb@asiainfo.com>
2023-02-10 14:59:36 +08:00
Angerszhuuuu
dae58a664c
[CELEBORN-239][FOLLOWUP] PUSH_DATA_TIMEOUT_MASTER/SLAVE should support convert through RPC ( #1211 )
2023-02-08 17:16:29 +08:00
Rex(Hui) An
bff6e91e0b
[CELEBORN-227] Support different push strategies to control the push speed ( #1167 )
2023-02-07 14:24:30 +08:00
Rex(Hui) An
bb113ec9be
[CELEBORN-207] Support network congestion control ( #1066 )
2023-02-07 12:06:18 +08:00
Shuang
2634476758
[CELEBORN-267] reuse stream when client channel reconnected ( #1200 )
2023-02-03 15:12:45 +08:00
Angerszhuuuu
4b6f7e4593
[CELEBORN-239][IMPROVEMENT] Worker replicate should enable push data timeout too ( #1185 )
2023-02-03 11:53:15 +08:00
Angerszhuuuu
04427f2b16
[CELEBORN-247] Add metrics for each user's quota usage ( #1182 )
2023-02-02 18:31:08 +08:00
Ethan Feng
a43e3141bc
[CELEBORN-224][FOLLOWUP] Correct license and notices. ( #1189 )
2023-02-02 10:52:11 +08:00
Angerszhuuuu
98a5a3e16e
[CELEBORN-257][IMPROVEMENT] Avoid one hash searching when process message in TransportResponseHandler ( #1193 )
2023-02-01 14:59:53 +08:00
Angerszhuuuu
9ce48a648f
[CELEBORN-244][IMPROVEMENT] Separate outstandingPushes from outstandingRpcs ( #1190 )
2023-02-01 11:12:16 +08:00
Shuang
7162be2fae
[CELEBORN-201] Separate partitionLocationInfo in LifecycleManager and worker ( #1149 )
2023-01-31 18:53:36 +08:00
Rex(Hui) An
6e82e7dd6c
[CELEBORN-253][MINOR] Fix the wrongly resolve celeborn.ha.master.node.id issue if enable HA ( #1188 )
2023-01-31 15:39:58 +08:00
Angerszhuuuu
1311fb53d1
[CELEBORN-243][CELEBORN-245][IMPROVEMENT] Create push client failed and connection failed cause push failed should have their own ERROR type ( #1181 )
2023-01-30 17:47:22 +08:00
Angerszhuuuu
122da47815
[CELEBORN-241][IMPROVEMENT] limit inflight push timeout should > push data timeout ( #1179 )
2023-01-30 11:57:07 +08:00
Kaijie Chen
3da338a716
[CELEBORN-248] Non-ASCII characters in source code ( #1183 )
2023-01-29 21:07:41 +08:00
nafiy
d6d537df93
[CELEBORN-229][FOLLOWUP] Support collect metrics with customized labels ( #1174 )
2023-01-28 16:02:58 +08:00
Keyong Zhou
e47f1e33b0
[CELEBORN-55][FOLLOWUP] Code refine ( #1175 )
2023-01-20 16:22:47 +08:00
zy.jordan
c5be79ee3d
[CELEBORN-55][FEATURE] Split maxReqsInFlight limitation into level of target worker ( #1102 )
2023-01-20 10:18:45 +08:00
nafiy
e09b629da2
[CELEBORN-229][FEATURE] Support collect metrics with customized labels ( #1173 )
2023-01-19 11:59:48 +08:00
Kaijie Chen
2b6822e3c7
[CELEBORN-230] AppDiskUsageSnapShot overrides equals() without override hashCode() ( #1172 )
2023-01-18 17:21:32 +08:00
Ethan Feng
a239f9f284
[CELEBORN-228] Refactor PartitionFileSorter to avoid specific JDK dependency. ( #1168 )
2023-01-16 20:06:47 +08:00
zy.jordan
bb96700415
[CELEBORN-223] The default rpc thread num of pushServer/replicateServer/fetchServer should be the number of total of Flusher's thread ( #1163 )
2023-01-16 12:03:46 +08:00
Keyong Zhou
fa7ba43136
[CELEBORN-225] Add global default configuration for number of flusher… ( #1165 )
2023-01-14 13:20:44 +08:00
zhongqiangczq
411ab09ffb
[CELEBORN-158][Flink] Add ShuffleServiceFactory to Support MapPartition in … ( #1105 )
2023-01-13 16:38:46 +08:00
Shuang
810a8d01e0
[CELEBORN-212] refresh client if current client is inactive. ( #1159 )
2023-01-11 11:54:50 +08:00
Shuang
1332362bff
[CELEBORN-213] Add configuration for whether to close idle connections in client side ( #1157 )
2023-01-10 19:13:33 +08:00
zy.jordan
19197b9190
[CELEBORN-214] Push/Replicate/Fetch io threads default value is 16 ( #1158 )
2023-01-10 17:46:56 +08:00
Angerszhuuuu
e155ec122a
[CELEBORN-190] doPushMergedData should also support revive multiple times, not only twice ( #1136 )
2023-01-10 11:39:40 +08:00
Angerszhuuuu
0d5809ff0c
[CELEBORN-192][IMPROVEMENT] Change FAILED status to REQUEST_FAILED since it's all used when RPC request failed. ( #1139 )
2023-01-06 16:53:04 +08:00
Ethan Feng
5595f2f4b3
[CELEBORN-124]Add buffer stream. ( #1069 )
2023-01-06 15:54:52 +08:00
Angerszhuuuu
415452d9c4
[CELEBORN-189][IMPROVEMENT] PushDataFailedSlave should add slave worker to blacklist ( #1135 )
2023-01-05 20:12:07 +08:00
Fu Chen
ab449ffdd7
[CELEBORN-198] Fix the wrong configuration path of plugin protobuf-maven-plugin and … ( #1146 )
2023-01-05 20:09:31 +08:00
Cheng Pan
b8758a7cb6
[CELEBORN-181][TEST] Rename RssFunSuite to CelebornFunSuite ( #1125 )
2022-12-29 18:10:14 +08:00
RexAn
6432a129be
[CELEBORN-61][CELEBORN-62][FOLLOW_UP] Fix some issues for slow start ( #1119 )
2022-12-29 12:07:20 +08:00
Angerszhuuuu
b13ddac9d2
[CELEBORN-172][Refactor] Load/Make snapshot use Protobuf serde ( #1118 )
2022-12-29 11:51:14 +08:00
Angerszhuuuu
829f35c753
[CELEBORN-176][BUG] Fix wrong alternative conf of celeborn.worker.flusher.ssd.threads ( #1121 )
2022-12-29 11:11:20 +08:00
Angerszhuuuu
5603e62e95
[CELEBORN-174][REFACTOR] Move AppDiskUsage related to meta package ( #1117 )
2022-12-27 15:24:42 +08:00
Ethan Feng
3cdc25286d
[CELEBORN-165] Fix ut RetryCommitFilesTest failure. ( #1111 )
2022-12-22 11:39:40 +08:00
Ethan Feng
5aa959a335
[CELEBORN-157] Change prefix of configurations to celeborn. ( #1104 )
2022-12-21 15:17:28 +08:00
nafiy
f13dfb7421
[CELEBORN-113][FEATURE] Add metrics to monitor non-critical error number on local device ( #1100 )
2022-12-20 22:30:55 +08:00
Keyong Zhou
2f0682265e
[CELEBORN-119] Add timeout for pushdata ( #1097 )
2022-12-20 20:40:42 +08:00