Cheng Pan
9b6ec58e2a
Add profile for Spark 3.2/3.3 ( #380 )
2022-08-17 22:27:43 +08:00
AngersZhuuuu
e1ece6123d
[ISSUE-364][FEATURE] Worker should reject ReserveSlots request when shutting down ( #365 )
2022-08-17 21:12:53 +08:00
Cheng Pan
bb0c9b21fc
[ISSUE-350] Rewrite RssShuffleManager using Java to pass compile on Spark 3.1+ ( #370 )
2022-08-17 15:59:50 +08:00
Ethan Feng
b3acb740e9
[Feature] choose usable disk while reserving slots. ( #372 )
2022-08-17 15:56:20 +08:00
Keyong Zhou
f0794dcb9a
Add DOC template ( #373 )
2022-08-17 11:36:43 +08:00
Keyong Zhou
8207b6bff7
[ISSUE-368] Not split if file size is smaller than minimum threshold; fix concurrent issues ( #369 )
2022-08-16 23:15:43 +08:00
nafiy
96b14e2205
[ISSUE-304][BUG]HA port being occupied makes master cannot normally launch ( #317 )
2022-08-16 20:37:01 +08:00
Cheng Pan
f1f4b894af
Build: Enhance build system ( #349 )
2022-08-15 14:59:01 +08:00
AngersZhuuuu
ba41a2c2e8
[ISSUE-357][REFACTOR] Remove unused handleStageEnd ( #358 )
2022-08-15 12:26:15 +08:00
AngersZhuuuu
9477b9cf24
[ISSUE-321][Feature] Worker trigger shutdown hook when run command stop-worker.sh ( #324 )
2022-08-15 12:20:35 +08:00
Keyong Zhou
937ac54e7c
[ISSUE-351] Trigger split when reaching disk space limitation ( #356 )
2022-08-15 00:24:25 +08:00
Keyong Zhou
e372464187
[ISSUE-273][FOLLOW-UP] Only count avgTime if flush count exceeds mini… ( #355 )
2022-08-14 22:56:25 +08:00
Keyong Zhou
2b5e997ec3
[ISSUE-353] 1.Only remove dirOperator for IOHang bug; 2.Fix NPE when convert DestroyResponse to PB ( #354 )
2022-08-14 19:34:23 +08:00
Keyong Zhou
c2672c2d9d
[ISSUE-273][FOLLOW-UP] 1.Heartbeat use workerInfo's diskInfos instead… ( #352 )
2022-08-14 16:54:08 +08:00
Keyong Zhou
20a3ba4e56
[ISSUE-273][FOLLOW-UP] Merge MountInfo with DiskInfo ( #348 )
2022-08-13 22:58:13 +08:00
Keyong Zhou
9516a63eb5
[ISSUE-273][FOLLOW-UP] Remove duplicate handleWorkerHeartBeat ( #347 )
2022-08-13 18:32:47 +08:00
Keyong Zhou
89903d162e
[ISSUE-273][FOLLOW-UP] Add minimum partition size threshold for parti… ( #346 )
2022-08-13 18:29:43 +08:00
Keyong Zhou
766b3118d7
[ISSUE-273][FOLLOW-UP] Fix device stat file not found exception ( #345 )
2022-08-13 16:47:25 +08:00
Keyong Zhou
6d1a2db663
[ISSUE-273][FOLLOW-UP] Fix IndexOutOfBoundsException when release slots ( #344 )
```
java.lang.IndexOutOfBoundsException: Index: 2, Size: 1
    at java.util.ArrayList.rangeCheck(ArrayList.java:659)
    at java.util.ArrayList.get(ArrayList.java:435)
    at com.aliyun.emr.rss.service.deploy.master.clustermeta.AbstractMetaManager.updateReleaseSlotsMeta(AbstractMetaManager.java:104)
    at com.aliyun.emr.rss.service.deploy.master.clustermeta.SingleMasterMetaManager.handleReleaseSlots(SingleMasterMetaManager.java:53)
    at com.aliyun.emr.rss.service.deploy.master.Master.handleReleaseSlots(Master.scala:456)
    at com.aliyun.emr.rss.service.deploy.master.Master$$anonfun$receiveAndReply$1.$anonfun$applyOrElse$12(Master.scala:189)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.aliyun.emr.rss.service.deploy.master.Master.executeWithLeaderChecker(Master.scala:156)
    at com.aliyun.emr.rss.service.deploy.master.Master$$anonfun$receiveAndReply$1.applyOrElse(Master.scala:189)
    at com.aliyun.emr.rss.common.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:110)
    at com.aliyun.emr.rss.common.rpc.netty.Inbox.safelyCall(Inbox.scala:214)
    at com.aliyun.emr.rss.common.rpc.netty.Inbox.process(Inbox.scala:107)
    at com.aliyun.emr.rss.common.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:222)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
```
2022-08-13 12:44:03 +08:00
Cheng Pan
08f647ea3e
Remove unexpected log4j configuration file from package jar and minor improvement ( #331 )
2022-08-13 11:10:48 +08:00
AngersZhuuuu
221b068773
[ISSUE-327][REFACTOR] Remove unused logAvailableFlushBuffersTask ( #328 )
2022-08-13 11:09:59 +08:00
Ethan Feng
f3bcb7f6a8
[ISSUE-146]update slots distribution mechanism ( #273 )
2022-08-12 23:38:19 +08:00
Keyong Zhou
d166e042be
[ISSUE-329] Should not sleep if reserve slots successfully in reserveSlotsWithRetry ( #330 )
2022-08-12 12:27:27 +08:00
Keyong Zhou
46cbe4fb04
[ISSUE-288] fix netty memory leak( #310 )
2022-08-12 00:49:42 +08:00
Keyong Zhou
d19b475500
[ISSUE-325] Log error msg for ChunkFetchFailure ( #326 )
2022-08-12 00:18:52 +08:00
Binjie Yang
2137e84ab7
add rss version ( #316 )
2022-08-10 21:03:46 +08:00
Binjie Yang
8ececd60a6
fix ( #314 )
2022-08-10 16:37:43 +08:00
Fan Yilun
7a858bd70b
[DOC] Add doc and helm chart to describe how to deploy on k8s ( #309 )
2022-08-09 21:15:15 +08:00
nafiy
50c0399940
[ISSUE-308][DOC] Fix the wrong configuration doc of rss.ha.port and provide example in readme. ( #311 )
2022-08-08 17:32:47 +08:00
nafiy
eeda030599
Add metrics for marking active master ( #307 )
2022-08-07 18:00:49 +08:00
AngersZhuuuu
a30fb5b3b0
[ISSUE-297] NPE in TaskCompletionListener due to Spark OOM in ShuffleInMemorySorter causing tasks to hang ( #298 )
2022-08-04 22:47:04 +08:00
Cheng Pan
d01ee81ee6
Bump Ratis 2.3.0 and related toolchains ( #299 )
2022-08-04 21:59:42 +08:00
Cheng Pan
efdf283918
ShuffleBlockInfo should be static inner class ( #301 )
2022-08-04 21:50:58 +08:00
Cheng Pan
ce147a45e1
Bump maven 3.6.3 ( #300 )
2022-08-04 21:05:21 +08:00
AngersZhuuuu
cf2b895afb
[ISSUE-293][REFACTOR] Init worker rpc endpoint and reserve slot in parallel to speed up register shuffle process ( #294 )
2022-08-03 20:00:30 +08:00
Ethan Feng
594feab279
fix unexpected index update. ( #295 )
2022-08-02 18:16:26 +08:00
AngersZhuuuu
e57ad27887
[ISSUE-291][REFACTOR] When worker endpoint initializing failed, print clear warning log ( #292 )
2022-08-02 12:03:59 +08:00
zky.zhoukeyong
67ac3c029c
[DOC] Fix typo
2022-08-01 22:15:09 +08:00
AngersZhuuuu
c3cb2b9fc0
[ISSUE-289][BUG] Write data should use default object class to avoid NPE ( #290 )
2022-08-01 16:02:29 +08:00
dxheming
8e3f48ec12
Refactor deprecated netty ConcurrentSet ( #285 )
2022-07-27 20:35:46 +08:00
Ethan Feng
82f0475d9b
[BUG] Fix reserve slots failed due to take buffer stuck ( #283 )
2022-07-26 18:20:39 +08:00
AngersZhuuuu
7a760466aa
[ISSUE-281][BUG] Use correct maxDestLength to check if buffer can satisfy compress result ( #282 )
2022-07-26 15:56:05 +08:00
AngersZhuuuu
9324b1e89a
[ISSUE-257][FEATURE] Reserve slots support customized retry times ( #258 )
2022-07-26 15:23:25 +08:00
zky.zhoukeyong
457f5874a2
Delete System.out.println
2022-07-25 20:03:21 +08:00
AngersZhuuuu
fe17914942
Refactor pom import issue ( #277 )
2022-07-25 17:49:55 +08:00
Keyong Zhou
e11af5d948
Support passed-in buffer supplier for FrameDecoder ( #278 )
2022-07-25 16:46:29 +08:00
Ethan Feng
cb42b2fa5c
[BUG] multi-thread flusher causes data inconsistent with chunk offsets ( #275 )
2022-07-23 11:17:38 +08:00
Keyong Zhou
ebadb13070
[ISSUE-269] Remove unused inceptor in TransportFrameDecoder ( #270 )
2022-07-17 17:15:35 +08:00
Keyong Zhou
6442f38a33
[ISSUE-267] Extend API to support more partition types: MapPartition,… ( #268 )
2022-07-17 16:28:37 +08:00
Keyong Zhou
56a0b9072b
[ISSUE-261] Refine message class hierarchy ( #266 )
2022-07-16 17:00:09 +08:00