1099 Commits

Author SHA1 Message Date
zhongqiangczq
843618877b
[CELEBORN-10] Add Message Support MapPartition (#977) 2022-11-20 20:47:06 +08:00
Gabriel
5ecb09d62a
[ISSUE-911] Decrease numConnectionsPerPeer to achieve better performance (#983) 2022-11-20 11:46:17 +08:00
Shuang
fb6d1de108
[CELEBORN-8] [ISSUE-952][FEATURE] support register shuffle task in map partition mode (#973) 2022-11-16 21:46:19 +08:00
Gabriel
a6e89f3b63
[CELEBORN-9] [ISSUE-861] Support multiple JDK version build (#974) 2022-11-16 16:38:51 +08:00
zhongqiangczq
7adcb5b933
[CELEBORN-6] [REFACTOR] PushDataHandler code refactor (#966) 2022-11-16 11:04:24 +08:00
Ethan Feng
98864889c6
[CELEBORN-5] Update README for jira and slack. (#972) 2022-11-15 18:42:36 +08:00
Gabriel
0b78cbfee0
[COMMUNITY] Update README (#971) 2022-11-15 16:10:02 +08:00
Zouxxyy
37c7525b8d
[CELEBORN-1] Test celeborn jira (#969) 2022-11-15 10:03:26 +08:00
Cheng Pan
df7cb8550b
[INFRA] Inroduce checkout_pr.sh shell script (#968) 2022-11-14 22:28:43 +08:00
Cheng Pan
0d1247306f
[INFRA] Setup .asf.yaml (#967)
* [INFRA] Setup .asf.yaml

* nit
2022-11-14 22:27:12 +08:00
nafiy
529bb22781
[ISSUE-958][REFACTOR] Add and modify log of fallback policy (#965) 2022-11-14 20:16:33 +08:00
nafiy
e33139a169
[ISSUE-948][REFACTOR] Replace userResourceConsumption of WorkerInfo with empty value for unnecessary ControlMessages (#956) 2022-11-14 12:16:38 +08:00
Angerszhuuuu
64e8ebf158
[ISSUE-925][FOLLOWUP] Refactor class name of RetryingChunkReceiveCallback (#954) 2022-11-11 14:00:47 +08:00
leesf
0b8376e2c7
Cleanup some code (#943) 2022-11-11 13:58:39 +08:00
Ethan Feng
6f043f8ae9
[BUG] Fix fetch incorrect data chunk (#926) 2022-11-09 22:31:39 +08:00
Cheng Pan
1b2ad16b94
Exclude unused files from Spark shaded client (#942) 2022-11-09 11:20:33 +08:00
Angerszhuuuu
827ba9e0f7
[ISSUE-939][REFACTOR] Bump up ratis to 2.4.0 (#940) 2022-11-08 15:12:00 +08:00
Kerwin Zhang
b052a94516
[FEATURE] Optimize columnar shuffle writer performance without encoding (#936) 2022-11-08 13:58:46 +08:00
leesf
aac68c3571
Rename RssException to CelebornException (#938) 2022-11-08 10:08:21 +08:00
leesf
496f44eda4
Shutdown worker if initialized failed. (#931) 2022-11-07 19:33:35 +08:00
Angerszhuuuu
99a7b85708
[ISSUE-932][REFACTOR] Device check should not directly reportError (#933)
* [ISSUE-932][REFACTOR] Device check should not directly reportError
2022-11-07 15:15:08 +08:00
nafiy
11081eac6c
[ISSUE-879][BUG] When notifyError, should destroy corresponding file writers (#912)
* [ISSUE-879][BUG] When notifyError, should destroy corresponding file writers
2022-11-07 14:01:51 +08:00
Angerszhuuuu
100e0057e8
[ISSUE-921][BUG] Flush Error should report non critical error (#928) 2022-11-07 11:56:11 +08:00
leesf
3699683a3b
Fix and migrate some configs (#927) 2022-11-07 09:41:38 +08:00
Kerwin Zhang
db08d49032
[FEATURE] Support columnar shuffle codegen (#915) 2022-11-04 20:54:13 +08:00
Binjie Yang
25a8d78634
remove dup affinity (#923) 2022-11-04 20:46:05 +08:00
Angerszhuuuu
bd7be934a2
[ISSUE-902][FOLLOWUP] Complete Utils.toStatus (#922) 2022-11-04 16:38:51 +08:00
Angerszhuuuu
38e15d89e6
[ISSUE-902][IMPROVEMENT][FOLLOWUP] LifecycleManager should reserve blacklist with irrecoverable status (#914) 2022-11-04 15:54:45 +08:00
Angerszhuuuu
2e48cdfdf8
[ISSUE-918][BUG] Utils.toStatus code is not complete (#920) 2022-11-04 14:58:07 +08:00
Cheng Pan
b1c1961e60
Fix MasterNode rpc endpoint info (#916) 2022-11-03 21:02:31 +08:00
Angerszhuuuu
e68ca75a9e
[ISSUE-902][BUG] LifecycleManager should not reallocate slots in failed worker during retry (#906) 2022-11-02 21:07:28 +08:00
Ethan Feng
b06ab31cee
[Feature] Shade netty native libraries (#908)
* shade netty native libraries.

* To ensure netty use the correct native library.

* To ensure netty use the correct native library.

* update.
2022-11-02 19:02:03 +08:00
Angerszhuuuu
ea4ed10e5c
[ISSUE-901][BUG] During worker graceful shutdown, worker should report itself as unavailable and avoid master allocate slots on it. (#905) 2022-11-02 16:09:58 +08:00
leesf
f1694f3d20
[MINOR][CLEANUP] clean up some code in LifecycleManager and ShuffleClientImpl (#896) 2022-11-01 11:40:19 +08:00
Zhen Wang
643eb84541
[MINOR] Fix typo (#898) 2022-11-01 10:03:15 +08:00
nafiy
ce3dc889fa
[ISSUE-867][BUG] Create writer failed should report non-critical error instead of critical error (#883) 2022-10-31 21:23:16 +08:00
nafiy
5a0282f53e
[ISSUE-827][REFACTOR] Collect all PbSerDe methods into PbSerDeUtils and change PbSerDeUtils to scala code (#877) 2022-10-31 15:05:31 +08:00
nafiy
9b1c70f219
[ISSUE-880][BUG] onTrim when flushFileWriters() should catch each file writer's exception, avoid block flush all file writers (#894) 2022-10-31 14:31:22 +08:00
Binjie Yang
9ef4751d22
[ISSUE-882][REFACTOR][K8S] Refactor dockerfile to build docker image from binary instead of ADD tgz (#884)
* init

* fix
2022-10-31 10:45:00 +08:00
Cheng Pan
25fbedd2b1
Bump Spark from 3.3.0 to 3.3.1 (#892) 2022-10-29 19:56:16 +08:00
Cheng Pan
96aa5e5850
Introduce reflect helper tools (#891) 2022-10-29 19:55:13 +08:00
Angerszhuuuu
87fcfa767f
[ISSUE-887][REFACTOR] Configuration type convert to Enum (#888)
* [ISSUE-332][FOLLOWUP] Add deps in worker's pom

* [Refactor] Modify package name of utils to keep consistence

* [Refactor] Modify package name of utils to keep consistence

* [REFACTOR] Remove unused isRegistered in controller

* [ISSUE-887][REFACTOR] Configuration type convert to Enum

* update

* update

* Update RssShuffleManager.java
2022-10-29 13:41:06 +08:00
Binjie Yang
f51fae6c75
[REFACTOR] Replace the missing Remote Shuffle Service (#885) 2022-10-28 17:37:59 +08:00
Cheng Pan
d7be6006e7
Migrate network related conf to structured conf system (#875)
* Migrate network related conf to structured conf system

* migrate

* fix

* fix

* worker

* fix

* nit

* review

* nit
2022-10-28 10:45:52 +08:00
Cheng Pan
65614edfbb
[BUILD] Create shaded module for Spark client (#878) 2022-10-27 22:11:54 +08:00
Angerszhuuuu
f9ecde3b2b
[ISSUE-863][BUG]LifecycleManager should ignore change partition request when shuffle ended and not remove workersnapshot when commit success (#864) 2022-10-27 22:04:18 +08:00
Angerszhuuuu
d283cca4e1
[ISSUE-869][REFACTOR] Migrate partition size/sorter related conf to Celeborn ConfigEntity (#870) 2022-10-27 16:49:55 +08:00
Angerszhuuuu
26dcc118c6
[ISSUE-871][REFACTOR] Migrate Worker conf to Celeborn Configuration System (#873)
* [ISSUE-871][REFACTOR] Migrate Worker conf to Celeborn Configuration System
2022-10-27 15:35:29 +08:00
Angerszhuuuu
5333819cb0
[ISSUE-866][BUG] Create File twice should show clear log (#876) 2022-10-27 14:52:45 +08:00
Angerszhuuuu
399236c880
[ISSUE-849][REFACTOR] Migrate master and common Celeborn Configuration System (#850) 2022-10-26 17:09:27 +08:00