Commit Graph

1099 Commits

Author SHA1 Message Date
zky.zhoukeyong
dab865b68b [CELEBORN-662] Report worker unavailable regardless graceful shutdown
### What changes were proposed in this pull request?
In this PR, the worker always reports the node as unavailable, regardless of whether graceful shutdown is turned on or off.

### Why are the changes needed?
To inform the master about the shutting-down worker as soon as possible.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1575 from waitinfuture/662.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-06-10 18:36:25 +08:00
Angerszhuuuu
6b725202a2 [CELEBORN-640][WORKER] DataPushQueue should not keep waiting take tasks
### What changes were proposed in this pull request?
In production we have repeatedly seen the push queue get stuck because a PushState's status was not removed.
This left DataPushQueue waiting indefinitely to take a task.

Although some of these bugs have been resolved, we'd better add a max wait time for taking tasks, since we already have the `PUSH_DATA_TIMEOUT` check. If the target worker is really stuck, the task can eventually be retried.
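
The idea of a bounded wait can be sketched as follows. This is a minimal illustration, not the code from this PR: the class `BoundedTake` and the helper `takeWithMaxWait` are hypothetical names, and a plain `LinkedBlockingQueue` stands in for the real push queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedTake {
    // Hypothetical helper: wait at most maxWaitMs for a task instead of blocking forever.
    // Returns null when nothing arrives in time, so the caller can re-check timeouts or retry.
    public static <T> T takeWithMaxWait(BlockingQueue<T> queue, long maxWaitMs) {
        try {
            return queue.poll(maxWaitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.offer("task-1");
        System.out.println(takeWithMaxWait(queue, 50)); // prints "task-1"
        System.out.println(takeWithMaxWait(queue, 50)); // prints "null" after about 50 ms
    }
}
```

With an unbounded `take()`, the second call would never return; with `poll` and a timeout, the caller regains control and can decide whether to retry.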

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1552 from AngersZhuuuu/CELEBORN-640.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
2023-06-09 14:06:47 +08:00
Angerszhuuuu
45503238b3 [CELEBORN-657][BUG] DataPushQueue return task should always remove iterator
### What changes were proposed in this pull request?
DataPushQueue should always remove the iterator when it returns a task.
Related to
251b923b5b
cb19ed1c66

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1568 from AngersZhuuuu/CELEBORN-657.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Shuang <lvshuang.tb@gmail.com>
2023-06-09 13:37:07 +08:00
Cheng Pan
588dbdfbe0
[CELEBORN-653][TEST] Fix invalid configuration key in SparkTestBase
### What changes were proposed in this pull request?

Dot is missing after `spark`

### Why are the changes needed?

Correct the configuration key.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes #1563 from pan3793/CELEBORN-653.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-09 10:31:25 +08:00
Cheng Pan
76533d7324
[CELEBORN-650][TEST] Upgrade scalatest and unify mockito version
### What changes were proposed in this pull request?

This PR upgrades

- `mockito` from 1.10.19 and 3.6.0 to 4.11.0
- `scalatest` from 3.2.3 to 3.2.16
- `mockito-scalatest` from 1.16.37 to 1.17.14

### Why are the changes needed?

Housekeeping, making test dependencies up-to-date and unified.

### Does this PR introduce _any_ user-facing change?

No, it only affects test.

### How was this patch tested?

Pass GA.

Closes #1562 from pan3793/CELEBORN-650.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-09 10:04:14 +08:00
Cheng Pan
6b64b1de9c
[CELEBORN-648][SPARK] Improve perf of SendBufferPool and logs about memory
### What changes were proposed in this pull request?

- Replace index-based item access with an iterator for LinkedList.
- Always try to remove a buffer if SendBufferPool does not have a matched candidate. This reduces the total number of buffers in the worst case from `capacity+N-1` to `capacity`.
- Some logs and code polish.
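
The first point above can be illustrated with a small sketch. It is not the actual `SendBufferPool` code: `LinkedListScan` and `acquire` are hypothetical names, and the pool is reduced to a `LinkedList<byte[]>`.

```java
import java.util.Iterator;
import java.util.LinkedList;

public class LinkedListScan {
    // Removes and returns the first buffer of exactly the requested size, or null if none matches.
    // The iterator walks the list once; calling pool.get(i) in a loop would traverse
    // the list again on every call, which is quadratic for a LinkedList.
    public static byte[] acquire(LinkedList<byte[]> pool, int size) {
        Iterator<byte[]> it = pool.iterator();
        while (it.hasNext()) {
            byte[] candidate = it.next();
            if (candidate.length == size) {
                it.remove(); // constant-time unlink at the current position
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LinkedList<byte[]> pool = new LinkedList<>();
        pool.add(new byte[4]);
        pool.add(new byte[8]);
        byte[] buffer = acquire(pool, 8);
        System.out.println(buffer.length + " " + pool.size()); // prints "8 1"
    }
}
```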

### Why are the changes needed?

Improve performance and logs, reduce memory consumption.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes #1560 from pan3793/CELEBORN-648.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-09 09:45:27 +08:00
Cheng Pan
0636e3ca40
[CELEBORN-654][SPARK] SortBasedShuffleWriter does not require mapStatusRecords in Spark 3
### What changes were proposed in this pull request?

`mapStatusRecords` is required in Spark 2 for constructing `MapStatus` when AQE is enabled, but not in Spark 3, so remove it to save memory and compute resources.

This PR also simplifies the `for loop` code.

### Why are the changes needed?

Remove unnecessary variables to save resources and clean up code.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes #1564 from pan3793/CELEBORN-654.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-09 09:43:08 +08:00
Cheng Pan
1ae8eb7145 [CELEBORN-655][SPARK] Rename newAppId to appUniqueId
### What changes were proposed in this pull request?

Rename variable `newAppId` to `appUniqueId` in Spark client.

### Why are the changes needed?

Make the variable name intuitive.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes #1565 from pan3793/CELEBORN-655.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-06-08 22:14:20 +08:00
onebox-li
0c869ac9a0
[CELEBORN-642] Improve metrics and update grafana
### What changes were proposed in this pull request?
Change in grafana

(ALL)
add:
JVMCPUTime
LastMinuteSystemLoad
AvailableProcessors
(For Master)
add:
LostWorkers
IsActiveMaster
PartitionSize
(For Worker)
add:
PushDataFailCount -> WriteDataFailCount
ReplicateDataFailCount
ReplicateDataWriteFailCount
ReplicateDataCreateConnectionFailCount
ReplicateDataConnectionExceptionCount
ReplicateDataTimeoutCount
SortedFileSize
PushDataHandshakeFailCount
RegionStartFailCount
RegionFinishFailCount
MasterPushDataHandshakeTime
SlavePushDataHandshakeTime
MasterRegionStartTime
SlaveRegionStartTime
MasterRegionFinishTime
SlaveRegionFinishTime
PotentialConsumeSpeed
UserProduceSpeed
WorkerConsumeSpeed
DeviceOSFreeBytes
DeviceCelebornFreeBytes
push usedHeapMemory/usedDirectMemory
fetch usedHeapMemory/usedDirectMemory
replicate usedHeapMemory/usedDirectMemory
remove:
dup ReserveSlotsTime

Change dashboard layout.

Fix support for multiple labels.

Modify some metrics docs.

### Why are the changes needed?
For better use of metrics.

### Does this PR introduce _any_ user-facing change?
The metrics below are renamed, and some values are extracted into labels.
DeviceOSFreeCapacity(B) -> DeviceOSFreeBytes
DeviceOSTotalCapacity(B) -> DeviceOSTotalBytes
DeviceCelebornFreeCapacity(B) -> DeviceCelebornFreeBytes
DeviceCelebornTotalCapacity(B) -> DeviceCelebornTotalBytes
push usedHeapMemory/usedDirectMemory
fetch usedHeapMemory/usedDirectMemory
replicate usedHeapMemory/usedDirectMemory

### How was this patch tested?
Cluster test.

Closes #1557 from onebox-li/improve-metrics.

Authored-by: onebox-li <lyh-36@163.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-08 18:10:06 +08:00
Angerszhuuuu
2f054cd7d5 [CELEBORN-647][BUG] Fix potential NPE when remove push status
### What changes were proposed in this pull request?
Fix potential NPE when remove push status

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1559 from AngersZhuuuu/CELEBORN-647.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
2023-06-08 17:22:48 +08:00
Cheng Pan
db66670253
[CELEBORN-649][BUILD] Speed up make-distribution.sh
### What changes were proposed in this pull request?

This PR aims to improve `build/make-distribution.sh` by

- skip building javadoc and source artifacts
- skip building unnecessary modules
- allow skipping client modules

### Why are the changes needed?

Speed up the packaging process.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested with

```
build/make-distribution.sh
```

```
build/make-distribution.sh -Pspark-3.3
```

```
build/make-distribution.sh -Pflink-1.17
```

Closes #1561 from pan3793/CELEBORN-649.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-08 15:09:56 +08:00
zhongqiang.czq
586785c88d [CELEBORN-617][FLINK] MapPartitionFileWriter updates flushing file length

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1519 from zhongqiangczq/mapfilelength.

Authored-by: zhongqiang.czq <zhongqiang.czq@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-06-08 10:47:36 +08:00
Cheng Pan
5bc37f1286
[CELEBORN-637] Remove support for rss.* configuration alias
### What changes were proposed in this pull request?

Remove support for `rss.*` configuration alias

### Why are the changes needed?

The legacy `rss.*` configuration alias was added during Celeborn entering Apache Incubator, to simplify users' migration from RSS to Celeborn.

Lots of configuration changes happened after Celeborn 0.2, and the `rss.*` configuration alias became less helpful, so remove it to clean up the code.

### Does this PR introduce _any_ user-facing change?

Yes, but it's expected, the `rss.*` compatibility has never been documented.

### How was this patch tested?

Pass GA.

Closes #1547 from pan3793/CELEBORN-637.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-07 22:28:36 +08:00
Shuang
2711b8253a
[CELEBORN-641] Upgrade flink scala version to 2.12.15
### What changes were proposed in this pull request?
Use scala 2.12.15 as default scala version for flink.

### Why are the changes needed?
There is a serialization incompatibility between Scala 2.12.7 and Scala 2.12.15/Scala 2.11.12. When different Scala versions are used, the generated serialVersionUID differs, so we may hit deserialization problems in client/server RPC. See [this Scala Users thread](https://users.scala-lang.org/t/serialversionuid-change-between-scala-2-12-6-and-2-12-7/3478/3).

![image](https://github.com/apache/incubator-celeborn/assets/28799061/19ddd25e-7db5-458d-95d0-bc6ab66cd40b)
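
For context, a class that declares its own `serialVersionUID` does not depend on the compiler-generated value. The sketch below shows this in plain Java. It is an illustration of the mechanism only, not what this PR does (the PR aligns the Scala version), and `StableUid`, `Message`, and `uidOf` are hypothetical names. In Scala the analogous mechanism is the `@SerialVersionUID` annotation.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class StableUid {
    // Hypothetical RPC message with an explicitly pinned serialVersionUID.
    public static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String body;

        public Message(String body) {
            this.body = body;
        }
    }

    // Returns the UID that Java serialization will use for the given Serializable class.
    public static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Message.class)); // prints 1
    }
}
```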

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested a Flink Scala 2.12.7 runtime with the Celeborn Flink client compiled with Scala 2.12.15.

Closes #1553 from RexXiong/CELEBORN-641.

Authored-by: Shuang <lvshuang.tb@gmail.com>
Signed-off-by: Ethan Feng <ethanfeng@apache.org>
2023-06-07 20:46:10 +08:00
Angerszhuuuu
d4cb6dd8ab [CELEBORN-645][REFACTOR] Refine logic about handle HeartbeatFromWorkerResponse
### What changes were proposed in this pull request?
Refine the logic here to make it easier to understand.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1555 from AngersZhuuuu/CELEBORN-645.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
2023-06-07 16:34:44 +08:00
Angerszhuuuu
9502b1f26d [CELEBORN-639][BUG] ShuffleClient get push exception cause should handle NPE
### What changes were proposed in this pull request?
If we meet some unexpected exception, `getPushDataFailCause` throws an NPE and breaks the process of reviving and removing push states. We should handle the NPE here.
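
A null-safe mapping of this kind might look like the sketch below. It assumes the cause is derived from an exception message that can be null. The `Cause` enum, the message prefixes, and `fromMessage` are hypothetical and do not reflect the real `getPushDataFailCause` signature.

```java
public class PushFailCause {
    // Hypothetical failure categories, for illustration only.
    public enum Cause { TIMEOUT, CONNECTION, UNKNOWN }

    // Maps an exception message to a cause without ever dereferencing null.
    public static Cause fromMessage(String message) {
        if (message == null) {
            return Cause.UNKNOWN; // unexpected exceptions may carry no message
        }
        if (message.startsWith("TIMEOUT")) {
            return Cause.TIMEOUT;
        }
        if (message.startsWith("CONNECTION")) {
            return Cause.CONNECTION;
        }
        return Cause.UNKNOWN;
    }

    public static void main(String[] args) {
        Throwable unexpected = new RuntimeException(); // getMessage() returns null
        System.out.println(fromMessage(unexpected.getMessage())); // prints "UNKNOWN"
    }
}
```

Returning a fallback value keeps the follow-up steps (revive, remove push state) running even when the exception is of an unanticipated kind.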

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1551 from AngersZhuuuu/CELEBORN-639.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Angerszhuuuu <angers.zhu@gmail.com>
2023-06-07 15:15:58 +08:00
Cheng Pan
3c7d179e05
[CELEBORN-636] Replace SimpleDateFormat with FastDateFormat
### What changes were proposed in this pull request?

`SimpleDateFormat` is not thread-safe, replace it with a thread-safe `FastDateFormat`

### Why are the changes needed?

`FastDateFormat` is a fast and thread-safe version of `java.text.SimpleDateFormat`.
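
The pattern of sharing one thread-safe formatter across threads can be sketched as follows. To stay within the standard library, this example swaps in `java.time.format.DateTimeFormatter` (also immutable and thread-safe) instead of the `FastDateFormat` from Apache Commons Lang that the PR uses. `SharedFormatter` and `format` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class SharedFormatter {
    // One shared instance is safe here; a shared SimpleDateFormat would not be,
    // because it mutates internal state while formatting.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    public static String format(LocalDateTime time) {
        return FORMAT.format(time);
    }

    public static void main(String[] args) throws InterruptedException {
        LocalDateTime fixed = LocalDateTime.of(2023, 6, 6, 12, 59, 32);
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                if (!"2023-06-06 12:59:32".equals(format(fixed))) {
                    throw new IllegalStateException("corrupted output");
                }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(format(fixed)); // prints "2023-06-06 12:59:32"
    }
}
```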

### Does this PR introduce _any_ user-facing change?

Yes, it's a bug fix.

### How was this patch tested?

Manual review.

Closes #1545 from pan3793/CELEBORN-636.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Ethan Feng <ethanfeng@apache.org>
2023-06-06 12:59:32 +08:00
Ethan Feng
3bd232dda0
[CELEBORN-619][CORE][SHUFFLE][FOLLOWUP] Support enable DRA with Apache Celeborn
### What changes were proposed in this pull request?

Adapt Spark DRA patch for spark 3.4

### Why are the changes needed?

To support enabling DRA w/ Celeborn on Spark 3.4

### Does this PR introduce _any_ user-facing change?

Yes, this PR provides a DRA patch for Spark 3.4

### How was this patch tested?

Compiled with Spark 3.4

Closes #1546 from FMX/CELEBORN-619.

Lead-authored-by: Ethan Feng <ethanfeng@apache.org>
Co-authored-by: Ethan Feng <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: Ethan Feng <ethanfeng@apache.org>
2023-06-06 12:57:16 +08:00
Ethan Feng
76a42beab0
[CELEBORN-610][FLINK] Eliminate pluginconf and merge its content to CelebornConf
### What changes were proposed in this pull request?
`PluginConf` is hard to understand: it is not clear why Celeborn needs two config classes.

### Why are the changes needed?
Ditto.

### Does this PR introduce _any_ user-facing change?
NO.

### How was this patch tested?
UT.

Closes #1524 from FMX/CELEBORN-610.

Authored-by: Ethan Feng <ethanfeng@apache.org>
Signed-off-by: Ethan Feng <ethanfeng@apache.org>
2023-06-05 14:08:53 +08:00
zhongqiangchen
d396758646
[CELEBORN-634][LICENSE] Update LICENSE and NOTICE (#1541)
[CELEBORN-634] [LICENSE] assembly source and binary license and notice for the bundled dependencies
2023-06-05 13:55:16 +08:00
zhongqiangchen
98676cf79b
[CELEBORN-635] Exclude netty-handler-ssl-ocsp from netty dependency (#1544) 2023-06-05 13:54:33 +08:00
zwangsheng
5068d6e897
[CELEBORN-105][TEST] Kubernetes Integration Test
### What changes were proposed in this pull request?
Add Kubernetes Integration Test
- [x] test helm install deploy
- [ ] test shuffle

### Why are the changes needed?
Add integration test

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Ci test

Closes #1484 from zwangsheng/CELEBORN-105.

Authored-by: zwangsheng <2213335496@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-05 12:11:29 +08:00
xiyu.zk
82bdea7085 [CELEBORN-620] Fix columnar shuffle codegen exception
### What changes were proposed in this pull request?
Fix columnar shuffle codegen exception. This is a refactoring of #1523.

Closes #1543 from kerwin-zk/issue-620.

Authored-by: xiyu.zk <xiyu.zk@alibaba-inc.com>
Signed-off-by: xiyu.zk <xiyu.zk@alibaba-inc.com>
2023-06-05 12:05:06 +08:00
Angerszhuuuu
218bfc78a5
[CELEBORN-629][DOC] Add doc about enable rac-awareness
### What changes were proposed in this pull request?

Add doc about enabling rack-awareness

### Why are the changes needed?

Document new features.

### Does this PR introduce _any_ user-facing change?

Yes, the docs are updated.

### How was this patch tested?

<img width="1085" alt="Screenshot 2023-06-02 3.19.10 PM" src="https://github.com/apache/incubator-celeborn/assets/46485123/c8c51a4c-40be-40ea-befd-3c369b9f7600">

Closes #1536 from AngersZhuuuu/CELEBORN-629.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-05 10:28:26 +08:00
Ethan Feng
5600728149
[CELEBORN-619][CORE][SHUFFLE] Support enable DRA with Apache Celeborn
### What changes were proposed in this pull request?

Adapt Spark DRA patch for spark 3.4

### Why are the changes needed?

To support enabling DRA w/ Celeborn on Spark 3.4

### Does this PR introduce _any_ user-facing change?

Yes, this PR provides a DRA patch for Spark 3.4

### How was this patch tested?

Compiled with Spark 3.4

Closes #1529 from FMX/CELEBORN-619.

Authored-by: Ethan Feng <ethanfeng@apache.org>
Signed-off-by: Ethan Feng <ethanfeng@apache.org>
2023-06-05 09:50:05 +08:00
Angerszhuuuu
3883fe2c80
[CELEBORN-623][FOLLUPUP] Refine doc about use ratis shell with RSS cluster
### What changes were proposed in this pull request?
Refine this doc since:

1. It didn't mention that our cluster's default RPC type is `NETTY`.
2. If the user uses the ratis shell with `GRPC` without knowing that the ratis cluster uses `NETTY`, the error is unclear and hard to debug.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1542 from AngersZhuuuu/CELEBORN-623-FOLLOWUP.

Lead-authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-02 22:09:05 +08:00
Angerszhuuuu
4df4775524
[CELEBORN-632][DOC] Add spark name space to spark specify properties (#1538) 2023-06-02 21:48:56 +08:00
liyihe
188b069710
[CELEBORN-623][DOCS] Document how to change RPC type in celeborn-ratis
### What changes were proposed in this pull request?
Ratis-shell uses GRPC by default, while Celeborn supports Netty for Ratis. If `raft.rpc.type` is not specified, commands may fail.
e.g.
```
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 14.947369960s. [closed=[], open=[[buffered_nanos=14962358255, waiting_for_connection]]]
```
So I think we should update the document to mention how to change the RPC type in `celeborn-ratis`.

### Why are the changes needed?

Improve user experience

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manual test

Closes #1530 from onebox-li/ratis-shell-default-rpc.

Lead-authored-by: liyihe <liyihe@bigo.sg>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-02 20:23:09 +08:00
Cheng Pan
007b716b64
[CELEBORN-633][INFRA] Introduce PR merge script
### What changes were proposed in this pull request?

Introduce PR merge script `dev/merge_pr.py`, which is borrowed from Apache Spark

### Why are the changes needed?

This script simplifies the PR merge procedure

- auto backport to release branches
- auto close the JIRA ticket
- auto fill in the JIRA fixed version
- preserve the PR description in git log
- preserve the author and committer in git log

### Does this PR introduce _any_ user-facing change?

No, it's for committers.

### How was this patch tested?

a1de16a80f was merged by this tool

Closes #1539 from pan3793/CELEBORN-633.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-02 19:52:04 +08:00
zwangsheng
67762783d0
[CELEBORN-628][HELM] Separate mount & host path on hostPath case
### What changes were proposed in this pull request?
Separate mount and volume paths in the Kubernetes case

### Why are the changes needed?
See detail in https://github.com/apache/incubator-celeborn/pull/1508#discussion_r1208803085

### Does this PR introduce _any_ user-facing change?
Yes

### How was this patch tested?
Local Test

> Values.yaml
```yaml
volumes:
  master:
    - mountPath: /mnt/rss_ratis
      hostPath: /spark/data1
      type: hostPath
      size: 1Gi
  worker:
    - mountPath: /mnt/disk1
      hostPath: /spark/data1
      type: hostPath
      size: 1Gi
    - mountPath: /mnt/disk2
      hostPath: /spark/data2
      type: hostPath
      size: 1Gi
```

>Celeborn Worker Pod
```yaml
containers:
  volumeMounts:
    - mountPath: /mnt/disk1
      name: celeborn-worker-vol-0
    - mountPath: /mnt/disk2
      name: celeborn-worker-vol-1
volumes:
  - hostPath:
      path: /spark/data1/worker
      type: DirectoryOrCreate
    name: celeborn-worker-vol-0
  - hostPath:
      path: /spark/data2/worker
      type: DirectoryOrCreate
    name: celeborn-worker-vol-1
```

Closes #1535 from zwangsheng/CELEBORN-628.

Authored-by: zwangsheng <2213335496@qq.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-02 19:47:13 +08:00
Shuang
a1de16a80f
[CELEBORN-626] Fix potential deadlock in filewriter
### What changes were proposed in this pull request?
Lock the flushBuffer field and the flush method to ensure thread-safe access.

### Why are the changes needed?
At stage end, the worker commits files and the file writers are closed, but a speculative task may still push data to a file writer. If the push task increments numPendingWrites, the commit thread, which holds the file writer's object lock, must wait for the pending writes to drop to 0. The push thread, however, needs the same object lock to decrement numPendingWrites, which causes a deadlock.
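
One way to avoid this kind of cycle is sketched below. It is an illustration under simplifying assumptions, not the actual patch: `FileWriterSketch` is a hypothetical class, the buffer is a `StringBuilder`, and writes that start after `close()` has passed its check are not rejected here. The point is only that the counter is decremented without the monitor that the closing thread might hold.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FileWriterSketch {
    private final AtomicInteger numPendingWrites = new AtomicInteger();
    private final Object flushLock = new Object(); // guards only flushBuffer
    private final StringBuilder flushBuffer = new StringBuilder();

    public void write(String data) {
        numPendingWrites.incrementAndGet();
        try {
            synchronized (flushLock) { // held briefly, only around the buffer update
                flushBuffer.append(data);
            }
        } finally {
            numPendingWrites.decrementAndGet(); // atomic, needs no monitor
        }
    }

    public String close() {
        // Wait for in-flight writes WITHOUT holding flushLock, so writers can finish.
        while (numPendingWrites.get() > 0) {
            Thread.yield();
        }
        synchronized (flushLock) {
            return flushBuffer.toString();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FileWriterSketch writer = new FileWriterSketch();
        Runnable pushA = () -> { for (int i = 0; i < 100; i++) writer.write("a"); };
        Runnable pushB = () -> { for (int i = 0; i < 100; i++) writer.write("b"); };
        Thread t1 = new Thread(pushA);
        Thread t2 = new Thread(pushB);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(writer.close().length()); // prints 200
    }
}
```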

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

Closes #1534 from RexXiong/CELEBORN-626.

Authored-by: Shuang <lvshuang.tb@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-02 17:47:39 +08:00
zhongqiang.czq
3d9a28a98d
[CELEBORN-630] Binary release artifact should package all versions of Spark and Flink clients

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1537 from zhongqiangczq/release-content.

Authored-by: zhongqiang.czq <zhongqiang.czq@alibaba-inc.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-06-02 17:40:41 +08:00
Angerszhuuuu
4f1ca8c960
[CELEBORN-621][BUG] Push merged data task timeout and mapended should also remove push states (#1526) 2023-06-02 14:04:39 +08:00
Angerszhuuuu
e18a5ea769
[CELEBORN-624] StorageManager should only remove expired app dirs (#1531) 2023-06-02 11:33:33 +08:00
Ethan Feng
d33916e571
[CELEBORN-625] Add a config to enable/disable UnsafeRow fast write. (#1532) 2023-06-01 20:55:45 +08:00
Angerszhuuuu
cf308aa057
[CLEBORN-595] Refine code frame of CelebornConf (#1525) 2023-06-01 10:37:58 +08:00
Binjie Yang
b785c7c565
[CELEBORN-612][HELM] Tackle hostPath directory permission (#1521) 2023-06-01 10:34:25 +08:00
ulysses
fa920ab0d5
Relax isRssEnabled condition (#1528)
Co-authored-by: youxiduo <youxiduo@corp.netease.com>
2023-05-31 15:26:05 +08:00
Angerszhuuuu
6d5dd50915
[CELEBORN-595][FOLLOWUP] Fix change version to 0.3.0. (#1522) 2023-05-30 20:12:56 +08:00
Angerszhuuuu
62681ba85d
[CELEBORN-595] Rename and refactor the configuration doc. (#1501) 2023-05-30 15:14:12 +08:00
zhongqiangchen
f117cff776
[CELEBORN-618] [FLINK] worker side adds partition split configuration options (#1520) 2023-05-30 14:13:31 +08:00
Angerszhuuuu
c4bff654b0
[CELEBORN-614] Simplify StorageManager's flushFileWriters to avoid too much cost on collection operation (#1517) 2023-05-30 11:38:05 +08:00
Binjie Yang
d30f45ad63
[CELEBORN-450][HELM] Configurable volumes in the values.yaml (#1508)
* [CELEBORN-450] Configure the mount & volume in the Values.yaml

* fix comments

* fix wrong name

* fix comments

* fix typo

* fix into array

* With User Note Comments

* fix comments

* Update charts/celeborn/templates/worker-statefulset.yaml

---------

Co-authored-by: Cheng Pan <pan3793@gmail.com>
2023-05-29 13:48:23 +08:00
Angerszhuuuu
07011f5a4d
[CELEBORN-601] Consolidate configsWithAlternatives with ConfigBuilder.withAlternative (#1506) 2023-05-28 09:13:05 +08:00
Shuang
2972c5f7d3
[CELEBORN-611] Improve log4j's configuration for deleting old log files when match the conditions. (#1516) 2023-05-25 20:51:53 +08:00
Cheng Pan
df385bedd3
[CELEBORN-608][BUILD] Exclude macOS fflags in make-distribution.sh (#1513) 2023-05-25 14:25:13 +08:00
Cheng Pan
c29f2f0aa8
[CELEBORN-605][BUILD] Remove redundant exclusions from hadoop-client-api (#1510) 2023-05-25 10:40:15 +08:00
Cheng Pan
a3ad8bbcd5
[CELEBORN-607] Simplify bootstrap scripts for adding --add-opens java opts (#1512) 2023-05-24 23:20:25 +08:00
Ethan Feng
4ee7d9eba8
[CELEBORN-597][FLINK] Support flink floating buffer for input gate and output gate. (#1503) 2023-05-24 23:15:57 +08:00
Cheng Pan
ef8e556202
[CELEBORN-604][SPARK] Support Spark 3.4 (#1509) 2023-05-24 23:10:13 +08:00