27 Commits

Author SHA1 Message Date
sychen
b94fea8e17 [CELEBORN-1207] SBT http repository documentation
### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #2201 from cxzl25/CELEBORN-1207.

Lead-authored-by: sychen <sychen@ctrip.com>
Co-authored-by: cxzl25 <3898450+cxzl25@users.noreply.github.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-01-02 22:12:28 +08:00
Fu Chen
41df4ebbea [CELEBORN-1156][BUILD] SBT publish support
### What changes were proposed in this pull request?

As title

### Why are the changes needed?

As title

### Does this PR introduce _any_ user-facing change?

Yes, users can publish the shaded clients via SBT.

### How was this patch tested?

```shell
docker run -d -p 8081:8081 sonatype/nexus3
```

```shell
export SONATYPE_SNAPSHOTS_URL=http://192.168.3.46:8081/repository/maven-snapshots/
export SONATYPE_RELEASES_URL=http://192.168.3.46:8081/repository/maven-releases/
export ASF_USERNAME=admin
export ASF_PASSWORD=123456
```
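
Before running the publish commands, it may help to confirm that the variables above are actually set. The following is a minimal sketch, not part of the patch: the helper name `check_publish_env` is hypothetical, and it assumes the four variables exported above are the only ones the publish step needs.

```shell
# Hypothetical helper (not part of the Celeborn build): report any
# publish-related variable from the block above that is unset or empty.
check_publish_env() {
  missing=0
  for name in SONATYPE_SNAPSHOTS_URL SONATYPE_RELEASES_URL ASF_USERNAME ASF_PASSWORD; do
    # Indirect lookup through eval keeps the function POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when every variable is set, 1 otherwise.
  return "$missing"
}
```

Calling `check_publish_env && ./build/sbt ...` would then skip the publish step when a variable is missing.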

- Publish the shaded client for Spark 3.4:
```shell
./build/sbt -Pspark-3.4 celeborn-client-spark-3-shaded/publish
```

<img width="1673" alt="Screenshot 2023-12-08 10.22.07 PM" src="https://github.com/apache/incubator-celeborn/assets/8537877/1e87e7e2-cf3b-4bc0-8272-0f5b03ee65bf">

- Publish the shaded client for Flink 1.18:

```shell
./build/sbt -Pflink-1.18 celeborn-client-flink-1_18-shaded/publish
```
<img width="1676" alt="Screenshot 2023-12-08 10.25.28 PM" src="https://github.com/apache/incubator-celeborn/assets/8537877/62d0c3c4-e105-4e8a-8d8d-e78650a2eb09">

- Publish the shaded client for MapReduce:
```shell
./build/sbt -Pmr celeborn-client-mr-shaded/publish
```
<img width="1672" alt="Screenshot 2023-12-08 10.25.47 PM" src="https://github.com/apache/incubator-celeborn/assets/8537877/563d5ad5-fa6d-46fc-9465-8279ef96385a">

Closes #2129 from cfmcgrady/sbt-publish.

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Fu Chen <cfmcgrady@gmail.com>
2023-12-15 11:22:35 +08:00
mingji
113311df3e [CELEBORN-1081][FOLLOWUP] Remove UNKNOWN_DISK and allocate all slots to disk
### What changes were proposed in this pull request?
1. Remove UNKNOWN_DISK from StorageInfo.
2. Enable load-aware slots allocation when there is HDFS.

### Why are the changes needed?
To support application-level configuration of the available storage types.

### Does this PR introduce _any_ user-facing change?
no.

### How was this patch tested?
GA and Cluster.

Closes #2098 from FMX/B1081-1.

Authored-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: Shuang <lvshuang.tb@gmail.com>
2023-11-28 11:26:00 +08:00
jiaoqingbo
39153c8c2d [MINOR] Updated sbt.md documentation to be consistent with description
### What changes were proposed in this pull request?

Add the `--release` parameter so that the documented command creates a Celeborn distribution like those distributed on the Celeborn Downloads page.

### Why are the changes needed?

Without the `--release` parameter, the created Celeborn distribution differs from the one on the Celeborn Downloads page and lacks the client-related packages.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

PASS GA

Closes #2080 from jiaoqingbo/minor-sbt.

Authored-by: jiaoqingbo <1178404354@qq.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-11-08 21:07:43 +08:00
sychen
efa22a4936 [CELEBORN-1105][FLINK] Support Flink 1.18
### What changes were proposed in this pull request?

### Why are the changes needed?

```bash
flink-1.18.0
./bin/start-cluster.sh
./bin/flink run examples/streaming/WordCount.jar --execution-mode BATCH
```

```java
Caused by: java.lang.NoSuchMethodError: org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.<init>(Ljava/lang/String;ILorg/apache/flink/runtime/jobgraph/IntermediateDataSetID;Lorg/apache/flink/runtime/io/network/partition/ResultPartitionType;Lorg/apache/flink/runtime/executiongraph/IndexRange;ILorg/apache/flink/runtime/io/network/partition/PartitionProducerStateProvider;Lorg/apache/flink/util/function/SupplierWithException;Lorg/apache/flink/runtime/io/network/buffer/BufferDecompressor;Lorg/apache/flink/core/memory/MemorySegmentProvider;ILorg/apache/flink/runtime/throughput/ThroughputCalculator;Lorg/apache/flink/runtime/throughput/BufferDebloater;)V
	at org.apache.celeborn.plugin.flink.RemoteShuffleInputGate$FakedRemoteInputChannel.<init>(RemoteShuffleInputGate.java:225)
	at org.apache.celeborn.plugin.flink.RemoteShuffleInputGate.getChannel(RemoteShuffleInputGate.java:179)
	at org.apache.flink.runtime.io.network.partition.consumer.InputGate.setChannelStateWriter(InputGate.java:90)
	at org.apache.flink.runtime.taskmanager.InputGateWithMetrics.setChannelStateWriter(InputGateWithMetrics.java:120)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.injectChannelStateWriterIntoChannels(StreamTask.java:524)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:496)
```

Flink 1.18.0 release
https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/

The interface `org.apache.flink.runtime.io.network.buffer.Buffer` adds a `setRecycler` method.
[[FLINK-32549](https://issues.apache.org/jira/browse/FLINK-32549)][network] Tiered storage memory manager supports ownership transfer for buffers

The `org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate` constructor adds parameters.
[[FLINK-31638](https://issues.apache.org/jira/browse/FLINK-31638)][network] Introduce the TieredStorageConsumerClient to SingleInputGate
[[FLINK-31642](https://issues.apache.org/jira/browse/FLINK-31642)][network] Introduce the MemoryTierConsumerAgent to TieredStorageConsumerClient

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
```bash
flink-1.18.0 ./bin/flink run examples/streaming/WordCount.jar --execution-mode BATCH
Executing example with default input data.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Job has been submitted with JobID d7fc5f0ca018a54e9453c4d35f7c598a
Program execution finished
Job with JobID d7fc5f0ca018a54e9453c4d35f7c598a has finished.
Job Runtime: 1635 ms
```

<img width="1297" alt="image" src="https://github.com/apache/incubator-celeborn/assets/3898450/6a5266bf-2386-4386-b98b-a60d2570fa99">

Closes #2063 from cxzl25/CELEBORN-1105.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Shuang <lvshuang.tb@gmail.com>
2023-11-06 15:53:39 +08:00
sychen
e437228dc8 [CELEBORN-1104][DOC] Fix SBT documentation incorrect command
### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #2062 from cxzl25/CELEBORN-1104.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-11-01 17:00:09 +08:00
SteNicholas
f61fe17551 [CELEBORN-987][FOLLOWUP][DOC] README#Build and sbt#System Requirements should extend to Scala 2.13 and Spark 3.5
### What changes were proposed in this pull request?

`README#Build` and `sbt#System Requirements` extend to Scala 2.13.

### Why are the changes needed?

`README#Build` and `sbt#System Requirements` should extend to Scala 2.13 to align with the SBT CI test results.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

SBT CI tests.

Closes #1987 from SteNicholas/CELEBORN-987.

Authored-by: SteNicholas <programgeek@163.com>
Signed-off-by: Fu Chen <cfmcgrady@gmail.com>
2023-10-14 09:54:22 +08:00
onebox-li
a47f6169d8 [MINOR] Fix some typos
### What changes were proposed in this pull request?
Fix some typos

### Why are the changes needed?
Ditto

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
-

Closes #1983 from onebox-li/fix-typo.

Authored-by: onebox-li <lyh-36@163.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-10-12 20:34:07 +08:00
mingji
95c9ccfc3e [CELEBORN-1010] Update docs about spark.shuffle.service.enabled
### What changes were proposed in this pull request?
Clarify a Spark config that affects how Spark works with Celeborn.

### Why are the changes needed?
After some tests, I found that Spark 3.1 and newer can work with Celeborn when `spark.shuffle.service.enabled=true`.

Since Spark 3.1, `ExternalShuffleBlockResolver` no longer checks the shuffle manager's type.

### Does this PR introduce _any_ user-facing change?
NO.

### How was this patch tested?
I tested two scenarios for this PR.
1. Check whether Spark can release the executors in time.
2. Check data correctness by running TPC-DS.
All checks are good.

Closes #1955 from FMX/CELEBORN-1010.

Authored-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-10-08 09:15:42 +08:00
jiaoqingbo
f1713dacaf [MINOR] Fix incorrect default resume ratio in trafficcontrol doc
### What changes were proposed in this pull request?

As Title

### Why are the changes needed?

Since 0.3.1, Celeborn changed the default value of `celeborn.worker.directMemoryRatioToResume` from `0.5` to `0.7`.

The doc should be updated accordingly.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

PASS GA

Closes #1931 from jiaoqingbo/ratiofix.

Authored-by: jiaoqingbo <1178404354@qq.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-09-21 11:18:48 +08:00
sychen
bb50618780 [CELEBORN-997][DOC] Fix Rolling upgrade broken link
### What changes were proposed in this pull request?
https://celeborn.apache.org/docs/latest/developers/overview/

> For more details, please refer to Rolling upgrade

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Closes #1927 from cxzl25/CELEBORN-997.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
2023-09-20 16:44:42 +08:00
zhouyifan279
dc5bdfadcc [CELEBORN-923][DOC] docs/developers/overview.md has a broken link
### What changes were proposed in this pull request?
Fix a broken link in docs/developers/overview.md.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Locally tested.

Closes #1845 from zhouyifan279/upgrade-page-link.

Authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-08-28 12:07:43 +08:00
Fu Chen
efc334a6aa [CELEBORN-877][FOLLOWUP][DOC] Expand 'note' blocks by default in the docs sbt.md
### What changes were proposed in this pull request?

As title

### Why are the changes needed?

As title

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GA

Closes #1806 from cfmcgrady/sbt-docs-followup.

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-08-11 21:54:24 +08:00
Fu Chen
516bdc7e08 [CELEBORN-877][DOC] Document on SBT
### What changes were proposed in this pull request?

As title

### Why are the changes needed?

As title

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Manual test

Closes #1795 from cfmcgrady/sbt-docs.

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2023-08-11 12:17:55 +08:00
Kerwin Zhang
4fb3f31a2d [CELEBORN-870][FOLLOWUP][DOC] Document on usage together with Gluten (#1793)
2023-08-08 10:37:13 +08:00
xiyu.zk
35fe63e4a9 [CELEBORN-870][DOC] Document on usage together with Gluten
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1784 from kerwin-zk/gluten_celeborn.

Lead-authored-by: xiyu.zk <xiyu.zk@alibaba-inc.com>
Co-authored-by: Kerwin Zhang <xiyu.zk@alibaba-inc.com>
Co-authored-by: Keyong Zhou <zhouky@apache.org>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-08-04 11:32:13 +08:00
zky.zhoukeyong
3ee0674058 [CELEBORN-869][FOLLOWUP][DOC] Document on Integrating Celeborn
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1788 from waitinfuture/869-fu.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-08-02 18:17:17 +08:00
Keyong Zhou
8c473c038b [CELEBORN-869][DOC] Document on Integrating Celeborn
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1787 from waitinfuture/869.

Lead-authored-by: Keyong Zhou <waitinfuture@gmail.com>
Co-authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-08-02 17:22:41 +08:00
zky.zhoukeyong
bee8648421 [CELEBORN-864][DOC] Document on blacklist
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1782 from waitinfuture/864.

Lead-authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Co-authored-by: Keyong Zhou <waitinfuture@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-08-01 21:23:55 +08:00
zky.zhoukeyong
3593adf12d [CELEBORN-860][DOC] Document on ShuffleClient
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1778 from waitinfuture/860-1.

Lead-authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Co-authored-by: Keyong Zhou <waitinfuture@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-31 20:07:20 +08:00
zky.zhoukeyong
37a9c633b3 [CELEBORN-853][DOC] Document on LifecycleManager
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1775 from waitinfuture/853.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-31 17:36:42 +08:00
zky.zhoukeyong
b36ea39001 [CELEBORN-834][DOC] Add fault tolerant document
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1769 from waitinfuture/834.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-28 10:39:08 +08:00
zky.zhoukeyong
41509d6e7e [CELEBORN-849][DOC] Document on Master
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1772 from waitinfuture/849.

Lead-authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Co-authored-by: Keyong Zhou <waitinfuture@gmail.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-27 22:09:43 +08:00
zky.zhoukeyong
b8cdf36b40 [CELEBORN-831][DOC] Add traffic control document
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test.

Closes #1754 from waitinfuture/831.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-24 19:51:02 +08:00
zky.zhoukeyong
070d8bc0f8 [CELEBORN-826][DOC] Add storage document
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No.

Closes #1752 from waitinfuture/826.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-24 16:12:42 +08:00
zky.zhoukeyong
8e849645eb [CELEBORN-824][DOC] Add PushData document
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No.

Closes #1747 from waitinfuture/824.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-24 10:38:46 +08:00
zky.zhoukeyong
27521547f0 [CELEBORN-823][DOC] Add Celeborn architecture document
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No.

Closes #1746 from waitinfuture/823.

Authored-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
Signed-off-by: zky.zhoukeyong <zky.zhoukeyong@alibaba-inc.com>
2023-07-22 23:57:22 +08:00