Commit Graph

  • 61c90e3a07 [CELEBORN-1818] Fix incorrect timeout exception when waiting on no pending writes xinyuwang1 2025-01-07 13:50:39 +0800
  • ca60613f2f [CELEBORN-1817] add committed file size metrics Nan 2025-01-07 10:17:45 +0800
  • 6853b23b49 [CELEBORN-1819][CIP-14] Refactor cppClient with nested namespace HolyLow 2025-01-04 15:56:45 +0800
  • f886751e80 [CELEBORN-1812] Distinguish sorting-file from sort-tasks waiting to be submitted wuziyi 2025-01-04 10:27:53 +0800
  • 4ccb0c7fce [MINOR] Rename org.apache.celeborn.plugin.flink.readclient to org.apache.celeborn.plugin.flink.client SteNicholas 2025-01-03 20:53:54 +0800
  • 8b096ea879 [CELEBORN-1814][CIP-14] Add transportMessage to cppClient HolyLow 2025-01-03 16:18:50 +0800
  • 0c3ceeb0a5 [CELEBORN-1791] All NettyMemoryMetrics should register to source Xianming Lei 2025-01-03 11:28:57 +0800
  • 5d2831bbad [CELEBORN-1816] Bump scala-maven-plugin to avoid compilation loop mingji 2025-01-03 09:27:47 +0800
  • b7e3eaa46d [CELEBORN-1477][FOLLOWUP] Minor fix for v1 RESTful apis before release Wang, Fei 2025-01-02 23:00:15 +0800
  • 16762c659c [CELEBORN-1774][FOLLOWUP] Change celeborn.<module>.io.mode optional to explain default behavior in description SteNicholas 2025-01-02 21:15:19 +0800
  • a318eb43ab [CELEBORN-1815] Support UnpooledByteBufAllocator Cheng Pan 2025-01-02 20:54:34 +0800
  • d6496ae183 [CELEBORN-1801][FOLLOWUP] Extract RemoteShuffleEnvironment, NettyShuffleEnvironmentWrapper, SimpleResultPartitionAdapter to flink common module SteNicholas 2024-12-31 17:39:51 +0800
  • 4ec02286e8 [CELEBORN-1811] Update default value for celeborn.master.slot.assign.extraSlots mingji 2024-12-31 15:37:28 +0800
  • 56019c714a [CELEBORN-1804] Shuffle environment metrics of RemoteShuffleEnvironment should use Shuffle.Remote metric group SteNicholas 2024-12-31 14:39:43 +0800
  • a57238024e [CELEBORN-1801] Remove out-of-dated flink 1.14 and 1.15 codenohup 2024-12-30 15:33:44 +0800
  • d0d8edfe22 [CELEBORN-1737] Support build tez client package hongguangwei 2024-12-30 11:01:19 +0800
  • 4714e91420 [CELEBORN-1809][CIP-14] Add partitionLocation to cppClient HolyLow 2024-12-27 20:38:43 +0800
  • eb59c17638 [CELEBORN-1806] Bump Spark from 3.5.3 to 3.5.4 SteNicholas 2024-12-27 16:29:35 +0800
  • 1b3bd6eb38 [CELEBORN-1802] Fail the celeborn master/worker start if CELEBORN_CONF_DIR is not directory Wang, Fei 2024-12-26 20:22:34 +0800
  • 52fa151aa4 [CELEBORN-1701][FOLLOWUP] Support stage rerun for shuffle data lost mingji 2024-12-26 17:58:41 +0800
  • 7f030d424d [CELEBORN-1799][CIP-14] Add celebornConf to cppClient HolyLow 2024-12-25 20:07:35 +0800
  • fde6365f68 [CELEBORN-1413] Support Spark 4.0 mingji 2024-12-24 18:12:27 +0800
  • 4b60dae0f0 [CELEBORN-1789][DOC] Document on Java Columnar Shuffle xiyu.zk 2024-12-24 11:40:18 +0800
  • 6028a049df [CELEBORN-1700][FOLLOWUP] Fix flaky test RemoteShuffleMasterSuiteJ - testRegisterPartitionWithProducer Wang, Fei 2024-12-24 11:33:17 +0800
  • 27e34ecad0 [CELEBORN-1797] Support to adjust the logger level with RESTful API during runtime Wang, Fei 2024-12-24 11:24:30 +0800
  • 03656b5b1c [CELEBORN-1634][FOLLOWUP] Add rpc metrics into grafana dashboard Wang, Fei 2024-12-24 11:13:49 +0800
  • 2eb4c23eb8 [CELEBORN-1771] Bring forward PUSH_DATA_CREATE_CONNECTION_FAIL_REPLICA zhengtao 2024-12-24 11:09:19 +0800
  • 680b072b5b [CELEBORN-1753] Optimize the code for exists and find method Wang, Fei 2024-12-23 17:56:20 +0800
  • 406ceb64c1 [CELEBORN-1794] Fix TestCongestionController occasional failing zhengtao 2024-12-23 17:47:28 +0800
  • e496a3cfae [CELEBORN-1785][CIP-14] Add baseConf to cppClient HolyLow 2024-12-23 16:45:02 +0800
  • 9e04ff4a9f [CELEBORN-1763] Fix DataPusher be blocked for a long time zhangzhao.08 2024-12-22 23:08:36 -0800
  • eaa0726c5c [CELEBORN-1788] Add role and roleBinding helm charts zhaohehuhu 2024-12-23 11:42:16 +0800
  • 80523214e4 [MINOR] Add documentation for CELEBORN_NO_DAEMONIZE Sanskar Modi 2024-12-23 10:31:37 +0800
  • a357df94f1 [CELEBORN-1700][FOLLOWUP] Support ShuffleFallbackCount metric for fallback to vanilla Flink built-in shuffle implementation SteNicholas 2024-12-20 14:12:24 +0800
  • 6b884dee66 [CELEBORN-1777] Add java.security.jgss/sun.security.krb5 to DEFAULT_MODULE_OPTIONS Fei Wang 2024-12-19 14:54:21 +0800
  • 67971df68f [CELEBORN-1783] Fix Pending task in commitThreadPool wont be canceled zhengtao 2024-12-19 14:24:36 +0800
  • cec88b2def [CELEBORN-1782] Worker in congestion control should be in blacklist to avoid impact new shuffle Xianming Lei 2024-12-19 10:12:09 +0800
  • 6cce51e597 [CELEBORN-1786] add serviceAccount helm chart zhaohehuhu 2024-12-18 20:06:37 +0800
  • e75d84fc19 [CELEBORN-1772][CIP-14] Add memory module to cppClient HolyLow 2024-12-17 17:52:38 +0800
  • f3dac7e879 [CELEBORN-1712] Bump Netty version from 4.1.109.Final to 4.1.115.Final SteNicholas 2024-12-17 17:29:07 +0800
  • 0eb8af98de [CELEBORN-1774] Update default value of celeborn.<module>.io.mode to whether epoll mode is available SteNicholas 2024-12-17 15:26:01 +0800
  • b24f867784 [CELEBORN-1510] Partial task unable to switch to the replica zhangzhao.08 2024-12-17 15:13:14 +0800
  • 2efdf755cc [CELEBORN-1711][TEST] Fix flaky test caused by master/worker setup issue Wang, Fei 2024-12-17 10:45:40 +0800
  • 17df678c77 [CELEBORN-1780] Add support for NodePort Service per Master replica ShlomiTubul 2024-12-16 16:54:47 +0800
  • 33ba0e02f5 [CELEBORN-1775] Improve some logs onebox-li 2024-12-16 16:24:18 +0800
  • c40f69b941 [CELEBORN-1766] Add detail metrics about fetch chunk mingji 2024-12-16 16:17:14 +0800
  • d85fb7826f [CELEBORN-1711][TEST] RetryReviveTest - shutdownMiniCluster after each test Wang, Fei 2024-12-16 15:43:41 +0800
  • 4aabe37765 [CELEBORN-1778] Fix commitInfo NPE and add assert in LifecycleManagerCommitFilesSuite zhengtao 2024-12-16 15:41:37 +0800
  • 74c1ec0a7f [CELEBORN-1670] Avoid swallowing InterruptedException in ShuffleClientImpl jiang13021 2024-12-16 11:26:17 +0800
  • ca8831e55f [CELEBORN-1736] Add tez integration tests hongguangwei 2024-12-13 14:06:08 +0800
  • c316fdbdfb Revert "[CELEBORN-1376] Push data failed should always release request body" zhengtao 2024-12-13 11:20:11 +0800
  • f7b036d4c7 [CELEBORN-1770] FlushNotifier should setException for all Throwables in Flusher zhengtao 2024-12-12 14:23:04 +0800
  • 069e5b6c18 [CELEBORN-1769] Fix packed partition location cause GetReducerFileGroupResponse lose location mingji 2024-12-10 18:03:00 +0800
  • 11cbacb049 [CELEBORN-1767] Fix occasional errors in UT when creating workers zhengtao 2024-12-10 17:51:08 +0800
  • 80ebb19836 [CELEBORN-1761][CIP-14] Add cppProto to cppClient HolyLow 2024-12-10 17:04:46 +0800
  • 91d8f955ca [CELEBORN-1622][CIP-11] Adding documentation for Worker Tags feature Sanskar Modi 2024-12-10 15:56:58 +0800
  • 22ee8bfed5 [CELEBORN-1765] Fix NPE when removeFileInfo in StorageManager zhengtao 2024-12-10 14:08:54 +0800
  • 372ef79a08 [CELEBORN-1760] OOM causes disk buffer unable to be released Xianming Lei 2024-12-10 13:47:57 +0800
  • 6cffc915c1 [CELEBORN-1731] Support merged kv input for Tez hongguangwei 2024-12-06 15:37:04 +0800
  • cfd20fa0d0 [CELEBORN-1732] Support unordered kv input for Tez hongguangwei 2024-12-06 15:35:06 +0800
  • 8948df17f9 [CELEBORN-1733] Support ordered grouped kv input for Tez hongguangwei 2024-12-06 10:54:44 +0800
  • e41ee2dc9b [CELEBORN-1721][CIP-12] Support HARD_SPLIT in PushMergedData jiang13021 2024-12-06 09:20:36 +0800
  • 6c4f6c7b6c [CELEBORN-1758] Remove the empty user resource consumption from worker heartbeat Wang, Fei 2024-12-06 08:20:54 +0800
  • c8def22c2a [CELEBORN-1729] Support ordered KV output for Tez hongguangwei 2024-12-05 20:17:21 +0800
  • b4c7dacb0c [CELEBORN-1730] Support unordered KV output for Tez hongguangwei 2024-12-05 20:14:36 +0800
  • b2b9a0ab4b [CELEBORN-1754][CIP-14] Add exceptions and checking utils to cppClient HolyLow 2024-12-04 14:05:38 +0800
  • cc04d1315e [CELEBORN-1743] Resolve the metrics data interruption and the job failure caused by locked resources zhengtao 2024-12-04 10:11:09 +0800
  • 782393af05 [CELEBORN-1748] Deprecate identity provider configs tied with quota Sanskar Modi 2024-12-04 09:28:40 +0800
  • 7102174eda [CELEBORN-1759] Fix reserve slots might lost partition location between 0.4 client and 0.5 server onebox-li 2024-12-03 16:57:53 +0800
  • 3dd810cd9b [CELEBORN-1612] Add a basic reader writer class to Tez hongguangwei 2024-12-03 14:51:52 +0800
  • c893287bea [CELEBORN-1756] Only gauge hdfs metrics if HDFS storage enabled to reduce metrics Wang, Fei 2024-12-02 14:11:56 +0800
  • 878a83cfa7 [CELEBORN-1750] Return struct worker resource consumption information with RESTful api Wang, Fei 2024-12-01 19:58:01 -0800
  • b204a26010 [CELEBORN-1755] Update doc to include S3 as one of storage layers zhaohehuhu 2024-12-02 11:00:18 +0800
  • c84733fcf8 [CELEBORN-1725][FOLLOWUP] Optimize isAllMapTasksEnd performance Wang, Fei 2024-11-29 17:58:08 +0800
  • 3bf91929b6 [CELEBORN-1746] Reduce the size of aws dependencies zhaohehuhu 2024-11-28 19:45:01 +0800
  • 34d70ca7a4 [CELEBORN-1530][FOLLOWUP] Exclude web modules by default mingji 2024-11-28 16:28:08 +0800
  • aea680d1cc [CELEBORN-1752] Migration guide for unexpected shuffle RESTful api change since 0.5.0 Wang, Fei 2024-11-28 16:16:47 +0800
  • 259dfcd988 [CELEBORN-1621][FOLLOWUP] Support enabling worker tags via config Sanskar Modi 2024-11-28 11:22:35 +0800
  • 6a0f763e23 [CELEBORN-1751][CIP-14] Add celebornException utils to cppClient HolyLow 2024-11-28 11:10:58 +0800
  • 59163c2a23 [CELEBORN-1745] Remove application top disk usage code Wang, Fei 2024-11-28 10:55:34 +0800
  • ed52e8d3b6 [CELEBORN-1725] Optimize performance of handling MapperEnd RPC in LifecycleManager Fu Chen 2024-11-27 10:43:36 -0800
  • 9cd6d96167 [CELEBORN-1700] Flink supports fallback to vanilla Flink built-in shuffle implementation SteNicholas 2024-11-27 21:44:07 +0800
  • 1aefd8f42e [CELEBORN-1740][CIP-14] Add stackTrace utils to cppClient HolyLow 2024-11-27 14:21:51 +0800
  • e642197b56 [CELEBORN-1749] Fix incorrect application diskBytesWritten metrics Wang, Fei 2024-11-27 07:24:47 +0800
  • 712d9a496e [CELEBORN-1621][CIP-11] Predefined worker tags expr via dynamic configs Sanskar Modi 2024-11-26 20:40:30 +0800
  • 9255e4ff87 [CELEBORN-1747] Fix flaky test - HybridShuffleWordCountTest Wang, Fei 2024-11-26 14:32:11 +0800
  • 43e1b8a246 [MINOR] Update DingTalk group link mingji 2024-11-26 14:27:41 +0800
  • 77c7a8b91d [CELEBORN-1741][CIP-14] Add processBase utils to cppClient HolyLow 2024-11-26 13:38:16 +0800
  • 1b193aa196 [CELEBORN-1713] RpcTimeoutException should include RPC address in message SteNicholas 2024-11-26 11:06:54 +0800
  • 5fab0b5ae0 [CELEBORN-1543][FOLLOWUP] License check adds flink-1.20 profile SteNicholas 2024-11-25 17:24:35 +0800
  • be4f1ac309 [CELEBORN-1634][FOLLOWUP] Simplify the logic of the RpcSource.addTimer and RpcSource.updateTimer Fu Chen 2024-11-23 12:01:36 +0800
  • 71bd45577a [CELEBORN-1724][CIP-14] Add environment setup tools for CppClient development HolyLow 2024-11-22 19:58:45 +0800
  • 3590fa778e [CELEBORN-1545] Add Tez plugin skeleton and dag app master mingji 2024-11-22 18:38:25 +0800
  • 05ccd96905 [CELEBORN-1727][FOLLOWUP] Fix CelebornHashCheckDiskSuite flaky test onebox-li 2024-11-22 17:13:54 +0800
  • d722b7621d [CELEBORN-1726][FOLLOWUP] Avoid NPE when transition worker state Weijie Guo 2024-11-22 15:16:53 +0800
  • a2d3972318 [CELEBORN-1530] support MPU for S3 zhaohehuhu 2024-11-22 15:03:53 +0800
  • 317cb973dc [CELEBORN-1190][FOLLOWUP] Fix WARNING of error prone SteNicholas 2024-11-22 14:36:39 +0800
  • 99a34f36b5 [CELEBORN-1618][CIP-11] Supporting tags via DB Config Service Sanskar Modi 2024-11-22 14:29:30 +0800
  • 094fe2813d [CELEBORN-1728] Fix NPE when failing to connect to celeborn worker Wang, Fei 2024-11-21 16:23:26 +0800
  • 351173bacd [CELEBORN-1727] Correct the calculation of worker diskInfo actualUsableSpace onebox-li 2024-11-21 16:17:46 +0800