kyuubi

Cheng Pan fff1841054 [KYUUBI #6876] Support rolling `spark.kubernetes.file.upload.path`

### Why are the changes needed?

Vanilla Spark supports neither rolling nor an expiration mechanism for `spark.kubernetes.file.upload.path`. If you use a file system that does not support TTL, e.g. HDFS, additional cleanup mechanisms are needed to prevent the files in this directory from growing indefinitely.

This PR proposes to let `spark.kubernetes.file.upload.path` support the placeholders `{{YEAR}}`, `{{MONTH}}` and `{{DAY}}`, and introduces a switch `kyuubi.kubernetes.spark.autoCreateFileUploadPath.enabled` to let the Kyuubi server create the directory with 777 permission automatically before submitting the Spark application.

For example, the user can set the following configurations in `kyuubi-defaults.conf` to enable monthly rolling for `spark.kubernetes.file.upload.path`:

```
kyuubi.kubernetes.spark.autoCreateFileUploadPath.enabled=true
spark.kubernetes.file.upload.path=hdfs://hadoop-cluster/spark-upload-{{YEAR}}{{MONTH}}
```

Note that Spark creates a subdirectory `s"spark-upload-${UUID.randomUUID()}"` under `spark.kubernetes.file.upload.path` for each upload, so the administrator still needs to clean up the staging directory periodically. For example:

```
hdfs://hadoop-cluster/spark-upload-202412/spark-upload-f2b71340-dc1d-4940-89e2-c5fc31614eb4
hdfs://hadoop-cluster/spark-upload-202412/spark-upload-173a8653-4d3e-48c0-b8ab-b7f92ae582d6
hdfs://hadoop-cluster/spark-upload-202501/spark-upload-3b22710f-a4a0-40bb-a3a8-16e481038a63
```

The administrator can safely delete `hdfs://hadoop-cluster/spark-upload-202412` after 20250101.

### How was this patch tested?

New UTs are added.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6876 from pan3793/rolling-upload.

Closes #6876

6614bf29c [Cheng Pan] comment
5d5cb3eb3 [Cheng Pan] docs
343adaefb [Cheng Pan] review
3eade8bc4 [Cheng Pan] fix
706989778 [Cheng Pan] docs
38953dc3f [Cheng Pan] Support rolling spark.kubernetes.file.upload.path

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>

2025-01-15 01:27:12 +08:00
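
The placeholder substitution described in the commit message can be illustrated with a short Python sketch. This is not Kyuubi's actual implementation (the server is written in Scala); the function name `expand_upload_path` is hypothetical, and zero-padding of the month and day is an assumption inferred from the `202501` directory name in the example above.

```python
from datetime import date


def expand_upload_path(template: str, today: date) -> str:
    """Replace {{YEAR}}, {{MONTH}} and {{DAY}} with zero-padded date parts.

    Hypothetical helper for illustration only; the real logic lives in
    the Kyuubi server and may differ in detail.
    """
    replacements = {
        "{{YEAR}}": f"{today.year:04d}",
        "{{MONTH}}": f"{today.month:02d}",  # assumed zero-padded, e.g. 01
        "{{DAY}}": f"{today.day:02d}",      # assumed zero-padded, e.g. 05
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


template = "hdfs://hadoop-cluster/spark-upload-{{YEAR}}{{MONTH}}"

# Applications submitted in different months resolve to different
# directories, so an expired month can be deleted as a whole.
print(expand_upload_path(template, date(2024, 12, 31)))
print(expand_upload_path(template, date(2025, 1, 15)))
```

With a monthly template, every upload in December 2024 lands under one directory and every upload in January 2025 under another, which is what makes deleting the whole previous month's directory safe once the month has passed.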
_static/css	Revert "[KYUUBI #5908] [DOCS] Remove workaround for malformed table"	2023-12-24 01:53:05 +08:00
appendix
client	[KYUUBI #6734] [DOC] add authentication example in REST API docs	2024-10-16 13:12:00 +08:00
configuration	[KYUUBI #6876] Support rolling `spark.kubernetes.file.upload.path`	2025-01-15 01:27:12 +08:00
connector	[KYUUBI #6804] Bump Iceberg from 1.6.1 to 1.7.0	2024-11-14 18:25:09 +08:00
contributing	[KYUUBI #6545] Deprecate and remove building support for Spark 3.2	2024-07-22 11:59:34 +08:00
deployment	[KYUUBI #6876] Support rolling `spark.kubernetes.file.upload.path`	2025-01-15 01:27:12 +08:00
extensions	[KYUUBI #6842] Bump Spark 3.5.4	2024-12-23 11:21:45 +08:00
imgs	[KYUUBI #5914] Update layer diagram on welcome page	2023-12-25 16:13:48 +08:00
monitor	[KYUUBI #6861] Configuration guide of structured logging for Kyuubi server	2024-12-25 17:22:53 +08:00
overview	[KYUUBI #6596] Fix typos in architecture page	2024-08-08 12:12:30 +00:00
quick_start	[KYUUBI #6557] Support Flink 1.20	2024-08-05 22:57:39 +08:00
security	[KYUUBI #6728] [DOC] update Authz plugin docs of build command with `-am` option	2024-10-16 13:31:14 +08:00
tools	[KYUUBI #6242] Remove block cleaner docs	2024-04-03 13:39:34 +08:00
conf.py	Revert "[KYUUBI #5908] [DOCS] Remove workaround for malformed table"	2023-12-24 01:53:05 +08:00
index.rst	[KYUUBI #6068] Remove community section from user docs	2024-02-21 05:20:42 +00:00
make.bat
Makefile
requirements.txt	[KYUUBI #6752 ] [DOC] Bump doc build requirements	2024-10-18 10:39:02 +08:00