--- license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- # Configuration The configuration of Celeborn is divided into static and dynamic categories, with details provided in the [Configuration Guide](../configuration/index.md). ## Static Configuration Static configuration, referred to as `CelebornConf`, loads configurations from the default file located at `$CELEBORN_HOME/conf/celeborn-defaults.conf`. ## Dynamic Configuration Dynamic configuration allows for changes to be applied at runtime, as necessary, and it takes precedence over the corresponding static configuration in the Celeborn `Master` and `Worker`. A configuration key's dynamic nature is indicated by the `isDynamic` property, as listed in [All Configurations](../configuration/index.md#all-configurations). This means that configurations tagged with the dynamic property can be updated and refreshed while Celeborn is running. ### Config Level At present dynamic configuration supports various config levels including: - `SYSTEM`: The system configurations. - `TENANT`: The dynamic configurations of tenant id. - `TENANT_USER`: The dynamic configurations of tenant id and username. When applying dynamic configuration, the following is the order of precedence for configuration levels: - `SYSTEM` level configuration takes precedence over static configuration and the default `CelebornConf`. If the system-level configuration is absent, it will fall back to the static configuration defined in `CelebornConf`. - `TENANT` level configuration supersedes the `SYSTEM` level, meaning that configurations specific to a tenant id will override those set at the system level. If tenant-level configuration is absent, it will fall back to the system-level dynamic configuration. - `TENANT_USER` level configuration takes precedence over `TENANT` level. Configurations specific to both a tenant id and username will override those set at the tenant level. If tenant-user-level configuration is missing, it will fall back to the tenant-level dynamic configuration. ## Config Service The config service provides a configuration management service with a local cache for both static and dynamic configurations. Moreover, `ConfigService` is a pluggable service interface whose implementation can vary based on different storage backends. The storage backend for `ConfigService` is specified by the configuration key `celeborn.dynamicConfig.store.backend`, and it currently supports filesystem (`FS`) and database (`DB`) as storage backends by default. Additionally, users can provide their own implementation by extending the `ConfigService` interface and using the fully qualified class name of the implementation as storage backend. If no storage backend is specified, this indicates that the config service is disabled. ### FileSystem Config Service The filesystem config service enables the use of dynamic configuration files, the location of which is set by the configuration key `celeborn.dynamicConfig.store.fs.path`. The template for the dynamic configuration is as follows: ```yaml # SYSTEM level configuration - level: SYSTEM config: [config_key]: [config_val] ... # TENANT level configuration - tenantId: [tenant_id] level: TENANT config: [config_key]: [config_val] ... users: # TENANT_USER level configuration - name: [name] config: [config_key]: [config_val] ... ``` For example, a Celeborn worker `celeborn-worker` has 10 storage directories or disks and the buffer size is set to 256 KiB. A tenant `tenantId1` only uses half of the storage and sets the buffer size to 128 KiB. Meanwhile, a user `user1` needs to change the buffer size to 96 KiB at runtime. The example configurations are as follows: ```yaml # SYSTEM level configuration - level: SYSTEM config: celeborn.worker.flusher.buffer.size: 256K # sets buffer size of worker to 256 KiB # TENANT level configuration - tenantId: tenantId1 level: TENANT config: celeborn.worker.flusher.buffer.size: 128K # sets buffer size of tenantId1 to 128 KiB users: # TENANT_USER level configuration - name: user1 config: celeborn.worker.flusher.buffer.size: 96K # sets buffer size of tenantId1 and user1 to 128 KiB ``` ### Database Config Service The database config service updates dynamic configurations stored in the database using the JDBC approach. Configuration settings for the database storage backend are defined by the `celeborn.dynamicConfig.store.db.*` series of configuration keys. To use the database as a config store backend, it is necessary to create tables for dynamic configurations at the various configuration levels. The sql script for MySQL configuration tables is located under `$CELEBORN_HOME/db-scripts` directory. After the creation of configuration tables, dynamic configuration of config levels is specified via inserting a configuration record in corresponding config level table. Above example dynamic configurations can be supported via the following sql: ```sql CREATE TABLE IF NOT EXISTS celeborn_cluster_info ( id int NOT NULL AUTO_INCREMENT, name varchar(255) NOT NULL COMMENT 'celeborn cluster name', namespace varchar(255) DEFAULT NULL COMMENT 'celeborn cluster namespace', endpoint varchar(255) DEFAULT NULL COMMENT 'celeborn cluster endpoint', gmt_create timestamp NOT NULL, gmt_modify timestamp NOT NULL, PRIMARY KEY (id), UNIQUE KEY `index_cluster_unique_name` (`name`) ); # SYSTEM level configuration CREATE TABLE IF NOT EXISTS celeborn_cluster_system_config ( id int NOT NULL AUTO_INCREMENT, cluster_id int NOT NULL, config_key varchar(255) NOT NULL, config_value varchar(255) NOT NULL, type varchar(255) DEFAULT NULL COMMENT 'conf categories, such as quota', gmt_create timestamp NOT NULL, gmt_modify timestamp NOT NULL, PRIMARY KEY (id), UNIQUE KEY `index_unique_system_config_key` (`cluster_id`, `config_key`) ); # TENANT/TENANT_USER level configuration CREATE TABLE IF NOT EXISTS celeborn_cluster_tenant_config ( id int NOT NULL AUTO_INCREMENT, cluster_id int NOT NULL, tenant_id varchar(255) NOT NULL, level varchar(255) NOT NULL COMMENT 'config level, valid level is TENANT,USER', name varchar(255) DEFAULT NULL COMMENT 'tenant sub user', config_key varchar(255) NOT NULL, config_value varchar(255) NOT NULL, type varchar(255) DEFAULT NULL COMMENT 'conf categories, such as quota', gmt_create timestamp NOT NULL, gmt_modify timestamp NOT NULL, PRIMARY KEY (id), UNIQUE KEY `index_unique_tenant_config_key` (`cluster_id`, `tenant_id`, `name`, `config_key`) ); INSERT INTO celeborn_cluster_info ( `id`, `name`, `namespace`, `endpoint`, `gmt_create`, `gmt_modify` ) VALUES ( 1, 'default', 'celeborn-worker', 'celeborn-namespace.endpoint.com', '2024-02-27 22:08:30', '2024-02-27 22:08:30' ); # SYSTEM level configuration # sets buffer size of celeborn-worker to 256 KiB INSERT INTO `celeborn_cluster_system_config` ( `id`, `cluster_id`, `config_key`, `config_value`, `type`, `gmt_create`, `gmt_modify` ) VALUES ( 1, 1, 'celeborn.worker.flusher.buffer.size', '256K', 'QUOTA', '2024-02-27 22:08:30', '2024-02-27 22:08:30' ); # TENANT/TENANT_USER level configuration # TENANT: sets buffer size of tenantId1 to 128 KiB # TENANT_USER: sets buffer size of tenantId1 and user1 to 96 KiB INSERT INTO `celeborn_cluster_tenant_config` ( `id`, `cluster_id`, `tenant_id`, `level`, `name`, `config_key`, `config_value`, `type`, `gmt_create`, `gmt_modify` ) VALUES ( 1, 1, 'tenantId1', 'TENANT', '', 'celeborn.worker.flusher.buffer.size', '128K', 'worker', '2024-02-27 22:08:30', '2024-02-27 22:08:30' ), ( 2, 1, 'tenantId1', 'TENANT_USER', 'user1', 'celeborn.worker.flusher.buffer.size', '96K', 'worker', '2024-02-27 22:08:30', '2024-02-27 22:08:30' ); ``` ## Rest API In addition to viewing the configurations, Celeborn support REST API available for both master and worker including: - `/conf`: List the conf setting of master and worker. - `/listDynamicConfigs`: List the dynamic configs of master and worker. The API providers of listing configurations refer to [Available API providers](../monitoring.md#available-api-providers)