diff --git a/conf/kyuubi-defaults.conf b/conf/kyuubi-defaults.conf
index 4c3f98106..5fbd9bbc0 100644
--- a/conf/kyuubi-defaults.conf
+++ b/conf/kyuubi-defaults.conf
@@ -19,7 +19,7 @@
 #
 # kyuubi.authentication NONE
 # kyuubi.frontend.bind.port 10009
-
+#
 ## Spark Configurations
 #
 # spark.master local
@@ -28,4 +28,4 @@
 ## Hadoop Configurations
 #
 # kyuubi.hadoop.authentication KERBEROS
-#
\ No newline at end of file
+#
diff --git a/docs/deployment/index.rst b/docs/deployment/index.rst
index 687a29e42..daec1fec8 100644
--- a/docs/deployment/index.rst
+++ b/docs/deployment/index.rst
@@ -6,7 +6,7 @@ Deploying Kyuubi
 .. toctree::
    :maxdepth: 2
-   :numbered: 3
+   :numbered: 4
 settings
 on_yarn
diff --git a/docs/deployment/on_yarn.md b/docs/deployment/on_yarn.md
index b93e6956b..323a222b3 100644
--- a/docs/deployment/on_yarn.md
+++ b/docs/deployment/on_yarn.md
@@ -65,9 +65,13 @@ the QUEUE configured at Kyuubi server side will be used as default.
 #### Sizing
+Pass the configurations below through the JDBC connection string to set how many Spark executor instances will be used,
+and how many CPUs and how much memory the Spark driver, the ApplicationMaster, and each executor will take.
 - | Default | Meaning
 --- | --- | ---
+spark.executor.instances | 1 | The number of executors for static allocation
+spark.executor.cores | 1 | The number of cores to use on each executor
 spark.yarn.am.memory | 512m | Amount of memory to use for the YARN Application Master in client mode
 spark.yarn.am.memoryOverhead | amMemory * 0.10, with minimum of 384 | Amount of non-heap memory to be allocated per am process in client mode
 spark.driver.memory | 1g | Amount of memory to use for the driver process
@@ -75,15 +79,26 @@ spark.driver.memoryOverhead | driverMemory * 0.10, with minimum of 384 | Amount
 spark.executor.memory | 1g | Amount of memory to use for the executor process
 spark.executor.memoryOverhead | executorMemory * 0.10, with minimum of 384 | Amount of additional memory to be allocated per executor process.
This is memory that accounts for things like VM overheads, interned strings, other native overheads, etc.
+It is recommended to use [Dynamic Allocation](http://spark.apache.org/docs/3.0.1/configuration.html#dynamic-allocation) with Kyuubi,
+since the SQL engine is long-running, executes users' queries from clients aperiodically,
+and the demand for computing resources is not the same across those queries.
+It is better for Spark to release some executors when a query is lightweight or the SQL engine is idle.
+
+
+#### Tuning
+
+You can specify `spark.yarn.archive` or `spark.yarn.jars` to point to a world-readable location that contains Spark jars on HDFS,
+which allows YARN to cache it on nodes so that it doesn't need to be distributed each time an application runs.
-
-####
-
-
 #### Others
-Acceptable [Spark properties](http://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties)
-
+Please refer to [Spark properties](http://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties) to check other acceptable configurations.
+## Kerberos
+Kyuubi currently does not support Spark's [YARN-specific Kerberos Configuration](http://spark.apache.org/docs/3.0.1/running-on-yarn.html#kerberos),
+so `spark.kerberos.keytab` and `spark.kerberos.principal` should not be used for now.
+Instead, you can schedule a periodic `kinit` process via a `crontab` task on the local machine that hosts the Kyuubi server, or simply use [Kyuubi Kinit](settings.html#kinit).
+
\ No newline at end of file
diff --git a/docs/deployment/settings.md b/docs/deployment/settings.md
index 27ca0c0b3..f28eeb1be 100644
--- a/docs/deployment/settings.md
+++ b/docs/deployment/settings.md
@@ -101,11 +101,9 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
Key | Default | Meaning | Since
--- | --- | --- | ---
kyuubi\.authentication|
NONE|Client authentication types.|1.0.0
-kyuubi\.authentication\.keytab|<undefined>|Location of Kyuubi server's keytab.|1.0.0
kyuubi\.authentication\.ldap\.base\.dn|<undefined>|LDAP base DN.|1.0.0
kyuubi\.authentication\.ldap\.domain|<undefined>|LDAP domain.|1.0.0
kyuubi\.authentication\.ldap\.url|<undefined>|SPACE character separated LDAP connection URL(s).|1.0.0
-kyuubi\.authentication\.principal|<undefined>|Name of the Kerberos principal.|1.0.0
kyuubi\.authentication\.sasl\.qop|auth|Sasl QOP enables higher levels of protection for Kyuubi communication with clients.|1.0.0
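Beyond `kyuubi-defaults.conf`, per-connection settings such as the Spark sizing configurations listed earlier can travel in the JDBC connection string itself. The fragment below is an illustrative sketch only: the host name and values are hypothetical, and the exact variable-list syntax should be verified against the Kyuubi JDBC documentation before use.

```
jdbc:hive2://kyuubi.example.com:10009/default;#spark.executor.instances=4;spark.executor.cores=2;spark.executor.memory=4g
```

A client such as Hive Beeline can pass this URL unchanged with `-u`, so each user session can size its own Spark engine without touching server-side defaults.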
### Delegation
@@ -149,7 +147,9 @@ kyuubi\.ha\.zookeeper
\.session\.timeout|PT1H|How often will Kyuubi server run `kinit -kt [keytab] [principal]` to renew the local Kerberos credentials cache|1.0.0
+kyuubi\.kinit\.keytab|<undefined>|Location of Kyuubi server's keytab.|1.0.0
kyuubi\.kinit\.max\.attempts|10|How many times will `kinit` process retry|1.0.0
+kyuubi\.kinit\.principal|<undefined>|Name of the Kerberos principal.|1.0.0
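Taken together, the two `kinit` entries added above let the server renew its own Kerberos credentials. A sketch of how they might look in `$KYUUBI_HOME/conf/kyuubi-defaults.conf`, using the space-separated `key value` style of that file (principal and keytab path below are placeholders, not values from this patch):

```
# Placeholders: substitute your realm's principal and the real keytab location
kyuubi.kinit.principal    kyuubi/_HOST@EXAMPLE.COM
kyuubi.kinit.keytab       /etc/security/keytabs/kyuubi.service.keytab
kyuubi.kinit.max.attempts 10
```

With these set, the built-in Kyuubi Kinit loop replaces the `crontab`-scheduled `kinit` workaround described in the YARN deployment doc.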
### Operation
diff --git a/docs/index.rst b/docs/index.rst
index b12371886..9dc542bf4 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -10,7 +10,6 @@ Welcome to Kyuubi's documentation!
 ==================================
-
 .. toctree::
    :maxdepth: 2
    :glob:
diff --git a/docs/overview/index.rst b/docs/overview/index.rst
index d8a728ad5..144090eb7 100644
--- a/docs/overview/index.rst
+++ b/docs/overview/index.rst
@@ -1,12 +1,12 @@
-.. image:: ../imgs/kyuubi_logo.png
+.. image:: ../imgs/kyuubi.png
   :align: center
 Overview
-===========
+========
+
 .. toctree::
    :maxdepth: 2
-   :numbered: 2
 summary
 kyuubi_vs_hive
diff --git a/docs/overview/summary.md b/docs/overview/summary.md
index 9c2249722..b067cddd1 100644
--- a/docs/overview/summary.md
+++ b/docs/overview/summary.md
@@ -1,3 +1,65 @@
-# What is Kyuubi
+# Kyuubi™
-Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing. Currently, kyuubi use Apache Spark as SQL engine.
\ No newline at end of file
+Kyuubi™ is a unified multi-tenant JDBC interface for large-scale data processing, built on top of [Apache Spark™](http://spark.apache.org/).
+
+![](../imgs/kyuubi_layers.png)
+
+In general, the complete ecosystem of Kyuubi falls into the hierarchies shown in the above figure, with each layer loosely coupled to the others.
+
+For example, you can use Kyuubi, Spark and [Apache Iceberg](https://iceberg.apache.org/) to build and manage a Data Lake with pure SQL.
+
+Kyuubi provides the following features:
+
+## Multi-tenancy
+
+Kyuubi supports end-to-end multi-tenancy,
+and this is why we created this project even though the Spark [Thrift JDBC/ODBC server](http://spark.apache.org/docs/latest/sql-distributed-sql-engine.html#running-the-thrift-jdbcodbc-server) already exists.
+
+1. Supports multi-client concurrency and authentication
+2. Supports one Spark application per account (SPA)
+3. Supports QUEUE/NAMESPACE Access Control Lists (ACL)
+4.
Supports metadata & data Access Control Lists
+
+Users who have valid accounts can use all kinds of client tools, e.g.
+Hive Beeline, [HUE](https://gethue.com/), [DBeaver](https://dbeaver.io/),
+[SQuirreL SQL Client](http://squirrel-sql.sourceforge.net/), etc.,
+to operate with the Kyuubi server concurrently.
+
+The SPA policy makes sure 1) a user account can only get computing resources with managed ACLs, e.g.
+[Queue Access Control Lists](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html#Queue_Access_Control_Lists),
+from cluster managers, e.g.
+[Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html),
+[Kubernetes (K8s)](https://kubernetes.io/), to create the Spark application;
+2) a user account can only access data and metadata from a storage system, e.g.
+[Apache Hadoop HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html),
+with permissions.
+
+## Ease of Use
+
+You only need to be familiar with Structured Query Language (SQL) and Java Database Connectivity (JDBC) to handle massive data.
+It helps you focus on the design and implementation of your business system.
+
+- SQL is the standard language for accessing relational databases, and it is very popular in the big data ecosystem too.
+  It turns out that everybody knows SQL.
+- JDBC provides a standard API for tool/database developers and makes it possible to write database applications using a pure Java API.
+- There are plenty of free or commercial JDBC tools out there.
+
+## Run Anywhere
+
+Kyuubi can submit Spark applications to all supported cluster managers, including YARN, Mesos, Kubernetes, Standalone, and local.
+
+The SPA policy also makes it possible for you to launch different applications against different cluster managers.
+
+## High Performance
+
+Kyuubi is built on Apache Spark, a lightning-fast unified analytics engine.
+
+ - **Concurrent execution**: multiple Spark applications work together
+ - **Quick response**: long-running Spark applications serve queries without startup overhead
+ - **Optimal execution plan**: fully supports the Spark SQL Catalyst Optimizer
+
+## Authentication & Authorization
+
+## High Availability
\ No newline at end of file
diff --git a/docs/quick_start/quick_start_with_hue.md b/docs/quick_start/quick_start_with_hue.md
new file mode 100644
index 000000000..0e82ebe89
--- /dev/null
+++ b/docs/quick_start/quick_start_with_hue.md
@@ -0,0 +1,14 @@
+
+
+![](../imgs/kyuubi_logo_simple.png)
+
+
+
+# Getting Started With Cloudera Hue
+
+
+docker run -it -p 8888:8888 gethue/hue:latest
+
+http://localhost:8888/
+
+![](../imgs/hue_login.png)
\ No newline at end of file
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
index f73945b44..a194a27d0 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
@@ -165,13 +165,13 @@ object KyuubiConf {
     .stringConf
     .createWithDefault("embedded_zookeeper")
-  val SERVER_PRINCIPAL: OptionalConfigEntry[String] = buildConf("authentication.principal")
+  val SERVER_PRINCIPAL: OptionalConfigEntry[String] = buildConf("kinit.principal")
     .doc("Name of the Kerberos principal.")
     .version("1.0.0")
     .stringConf
     .createOptional
-  val SERVER_KEYTAB: OptionalConfigEntry[String] = buildConf("authentication.keytab")
+  val SERVER_KEYTAB: OptionalConfigEntry[String] = buildConf("kinit.keytab")
     .doc("Location of Kyuubi server's keytab.")
     .version("1.0.0")
     .stringConf
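The Scala hunk above only renames the key strings at the definition site. A minimal, self-contained sketch of that pattern (the `MiniConfigEntry`/`MiniKyuubiConf` names are hypothetical stand-ins, not the real classes in `org.apache.kyuubi.config`) shows why the rename is a one-place change for call sites, which keep referring to `SERVER_PRINCIPAL`/`SERVER_KEYTAB`:

```scala
// Hypothetical mini version of Kyuubi's config-entry machinery.
final case class MiniConfigEntry(key: String, doc: String, version: String) {
  // Mirrors `createOptional`: an absent key resolves to None, not a default.
  def readFrom(settings: Map[String, String]): Option[String] = settings.get(key)
}

object MiniKyuubiConf {
  private val KYUUBI_PREFIX = "kyuubi."

  // Like `buildConf`, prepends the "kyuubi." namespace to every key.
  private def buildConf(key: String, doc: String, version: String): MiniConfigEntry =
    MiniConfigEntry(KYUUBI_PREFIX + key, doc, version)

  // After the rename only these key strings change, so the user-facing
  // config moves from kyuubi.authentication.* to kyuubi.kinit.* in one place.
  val SERVER_PRINCIPAL: MiniConfigEntry =
    buildConf("kinit.principal", "Name of the Kerberos principal.", "1.0.0")

  val SERVER_KEYTAB: MiniConfigEntry =
    buildConf("kinit.keytab", "Location of Kyuubi server's keytab.", "1.0.0")
}
```

Because consumers read through the entry object rather than a literal string, the accompanying docs-table edits (removing the `authentication.*` rows, adding the `kinit.*` rows) are the only other places the old names appear.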