Add doc for overview

2020-11-13 14:17:42 +08:00 · 2020-11-13 14:17:42 +08:00 · 4396e59abe
commit 4396e59abe
parent fb4bace6a5
9 changed files with 108 additions and 18 deletions
--- a/conf/kyuubi-defaults.conf
+++ b/conf/kyuubi-defaults.conf
@ -19,7 +19,7 @@
 #
 # kyuubi.authentication           NONE
 # kyuubi.frontend.bind.port       10009
-
+#
 ## Spark Configurations
 #
 # spark.master                    local
@ -28,4 +28,4 @@
 ## Hadoop Configurations
 #
 # kyuubi.hadoop.authentication    KERBEROS
-#
+#
--- a/docs/deployment/index.rst
+++ b/docs/deployment/index.rst
@ -6,7 +6,7 @@ Deploying Kyuubi

 .. toctree::
    :maxdepth: 2
-    :numbered: 3
+    :numbered: 4

    settings
    on_yarn
--- a/docs/deployment/on_yarn.md
+++ b/docs/deployment/on_yarn.md
@ -65,9 +65,13 @@ the QUEUE configured at Kyuubi server side will be used as default.

 #### Sizing

+Pass the configurations below through the JDBC connection string to set how many Spark executor instances will be used
+and how many CPUs and how much memory the Spark driver, the ApplicationMaster, and each executor will take.

 - | Default | Meaning
 --- | --- | ---
+spark.executor.instances | 1 | The number of executors for static allocation
+spark.executor.cores | 1 | The number of cores to use on each executor
 spark.yarn.am.memory | 512m | Amount of memory to use for the YARN Application Master in client mode
 spark.yarn.am.memoryOverhead | amMemory * 0.10, with minimum of 384 | Amount of non-heap memory to be allocated per am process in client mode
 spark.driver.memory | 1g | Amount of memory to use for the driver process
@ -75,15 +79,26 @@ spark.driver.memoryOverhead | driverMemory * 0.10, with minimum of 384 | Amount
 spark.executor.memory | 1g | Amount of memory to use for the executor process
 spark.executor.memoryOverhead | executorMemory * 0.10, with minimum of 384 | Amount of additional memory to be allocated per executor process. This is memory that accounts for things like VM overheads, interned strings other native overheads, etc

+It is recommended to use [Dynamic Allocation](http://spark.apache.org/docs/3.0.1/configuration.html#dynamic-allocation) with Kyuubi,
+since the SQL engine is long-running, executes users' queries from clients aperiodically,
+and the demand for computing resources differs from query to query.
+It is better for Spark to release some executors when either the query is lightweight or the SQL engine is idle.
+
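
As an illustration of the Sizing section above, the sketch below assembles a JDBC connection string that carries a few of the listed Spark properties. The host, the chosen values, and the `;#key=value;...` placement of the properties in the URL are assumptions made for this example, not something this change specifies; port 10009 is the default `kyuubi.frontend.bind.port` shown in `kyuubi-defaults.conf`.

```shell
# Hypothetical server address; 10009 is the default kyuubi.frontend.bind.port.
KYUUBI_HOST="localhost"
KYUUBI_PORT="10009"

# Sizing properties from the table above, with illustrative values.
SIZING="spark.executor.instances=2;spark.executor.cores=2;spark.executor.memory=2g"

# Assumed URL layout: the properties are appended after ';#'.
JDBC_URL="jdbc:hive2://${KYUUBI_HOST}:${KYUUBI_PORT}/;#${SIZING}"

echo "${JDBC_URL}"
```

When passing the URL to a client such as Beeline (`beeline -u "${JDBC_URL}"`), keep it quoted, since `;` and `#` are special characters in most shells.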
+
+#### Tuning
+
+You can specify `spark.yarn.archive` or `spark.yarn.jars` to point to a world-readable location that contains Spark jars on HDFS,
+which allows YARN to cache them on nodes so that they don't need to be distributed each time an application runs.

-#### 
- 
- 
 #### Others

-Acceptable [Spark properties](http://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties)
-
+Please refer to [Spark properties](http://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties) to check other acceptable configs.


+## Kerberos

+Kyuubi currently does not support Spark's [YARN-specific Kerberos Configuration](http://spark.apache.org/docs/3.0.1/running-on-yarn.html#kerberos),
+so `spark.kerberos.keytab` and `spark.kerberos.principal` should not be used for now.

+Instead, you can schedule a periodic `kinit` process via a `crontab` task on the local machine that hosts the Kyuubi server, or simply use [Kyuubi Kinit](settings.html#kinit).
+ 
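
The `crontab` approach mentioned in the Kerberos section above could look roughly like the following sketch. The keytab path and principal are hypothetical placeholders, and the hourly schedule is only an assumption chosen to mirror the `PT1H` default of `kyuubi.kinit.interval`; the `kinit -kt [keytab] [principal]` form is the one quoted in the settings table.

```shell
# Hypothetical keytab location and principal; replace with your own.
KEYTAB="/etc/security/keytabs/kyuubi.keytab"
PRINCIPAL="kyuubi/host.example.com@EXAMPLE.COM"

# Cron entry that renews the local credentials cache at minute 0 of every hour.
CRON_ENTRY="0 * * * * kinit -kt ${KEYTAB} ${PRINCIPAL}"

echo "${CRON_ENTRY}"

# To install it for the user that runs the Kyuubi server (not executed here):
#   (crontab -l 2>/dev/null; echo "${CRON_ENTRY}") | crontab -
```

Whatever schedule is chosen, it should be shorter than the ticket lifetime so that the cache never holds an expired ticket.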
--- a/docs/deployment/settings.md
+++ b/docs/deployment/settings.md
@ -101,11 +101,9 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
 Key | Default | Meaning | Since
 --- | --- | --- | ---
 kyuubi\.authentication|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>NONE</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>Client authentication types.<ul> <li>NONE: no authentication check.</li> <li>KERBEROS: Kerberos/GSSAPI authentication.</li> <li>LDAP: Lightweight Directory Access Protocol authentication.</li></ul></div>|<div style='width: 20pt'>1.0.0</div>
-kyuubi\.authentication<br>\.keytab|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>Location of Kyuubi server's keytab.</div>|<div style='width: 20pt'>1.0.0</div>
 kyuubi\.authentication<br>\.ldap\.base\.dn|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>LDAP base DN.</div>|<div style='width: 20pt'>1.0.0</div>
 kyuubi\.authentication<br>\.ldap\.domain|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>LDAP base DN.</div>|<div style='width: 20pt'>1.0.0</div>
 kyuubi\.authentication<br>\.ldap\.url|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>SPACE character separated LDAP connection URL(s).</div>|<div style='width: 20pt'>1.0.0</div>
-kyuubi\.authentication<br>\.principal|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>Name of the Kerberos principal.</div>|<div style='width: 20pt'>1.0.0</div>
 kyuubi\.authentication<br>\.sasl\.qop|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>auth</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>Sasl QOP enable higher levels of protection for Kyuubi communication with clients.<ul> <li>auth - authentication only (default)</li> <li>auth-int - authentication plus integrity protection</li> <li>auth-conf - authentication plus integrity and confidentiality protection. This is applicable only if Kyuubi is configured to use Kerberos authentication.</li> </ul></div>|<div style='width: 20pt'>1.0.0</div>

 ### Delegation
@ -149,7 +147,9 @@ kyuubi\.ha\.zookeeper<br>\.session\.timeout|<div style='width: 80pt;word-wrap: b
 Key | Default | Meaning | Since
 --- | --- | --- | ---
 kyuubi\.kinit\.interval|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>PT1H</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>How often will Kyuubi server run `kinit -kt [keytab] [principal]` to renew the local Kerberos credentials cache</div>|<div style='width: 20pt'>1.0.0</div>
+kyuubi\.kinit\.keytab|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>Location of Kyuubi server's keytab.</div>|<div style='width: 20pt'>1.0.0</div>
 kyuubi\.kinit\.max<br>\.attempts|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>10</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>How many times will `kinit` process retry</div>|<div style='width: 20pt'>1.0.0</div>
+kyuubi\.kinit<br>\.principal|<div style='width: 80pt;word-wrap: break-word;white-space: normal'>&lt;undefined&gt;</div>|<div style='width: 200pt;word-wrap: break-word;white-space: normal'>Name of the Kerberos principal.</div>|<div style='width: 20pt'>1.0.0</div>

 ### Operation

--- a/docs/index.rst
+++ b/docs/index.rst
@ -10,7 +10,6 @@
 Welcome to Kyuubi's documentation!
 ==================================

-
 .. toctree::
   :maxdepth: 2
   :glob:
--- a/docs/overview/index.rst
+++ b/docs/overview/index.rst
@ -1,12 +1,12 @@
-.. image:: ../imgs/kyuubi_logo.png
+.. image:: ../imgs/kyuubi.png
   :align: center

 Overview
-===========
+========
+

 .. toctree::
    :maxdepth: 2
-    :numbered: 2

    summary
    kyuubi_vs_hive
--- a/docs/overview/summary.md
+++ b/docs/overview/summary.md
@ -1,3 +1,65 @@
-# What is Kyuubi
+# Kyuubi™

-Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing. Currently, kyuubi use Apache Spark as SQL engine.
+Kyuubi™ is a unified multi-tenant JDBC interface for large-scale data processing, built on top of [Apache Spark™](http://spark.apache.org/).
+
+![](../imgs/kyuubi_layers.png)
+
+In general, the complete ecosystem of Kyuubi falls into the layers shown in the figure above, with each layer loosely coupled to the others.
+
+For example,
+
+You can use Kyuubi, Spark, and [Apache Iceberg](https://iceberg.apache.org/) to build and manage a data lake with pure SQL.
+
+Kyuubi provides the following features:
+
+## Multi-tenancy
+
+Kyuubi supports end-to-end multi-tenancy,
+which is why we created this project even though the Spark [Thrift JDBC/ODBC server](http://spark.apache.org/docs/latest/sql-distributed-sql-engine.html#running-the-thrift-jdbcodbc-server) already exists.
+
+1. Supports multi-client concurrency and authentication
+2. Supports one Spark application per account (SPA)
+3. Supports QUEUE/NAMESPACE Access Control Lists (ACL)
+4. Supports metadata & data Access Control Lists
+
+Users who have valid accounts can use all kinds of client tools, e.g.
+Hive Beeline, [HUE](https://gethue.com/), [DBeaver](https://dbeaver.io/),
+[SQuirreL SQL Client](http://squirrel-sql.sourceforge.net/), etc.,
+to work with the Kyuubi server concurrently.
+
+The SPA policy makes sure that 1) a user account can only get computing resources with managed ACLs, e.g.
+[Queue Access Control Lists](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html#Queue_Access_Control_Lists),
+from cluster managers, e.g.
+[Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html),
+[Kubernetes (K8s)](https://kubernetes.io/) to create the Spark application;
+2) a user account can only access data and metadata from a storage system, e.g.
+[Apache Hadoop HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html),
+with permissions.
+
+## Ease of Use
+
+You only need to be familiar with Structured Query Language (SQL) and Java Database Connectivity (JDBC) to handle massive data.
+It helps you focus on the design and implementation of your business system.
+
+- SQL is the standard language for accessing relational databases, and is very popular in the big data ecosystem too.
+  It turns out that everybody knows SQL.
+- JDBC provides a standard API for tool/database developers and makes it possible to write database applications using a pure Java API.
+- There are plenty of free or commercial JDBC tools out there.
+
+## Run Anywhere
+
+Kyuubi can submit Spark applications to all supported cluster managers, including YARN, Mesos, Kubernetes, Standalone, and local.
+
+The SPA policy also makes it possible for you to launch different applications against different cluster managers.
+
+## High Performance
+
+Kyuubi is built on Apache Spark, a lightning-fast unified analytics engine.
+
+ - **Concurrent execution**: multiple Spark applications work together
+ - **Quick response**: long-running Spark applications without startup cost
+ - **Optimal execution plan**: fully supports Spark SQL Catalyst Optimizer
+
+## Authentication & Authorization
+
+## High Availability
--- a/docs/quick_start/quick_start_with_hue.md
+++ b/docs/quick_start/quick_start_with_hue.md
@ -0,0 +1,14 @@
+<div align=center>
+
+![](../imgs/kyuubi_logo_simple.png)
+
+</div>
+
+# Getting Started With Cloudera Hue
+
+
+docker run -it -p 8888:8888 gethue/hue:latest
+
+http://localhost:8888/
+
+![](../imgs/hue_login.png)
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
@ -165,13 +165,13 @@ object KyuubiConf {
    .stringConf
    .createWithDefault("embedded_zookeeper")

-  val SERVER_PRINCIPAL: OptionalConfigEntry[String] = buildConf("authentication.principal")
+  val SERVER_PRINCIPAL: OptionalConfigEntry[String] = buildConf("kinit.principal")
    .doc("Name of the Kerberos principal.")
    .version("1.0.0")
    .stringConf
    .createOptional

-  val SERVER_KEYTAB: OptionalConfigEntry[String] = buildConf("authentication.keytab")
+  val SERVER_KEYTAB: OptionalConfigEntry[String] = buildConf("kinit.keytab")
    .doc("Location of Kyuubi server's keytab.")
    .version("1.0.0")
    .stringConf