### _Why are the changes needed?_ close #4338 . ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before make a pull request Closes #4339 from a49a/bump-spark332. Closes #4338 6c741d82 [Luning Wang] [KYUUBI #4338] Bump Spark from 3.3.1 to 3.3.2 Authored-by: Luning Wang <wang4luning@gmail.com> Signed-off-by: Cheng Pan <chengpan@apache.org>
309 lines
9.6 KiB
ReStructuredText
309 lines
9.6 KiB
ReStructuredText
.. Licensed to the Apache Software Foundation (ASF) under one or more
|
|
contributor license agreements. See the NOTICE file distributed with
|
|
this work for additional information regarding copyright ownership.
|
|
The ASF licenses this file to You under the Apache License, Version 2.0
|
|
(the "License"); you may not use this file except in compliance with
|
|
the License. You may obtain a copy of the License at
|
|
|
|
.. http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
.. Unless required by applicable law or agreed to in writing, software
|
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
See the License for the specific language governing permissions and
|
|
limitations under the License.
|
|
|
|
|
|
Getting Started
|
|
===============
|
|
|
|
.. note::
|
|
|
|
This page covers how to start with kyuubi quickly on you
|
|
laptop in about 3~5 minutes.
|
|
|
|
Requirements
|
|
------------
|
|
|
|
For quick start deployment, we need to prepare the following stuffs:
|
|
|
|
- A **client** that connects and submits queries to the server. Here, we use the
|
|
kyuubi beeline for demonstration.
|
|
- A **server** that serves clients and manages engines.
|
|
- An **engine** that is used to instantiate query execution environments. Here we
|
|
use Spark for demonstration.
|
|
|
|
These essential components are JVM-based applications. So, the JRE needs to be
|
|
pre-installed and the `JAVA_HOME` is correctly set to each component.
|
|
|
|
================ ============ =============== ===========================================
|
|
Component Role Version Remarks
|
|
================ ============ =============== ===========================================
|
|
**Java** JRE 8/11 Officially released against JDK8
|
|
**Kyuubi** Gateway \ |release| \ - Kyuubi Server
|
|
Engine lib - Kyuubi Engine
|
|
Beeline - Kyuubi Hive Beeline
|
|
**Spark** Engine >=3.0.0 A Spark distribution
|
|
**Flink** Engine >=1.14.0 A Flink distribution
|
|
**Trino** Engine >=363 A Trino cluster
|
|
**Doris** Engine N/A A Doris cluster
|
|
**Hive** Engine - 3.1.x - A Hive distribution
|
|
Metastore - N/A - An optional and external metadata store,
|
|
whose version is decided by engines
|
|
**Zookeeper** HA >=3.4.x
|
|
**Disk** Storage N/A N/A
|
|
================ ============ =============== ===========================================
|
|
|
|
The other internal or external parts listed in the above sheet can be used individually
|
|
or all together. For example, you can use Kyuubi, Spark and Flink to build a streaming
|
|
data warehouse. And then, you can use Zookeeper to enable the load balancing for high
|
|
availability. The data could be stored in Hive, Apache Iceberg, or other DBMSs.
|
|
|
|
In what follows, we will only use Kyuubi and Spark.
|
|
|
|
Installation
|
|
------------
|
|
|
|
.. note::
|
|
:class: dropdown, toggle
|
|
|
|
This following instructions are based on binary releases. If you start with
|
|
source releases, please refer to the page for `building kyuubi`_.
|
|
|
|
Install Kyuubi
|
|
~~~~~~~~~~~~~~
|
|
|
|
The official releases, binary- and source-, are archived on the
|
|
`download page`_. Please download the most recent stable release
|
|
to start.
|
|
|
|
To install Kyuubi, you need to unpack the tarball. For example,
|
|
|
|
.. parsed-literal::
|
|
|
|
$ tar zxf apache-kyuubi-\ |release|\-bin.tgz
|
|
|
|
.. code-block::
|
|
:class: toggle
|
|
|
|
├── LICENSE
|
|
├── NOTICE
|
|
├── RELEASE
|
|
├── beeline-jars
|
|
├── bin
|
|
├── conf
|
|
| ├── kyuubi-defaults.conf.template
|
|
│ ├── kyuubi-env.sh.template
|
|
│ └── log4j2.properties.template
|
|
├── docker
|
|
│ ├── Dockerfile
|
|
│ ├── helm
|
|
│ ├── kyuubi-configmap.yaml
|
|
│ ├── kyuubi-deployment.yaml
|
|
│ ├── kyuubi-pod.yaml
|
|
│ └── kyuubi-service.yaml
|
|
├── externals
|
|
│ └── engines
|
|
├── jars
|
|
├── licenses
|
|
├── logs
|
|
├── pid
|
|
└── work
|
|
|
|
From top to bottom are:
|
|
|
|
- LICENSE: the APACHE `LICENSE`_, VERSION 2.0 we claim to obey.
|
|
- RELEASE: the build information of this package.
|
|
- NOTICE: the notice made by Apache Kyuubi Community about its project and dependencies.
|
|
- bin: the entry of the Kyuubi server with `kyuubi` as the startup script.
|
|
- conf: all the defaults used by Kyuubi Server itself or creating a session with engines.
|
|
- externals
|
|
- engines: contains all kinds of SQL engines that we support
|
|
- licenses: a bunch of licenses included.
|
|
- jars: packages needed by the Kyuubi server.
|
|
- logs: where the logs of the Kyuubi server locates.
|
|
- pid: stores the PID file of the Kyuubi server instance.
|
|
- work: the root of the working directories of all the forked sub-processes, a.k.a. SQL engines.
|
|
|
|
Install Spark
|
|
~~~~~~~~~~~~~
|
|
|
|
The official releases, binary- and source-, are archived on the
|
|
`spark download page`_. Please download the most recent stable
|
|
release to start.
|
|
|
|
.. note::
|
|
:class: dropdown, toggle
|
|
|
|
Currently, Kyuubi is compiled and pre-built against Spark 3 and Scala 2.12
|
|
You will probably meet runtime exceptions if you use Spark 2 or Spark with
|
|
unsupported Scala versions.
|
|
|
|
To install Spark, you need to unpack the tarball. For example,
|
|
|
|
.. code-block::
|
|
|
|
$ tar zxf spark-3.3.2-bin-hadoop3.tgz
|
|
|
|
Configuration
|
|
~~~~~~~~~~~~~
|
|
|
|
The `kyuubi-env.sh` file is used to set system environment variables to the kyuubi
|
|
server process and engine processes it creates.
|
|
|
|
The `kyuubi-defaults.conf` file is used to set system properties to the kyuubi server
|
|
process and engine processes it creates.
|
|
|
|
Each file has a template lays in `conf` directory for your information. The following
|
|
are examples of the parameters necessary for a quick start with Spark.
|
|
|
|
- **JAVA_HOME**
|
|
|
|
.. code-block::
|
|
|
|
$ echo 'export JAVA_HOME=/path/to/java' >> conf/kyuubi-env.sh
|
|
|
|
- **SPARK_HOME**
|
|
|
|
.. code-block::
|
|
|
|
$ echo 'export SPARK_HOME=/path/to/spark' >> conf/kyuubi-env.sh
|
|
|
|
|
|
Start Kyuubi
|
|
------------
|
|
|
|
.. code-block::
|
|
|
|
$ bin/kyuubi start
|
|
|
|
If script above runs successfully, it will store the `PID` of the server instance
|
|
into `pid/kyuubi-<username>-org.apache.kyuubi.server.KyuubiServer.pid`.
|
|
And you are able to get the JDBC connection URL from the log file -
|
|
`logs/kyuubi-<username>-org.apache.kyuubi.server.KyuubiServer-<hostname>.out`.
|
|
|
|
For example,
|
|
|
|
Starting and exposing JDBC connection at: jdbc:hive2://localhost:10009/
|
|
|
|
If something goes wrong, you shall be able to find some clues in the log file too.
|
|
|
|
.. note::
|
|
:class: toggle
|
|
|
|
Alternatively, it can run in the foreground, with the logs and other output
|
|
written to stdout/stderr. Both streams should be captured if using a
|
|
supervision system like `supervisord`.
|
|
|
|
.. code-block::
|
|
|
|
bin/kyuubi run
|
|
|
|
|
|
Operate Clients
|
|
---------------
|
|
|
|
Kyuubi delivers a beeline client, enabling a similar experience to Apache Hive use cases.
|
|
|
|
Open Connections
|
|
~~~~~~~~~~~~~~~~
|
|
|
|
Replace the `host` and `port` with the actual ones you've got in the step of server startup
|
|
for the following JDBC URL. The case below open a session for user named `apache`.
|
|
|
|
.. code-block::
|
|
|
|
$ bin/beeline -u 'jdbc:hive2://localhost:10009/' -n apache
|
|
|
|
.. note::
|
|
:class: toggle
|
|
|
|
Use `--help` to display the usage guide for the beeline tool.
|
|
|
|
.. code-block::
|
|
|
|
$ bin/beeline --help
|
|
|
|
Execute Statements
|
|
~~~~~~~~~~~~~~~~~~
|
|
|
|
After successfully connected with the server, you can run sql queries in the beeline
|
|
console. For instance,
|
|
|
|
.. code-block::
|
|
:class: sql
|
|
|
|
> SHOW DATABASES;
|
|
|
|
You will see a wall of operation logs, and a result table in the beeline console.
|
|
|
|
.. code-block::
|
|
|
|
omitted logs
|
|
+------------+
|
|
| namespace |
|
|
+------------+
|
|
| default |
|
|
+------------+
|
|
1 row selected (0.2 seconds)
|
|
|
|
Start Engines
|
|
~~~~~~~~~~~~~
|
|
|
|
Engines are launched by the server automatically without end users' attention.
|
|
|
|
If you use the same user in the above case to create another connection, the
|
|
engine will be reused. You may notice that the time cost for connection here is
|
|
much shorter than the last round.
|
|
|
|
If you use a different user to create a new connection, another engine will be
|
|
started.
|
|
|
|
.. code-block::
|
|
|
|
$ bin/beeline -u 'jdbc:hive2://localhost:10009/' -n kentyao
|
|
|
|
This may change depending on the `engine share level`_ you set.
|
|
|
|
Close Connections
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
Close the session between beeline and Kyuubi server by executing `!quit`, for example,
|
|
|
|
.. code-block::
|
|
|
|
> !quit
|
|
Closing: 0: jdbc:hive2://localhost:10009/
|
|
|
|
Stop Engines
|
|
~~~~~~~~~~~~
|
|
|
|
Engines are stop by the server automatically according `engine lifecycle`_
|
|
without end users' attention. Terminations of connections do not necessarily
|
|
mean terminations of engines. It depends on both the `engine share level`_ and
|
|
`engine lifecycle`_.
|
|
|
|
Stop Kyuubi
|
|
-----------
|
|
|
|
Stop Kyuubi by running the following in the `$KYUUBI_HOME` directory:
|
|
|
|
.. code-block::
|
|
|
|
$ bin/kyuubi stop
|
|
|
|
And then, you will see the Kyuubi server waving goodbye to you.
|
|
|
|
The Kyuubi server will be stopped immediately while
|
|
the engine will still be alive for a while.
|
|
|
|
If you start Kyuubi again before the engine terminates itself,
|
|
it will reconnect to the newly created one.
|
|
|
|
.. _DOWNLOAD PAGE: https://kyuubi.apache.org/releases.html
|
|
.. _BUILDING KYUUBI: ../develop_tools/distribution.html
|
|
.. _SPARK DOWNLOAD PAGE: https://spark.apache.org/downloads.html
|
|
.. _LICENSE: https://www.apache.org/licenses/LICENSE-2.0
|
|
.. _ENGINE SHARE LEVEL: ../deployment/engine_share_level.html
|
|
.. _ENGINE LIFECYCLE: ../deployment/engine_lifecycle.html
|