[KYUUBI #6512] Improve docs for KSHC
# 🔍 Description

Canonicalize the words, and enrich the description for KSHC.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Review.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6512 from pan3793/kshc-doc.

Closes #6512

201c11341 [Cheng Pan] nit
1becc1ebb [Cheng Pan] nit
8d48c7c93 [Cheng Pan] fix
aea1e0386 [Cheng Pan] fix
5ba5094ab [Cheng Pan] fix
0c40de43d [Cheng Pan] fix
63dd21d11 [Cheng Pan] nit
1be266163 [Cheng Pan] Improve docs for KSHC

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
commit f66216b43c (parent d2dad1432c)
@@ -42,9 +42,9 @@ Dependencies
 
 The **classpath** of kyuubi flink sql engine with Hudi supported consists of
 
-1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
 2. a copy of flink distribution
-3. hudi-flink<flink.version>-bundle_<scala.version>-<hudi.version>.jar (example: hudi-flink1.14-bundle_2.12-0.11.1.jar), which can be found in the `Maven Central`_
+3. hudi-flink<flink.version>-bundle-<hudi.version>.jar (example: hudi-flink1.18-bundle-0.15.0.jar), which can be found in the `Maven Central`_
 
 In order to make the Hudi packages visible for the runtime classpath of engines, we can use one of these methods:
 
@@ -43,9 +43,9 @@ Dependencies
 
 The **classpath** of kyuubi flink sql engine with Iceberg supported consists of
 
-1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
 2. a copy of flink distribution
-3. iceberg-flink-runtime-<flink.version>-<iceberg.version>.jar (example: iceberg-flink-runtime-1.14-0.14.0.jar), which can be found in the `Maven Central`_
+3. iceberg-flink-runtime-<flink.version>-<iceberg.version>.jar (example: iceberg-flink-runtime-1.18-1.5.2.jar), which can be found in the `Maven Central`_
 
 In order to make the Iceberg packages visible for the runtime classpath of engines, we can use one of these methods:
 
@@ -40,9 +40,9 @@ Dependencies
 
 The **classpath** of kyuubi flink sql engine with Apache Paimon (Incubating) supported consists of
 
-1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
 2. a copy of flink distribution
-3. paimon-flink-<version>.jar (example: paimon-flink-1.16-0.4-SNAPSHOT.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Flink`_
+3. paimon-flink-<version>.jar (example: paimon-flink-1.18-0.8.1.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Flink`_
 4. flink-shaded-hadoop-2-uber-<version>.jar, which code can be found in the `Pre-bundled Hadoop Jar`_
 
 In order to make the Apache Paimon (Incubating) packages visible for the runtime classpath of engines, you need to:
@@ -44,7 +44,7 @@ Dependencies
 
 The **classpath** of kyuubi hive sql engine with Iceberg supported consists of
 
-1. kyuubi-hive-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+1. kyuubi-hive-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
 2. a copy of hive distribution
 3. iceberg-hive-runtime-<hive.version>_<scala.version>-<iceberg.version>.jar (example: iceberg-hive-runtime-3.2_2.12-0.14.0.jar), which can be found in the `Maven Central`_
 
@@ -42,7 +42,7 @@ Dependencies
 
 The **classpath** of kyuubi hive sql engine with Iceberg supported consists of
 
-1. kyuubi-hive-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+1. kyuubi-hive-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
 2. a copy of hive distribution
 3. paimon-hive-connector-<hive.binary.version>-<paimon.version>.jar (example: paimon-hive-connector-3.1-0.4-SNAPSHOT.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Hive`_
 
@@ -16,38 +16,38 @@
 `Delta Lake`_
 =============
 
-Delta lake is an open-source project that enables building a Lakehouse
+Delta Lake is an open-source project that enables building a Lakehouse
 Architecture on top of existing storage systems such as S3, ADLS, GCS,
 and HDFS.
 
 .. tip::
    This article assumes that you have mastered the basic knowledge and
    operation of `Delta Lake`_.
-   For the knowledge about delta lake not mentioned in this article,
+   For the knowledge about Delta Lake not mentioned in this article,
    you can obtain it from its `Official Documentation`_.
 
-By using kyuubi, we can run SQL queries towards delta lake which is more
+By using kyuubi, we can run SQL queries towards Delta Lake which is more
 convenient, easy to understand, and easy to expand than directly using
-spark to manipulate delta lake.
+spark to manipulate Delta Lake.
 
 Delta Lake Integration
 ----------------------
 
-To enable the integration of kyuubi spark sql engine and delta lake through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and Delta Lake through
+Spark DataSource V2 API, you need to:
 
-- Referencing the delta lake :ref:`dependencies<spark-delta-lake-deps>`
-- Setting the spark extension and catalog :ref:`configurations<spark-delta-lake-conf>`
+- Referencing the Delta Lake :ref:`dependencies<spark-delta-lake-deps>`
+- Setting the Spark extension and catalog :ref:`configurations<spark-delta-lake-conf>`
 
 .. _spark-delta-lake-deps:
 
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with delta lake supported consists of
+The **classpath** of Kyuubi Spark SQL engine with Delta Lake supported consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
 3. delta-core & delta-storage, which can be found in the `Maven Central`_
 
 In order to make the delta packages visible for the runtime classpath of engines, we can use one of these methods:
@@ -63,7 +63,7 @@ In order to make the delta packages visible for the runtime classpath of engines
 
 Configurations
 **************
 
-To activate functionality of delta lake, we can set the following configurations:
+To activate functionality of Delta Lake, we can set the following configurations:
 
 .. code-block:: properties
 
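For readers following the hunk above, the configuration block it introduces carries the usual Delta session extension and catalog settings; a minimal sketch drawn from the upstream Delta Lake documentation (class names are Delta's own, reproduced here for illustration):

```properties
# Enable Delta's SQL extensions and back the session catalog with Delta (per Delta Lake docs)
spark.sql.extensions io.delta.sql.DeltaSparkSessionExtension
spark.sql.catalog.spark_catalog org.apache.spark.sql.delta.catalog.DeltaCatalog
```

With these set, plain `CREATE TABLE ... USING delta` statements work through the Kyuubi Spark SQL engine.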
@@ -16,53 +16,52 @@
 `Hive`_
 ==========
 
-The Kyuubi Hive Connector is a datasource for both reading and writing Hive table,
-It is implemented based on Spark DataSource V2, and supports concatenating multiple Hive metastore at the same time.
+You may know that the Apache Spark has built-in support for accessing Hive tables, it works well in most cases,
+but is limited to one Hive Metastore. The Kyuubi Spark Hive connector(KSHC) implemented a Hive connector based
+on Spark DataSource V2 API, supports accessing multiple Hive Metastore in a single Spark application.
 
-This connector can be used to federate queries of multiple hives warehouse in a single Spark cluster.
-
-Hive Integration
-----------------
+Hive Connector Integration
+-------------------
 
-To enable the integration of kyuubi spark sql engine and Hive connector through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and Hive connector through
+Spark DataSource V2 API, you need to:
 
 - Referencing the Hive connector :ref:`dependencies<kyuubi-hive-deps>`
-- Setting the spark extension and catalog :ref:`configurations<kyuubi-hive-conf>`
+- Setting the Spark catalog :ref:`configurations<kyuubi-hive-conf>`
 
 .. _kyuubi-hive-deps:
 
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with Hive connector supported consists of
+The **classpath** of Kyuubi Spark SQL engine with Hive connector supported consists of
 
-1. kyuubi-spark-connector-hive_2.12-\ |release|\ , the hive connector jar deployed with Kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
+3. kyuubi-spark-connector-hive_2.12-\ |release|\ , which can be found in the `Maven Central`_
 
 In order to make the Hive connector packages visible for the runtime classpath of engines, we can use one of these methods:
 
 1. Put the Kyuubi Hive connector packages into ``$SPARK_HOME/jars`` directly
 2. Set ``spark.jars=/path/to/kyuubi-hive-connector``
 
+.. note::
+   Starting from v1.9.2 and v1.10.0, KSHC jars available in the `Maven Central`_ guarantee binary compatibility across
+   Spark versions, namely, Spark 3.3 onwards.
+
 .. _kyuubi-hive-conf:
 
 Configurations
 **************
 
-To activate functionality of Kyuubi Hive connector, we can set the following configurations:
+To activate functionality of Kyuubi Spark Hive connector, we can set the following configurations:
 
 .. code-block:: properties
 
-   spark.sql.catalog.hive_catalog org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
-   spark.sql.catalog.hive_catalog.spark.sql.hive.metastore.version hive-metastore-version
-   spark.sql.catalog.hive_catalog.hive.metastore.uris thrift://metastore-host:port
-   spark.sql.catalog.hive_catalog.hive.metastore.port port
-   spark.sql.catalog.hive_catalog.spark.sql.hive.metastore.jars path
-   spark.sql.catalog.hive_catalog.spark.sql.hive.metastore.jars.path file:///opt/hive1/lib/*.jar
-
-.. tip::
-   For details about the multi-version Hive configuration, see the related multi-version Hive configurations supported by Apache Spark.
+   spark.sql.catalog.hive_catalog org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
+   spark.sql.catalog.hive_catalog.hive.metastore.uris thrift://metastore-host:port
+   spark.sql.catalog.hive_catalog.<other.hive.conf> <value>
+   spark.sql.catalog.hive_catalog.<other.hadoop.conf> <value>
 
 Hive Connector Operations
 ------------------
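Since each ``spark.sql.catalog.<name>.*`` prefix is independent, registering two metastores is just a matter of repeating the pattern above under different catalog names; a sketch with hypothetical host names:

```properties
# Two independent Hive Metastores exposed as separate Spark catalogs (hosts are placeholders)
spark.sql.catalog.hive_prod org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
spark.sql.catalog.hive_prod.hive.metastore.uris thrift://hms-prod.example.com:9083
spark.sql.catalog.hive_test org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
spark.sql.catalog.hive_test.hive.metastore.uris thrift://hms-test.example.com:9083
```

A query can then reference both catalogs at once, e.g. joining ``hive_prod.db.t`` against ``hive_test.db.t``, which is exactly the multi-metastore federation KSHC is built for.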
@@ -106,4 +105,29 @@ Taking ``DROP NAMESPACE`` as a example,
 
    DROP NAMESPACE hive_catalog.ns;
 
-.. _Apache Spark: https://spark.apache.org/
+Advanced Usages
+***************
+
+Though KSHC is a pure Spark DataSource V2 connector which isn't coupled with Kyuubi deployment, due to the
+implementation inside ``spark-sql``, you should not expect KSHC works properly with ``spark-sql``, and
+any issues caused by such a combination usage won't be considered at this time. Instead, it's recommended
+using BeeLine with Kyuubi as a drop-in replacement for ``spark-sql``, or switching to ``spark-shell``.
+
+KSHC supports accessing Kerberized Hive Metastore and HDFS, by using keytab, or TGT cache, or Delegation Token.
+It's not expected to work properly with multiple KDC instances, the limitation comes from JDK Krb5LoginModule,
+for such cases, consider setting up Cross-Realm Kerberos trusts, then you just need to talk with one KDC.
+
+For HMS Thrift API used by Spark, it's known that Hive 2.3.9 client is compatible with HMS from 2.1 to 4.0, and
+Hive 2.3.10 client is compatible with HMS from 1.1 to 4.0, such version combinations should cover the most cases.
+For other corner cases, KSHC also supports ``spark.sql.catalog.<catalog_name>.spark.sql.hive.metastore.jars`` and
+``spark.sql.catalog.<catalog_name>.spark.sql.hive.metastore.version`` as well as the Spark built-in Hive datasource
+does, you can refer to the Spark documentation for details.
+
+Currently, KSHC has not implemented the Parquet/ORC Hive tables read/write optimization, in other words, it always
+uses Hive SerDe to access Hive tables, so there might be a performance gap compared to the Spark built-in Hive
+datasource, especially due to lack of support for vectorized reading. And you may hit bugs caused by Hive SerDe,
+e.g. ``ParquetHiveSerDe`` can not read Parquet files that decimals are written in int-based format produced by
+Spark Parquet datasource writer with ``spark.sql.parquet.writeLegacyFormat=false``.
+
+.. _Apache Spark: https://spark.apache.org/
+.. _Maven Central: https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-spark-connector-hive
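For the keytab-based Kerberos access mentioned in the hunk above, the standard Spark submit-time security properties apply; a hedged sketch (principal and keytab path are placeholders, not values from this document):

```properties
# Submit-time Kerberos credentials (standard Spark options; values are placeholders)
spark.kerberos.keytab /etc/security/keytabs/user.keytab
spark.kerberos.principal user@EXAMPLE.COM
```

TGT cache and delegation-token flows need no extra catalog-level settings, which is why only the keytab case is sketched here.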
@@ -30,8 +30,8 @@ and easy to expand than directly using Spark to manipulate Hudi.
 Hudi Integration
 ----------------
 
-To enable the integration of kyuubi spark sql engine and Hudi through
-Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and Hudi through
+Spark DataSource V2 API, you need to:
 
 - Referencing the Hudi :ref:`dependencies<spark-hudi-deps>`
 - Setting the Spark extension and catalog :ref:`configurations<spark-hudi-conf>`
@@ -41,10 +41,10 @@ Catalog APIs, you need to:
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with Hudi supported consists of
+The **classpath** of Kyuubi Spark SQL engine with Hudi supported consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
 3. hudi-spark<spark.version>-bundle_<scala.version>-<hudi.version>.jar (example: hudi-spark3.2-bundle_2.12-0.11.1.jar), which can be found in the `Maven Central`_
 
 In order to make the Hudi packages visible for the runtime classpath of engines, we can use one of these methods:
@@ -32,21 +32,21 @@ spark to manipulate Iceberg.
 Iceberg Integration
 -------------------
 
-To enable the integration of kyuubi spark sql engine and Iceberg through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and Iceberg through
+Spark DataSource V2 API, you need to:
 
 - Referencing the Iceberg :ref:`dependencies<spark-iceberg-deps>`
-- Setting the spark extension and catalog :ref:`configurations<spark-iceberg-conf>`
+- Setting the Spark extension and catalog :ref:`configurations<spark-iceberg-conf>`
 
 .. _spark-iceberg-deps:
 
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with Iceberg supported consists of
+The **classpath** of Kyuubi Spark SQL engine with Iceberg supported consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
 3. iceberg-spark-runtime-<spark.version>_<scala.version>-<iceberg.version>.jar (example: iceberg-spark-runtime-3.2_2.12-0.14.0.jar), which can be found in the `Maven Central`_
 
 In order to make the Iceberg packages visible for the runtime classpath of engines, we can use one of these methods:
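The "Spark extension and catalog" configurations this hunk points at follow the pattern documented upstream by Iceberg; a minimal sketch (catalog name ``iceberg_catalog`` is arbitrary; class names are Iceberg's):

```properties
# Iceberg SQL extensions plus a Hive-metastore-backed Iceberg catalog (per upstream Iceberg-Spark docs)
spark.sql.extensions org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.iceberg_catalog org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.iceberg_catalog.type hive
```

Tables are then addressable as ``iceberg_catalog.db.table`` from the Kyuubi Spark SQL engine.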
@@ -30,21 +30,22 @@ spark to manipulate Apache Paimon (Incubating).
 Apache Paimon (Incubating) Integration
 -------------------
 
-To enable the integration of kyuubi spark sql engine and Apache Paimon (Incubating), you need to set the following configurations:
+To enable the integration of Kyuubi Spark SQL engine and Apache Paimon (Incubating) through
+Spark DataSource V2 API, you need to:
 
 - Referencing the Apache Paimon (Incubating) :ref:`dependencies<spark-paimon-deps>`
-- Setting the spark extension and catalog :ref:`configurations<spark-paimon-conf>`
+- Setting the Spark extension and catalog :ref:`configurations<spark-paimon-conf>`
 
 .. _spark-paimon-deps:
 
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with Apache Paimon (Incubating) consists of
+The **classpath** of Kyuubi Spark SQL engine with Apache Paimon (Incubating) consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
-3. paimon-spark-<version>.jar (example: paimon-spark-3.3-0.4-20230323.002035-5.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Spark3`_
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
+3. paimon-spark-<version>.jar (example: paimon-spark-3.5-0.8.1.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Spark3`_
 
 In order to make the Apache Paimon (Incubating) packages visible for the runtime classpath of engines, we can use one of these methods:
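The catalog configuration this hunk refers to follows the pattern in the upstream Paimon documentation; a hedged sketch (the warehouse path is a placeholder, and the catalog name ``paimon`` is arbitrary):

```properties
# Paimon catalog backed by a filesystem warehouse (class/keys per upstream Paimon docs; path is a placeholder)
spark.sql.catalog.paimon org.apache.paimon.spark.SparkCatalog
spark.sql.catalog.paimon.warehouse hdfs:///path/to/paimon-warehouse
```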
@@ -35,20 +35,20 @@ spark to manipulate TiDB/TiKV.
 TiDB Integration
 -------------------
 
-To enable the integration of kyuubi spark sql engine and TiDB through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and TiDB through
+Spark DataSource V2 API, you need to:
 
 - Referencing the TiSpark :ref:`dependencies<spark-tidb-deps>`
-- Setting the spark extension and catalog :ref:`configurations<spark-tidb-conf>`
+- Setting the Spark extension and catalog :ref:`configurations<spark-tidb-conf>`
 
 .. _spark-tidb-deps:
 
 Dependencies
 ************
-The classpath of kyuubi spark sql engine with TiDB supported consists of
+The classpath of Kyuubi Spark SQL engine with TiDB supported consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
 3. tispark-assembly-<spark.version>_<scala.version>-<tispark.version>.jar (example: tispark-assembly-3.2_2.12-3.0.1.jar), which can be found in the `Maven Central`_
 
 In order to make the TiSpark packages visible for the runtime classpath of engines, we can use one of these methods:
@@ -32,21 +32,21 @@ Goto `Try Kyuubi`_ to explore TPC-DS data instantly!
 TPC-DS Integration
 ------------------
 
-To enable the integration of kyuubi spark sql engine and TPC-DS through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and TPC-DS through
+Spark DataSource V2 API, you need to:
 
 - Referencing the TPC-DS connector :ref:`dependencies<spark-tpcds-deps>`
-- Setting the spark catalog :ref:`configurations<spark-tpcds-conf>`
+- Setting the Spark catalog :ref:`configurations<spark-tpcds-conf>`
 
 .. _spark-tpcds-deps:
 
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with TPC-DS supported consists of
+The **classpath** of Kyuubi Spark SQL engine with TPC-DS supported consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
 3. kyuubi-spark-connector-tpcds-\ |release|\ _2.12.jar, which can be found in the `Maven Central`_
 
 In order to make the TPC-DS connector package visible for the runtime classpath of engines, we can use one of these methods:
@@ -32,21 +32,21 @@ Goto `Try Kyuubi`_ to explore TPC-H data instantly!
 TPC-H Integration
 ------------------
 
-To enable the integration of kyuubi spark sql engine and TPC-H through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
+To enable the integration of Kyuubi Spark SQL engine and TPC-H through
+Spark DataSource V2 API, you need to:
 
 - Referencing the TPC-H connector :ref:`dependencies<spark-tpch-deps>`
-- Setting the spark catalog :ref:`configurations<spark-tpch-conf>`
+- Setting the Spark catalog :ref:`configurations<spark-tpch-conf>`
 
 .. _spark-tpch-deps:
 
 Dependencies
 ************
 
-The **classpath** of kyuubi spark sql engine with TPC-H supported consists of
+The **classpath** of Kyuubi Spark SQL engine with TPC-H supported consists of
 
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
+2. a copy of Spark distribution
 3. kyuubi-spark-connector-tpch-\ |release|\ _2.12.jar, which can be found in the `Maven Central`_
 
 In order to make the TPC-H connector package visible for the runtime classpath of engines, we can use one of these methods:
@@ -42,7 +42,7 @@ Dependencies
 
 The **classpath** of kyuubi trino sql engine with Apache Paimon (Incubating) supported consists of
 
-1. kyuubi-trino-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+1. kyuubi-trino-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
 2. a copy of trino distribution
 3. paimon-trino-<version>.jar (example: paimon-trino-0.2.jar), which code can be found in the `Source Code`_
 4. flink-shaded-hadoop-2-uber-<version>.jar, which code can be found in the `Pre-bundled Hadoop`_
@@ -21,12 +21,12 @@
 
 - [Apache Ranger](https://ranger.apache.org/)
 
-  This plugin works as a ranger rest client with Apache Ranger admin server to do privilege check.
+  This plugin works as a ranger rest client with Apache Ranger Admin server to do privilege check.
   Thus, a ranger server need to be installed ahead and available to use.
 
 - Building(optional)
 
-  If your ranger admin or spark distribution is not compatible with the official pre-built [artifact](https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-spark-authz) in maven central.
+  If your Ranger Admin or Spark distribution is not compatible with the official pre-built [artifact](https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-spark-authz) in maven central.
   You need to [build](build.md) the plugin targeting the spark/ranger you are using by yourself.
 
 ## Install