[KYUUBI #2025][HIVE] Add a Hive on Yarn doc
### _Why are the changes needed?_

jackson-annotations 2.13 and hive-exec 2.3.9 have class conflict

### _How was this patch tested?_

- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #2326 from deadwind4/hive-ci.

Closes #2025

0644c564 [Ada Wang] [KYUUBI #2025][HIVE] Add a hive on yarn doc

Authored-by: Ada Wang <wang4luning@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
parent 3ab2c81dce
commit 02356a3878
@ -146,7 +146,7 @@ yarn.application.id: application_00000000XX_00XX

Either `HADOOP_CONF_DIR` or `YARN_CONF_DIR` is configured and points to the Hadoop client configurations directory, usually, `$HADOOP_HOME/etc/hadoop`.

-If the `HADOOP_CONF_DIR` points the YARN and HDFS cluster correctly, and the `HADOOP_CLASSPATH` environment variable is set, you can launch a Flink on YARN session, and submit an example job:
+If the `HADOOP_CONF_DIR` points to the YARN and HDFS cluster correctly, and the `HADOOP_CLASSPATH` environment variable is set, you can launch a Flink on YARN session, and submit an example job:
```bash
# we assume to be in the root directory of
# the unzipped Flink distribution
@ -186,3 +186,57 @@ As Kyuubi Flink SQL engine wraps the Flink SQL client that currently does not su
so `security.kerberos.login.keytab` and `security.kerberos.login.principal` should not be used for now.

Instead, you can schedule a periodic `kinit` process via a `crontab` task on the local machine that hosts the Kyuubi server, or simply use [Kyuubi Kinit](settings.html#kinit).
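
As a minimal sketch of the `crontab` approach, the helper below prints a crontab entry that re-runs `kinit` from a keytab. The function name `kinit_cron_line` and the six-hour schedule are illustrative assumptions, not values taken from Kyuubi; choose an interval shorter than your ticket lifetime.

```bash
# Hypothetical helper: print a crontab entry that renews a Kerberos ticket
# from a keytab. The schedule (minute 0 of every 6th hour) is an assumption.
kinit_cron_line() {
  local keytab="$1" principal="$2"
  printf '0 */6 * * * kinit -kt %s %s\n' "$keytab" "$principal"
}
```

For example, `kinit_cron_line /path/to/kyuubi.keytab kyuubi@EXAMPLE.COM` (both arguments are placeholders) prints a line that can be added with `crontab -e` for the user that runs the Kyuubi server.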

## Deploy Kyuubi Hive Engine on Yarn

### Requirements

When deploying Kyuubi's Hive SQL engines on YARN, make sure you have the following:

- Basic knowledge of [Running Hive on YARN](https://cwiki.apache.org/confluence/display/Hive/GettingStarted)
- A binary distribution of Hive
  - You can use the built-in Hive distribution
  - You can download a recent Hive distribution from the [Hive official website](https://hive.apache.org/downloads.html) and unpack it
  - You can [Build Hive](https://cwiki.apache.org/confluence/display/Hive//GettingStarted#GettingStarted-BuildingHivefromSource)
- An active [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) cluster
  - Make sure your YARN cluster is ready to accept Hive applications by running `yarn top`. It should show no error messages
- An active [Apache Hadoop HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html) cluster
- Hadoop client configurations set up on the machine where the Kyuubi server runs
- An active [Hive Metastore Service](https://cwiki.apache.org/confluence/display/hive/design#Design-Metastore)

### Configurations

#### Environment

Either `HADOOP_CONF_DIR` or `YARN_CONF_DIR` is configured and points to the Hadoop client configurations directory, usually, `$HADOOP_HOME/etc/hadoop`.
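
A quick way to sanity-check this setting is a small shell function like the sketch below. The function name `resolve_hadoop_conf_dir`, the preference for `HADOOP_CONF_DIR` when both variables are set, and the choice to look for `core-site.xml` and `yarn-site.xml` are assumptions made for illustration; Kyuubi itself may resolve and validate the directory differently.

```bash
# Sketch: pick the Hadoop client configuration directory from the environment.
# Assumption: HADOOP_CONF_DIR is preferred over YARN_CONF_DIR when both are set.
resolve_hadoop_conf_dir() {
  local dir="${HADOOP_CONF_DIR:-${YARN_CONF_DIR:-}}"
  if [ -z "$dir" ]; then
    echo "Neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set" >&2
    return 1
  fi
  # Assumption: a usable client configuration contains these two files.
  local f
  for f in core-site.xml yarn-site.xml; do
    if [ ! -f "$dir/$f" ]; then
      echo "Missing $dir/$f" >&2
      return 1
    fi
  done
  echo "$dir"
}
```

Calling `resolve_hadoop_conf_dir` prints the directory on success and returns a non-zero status with a message on stderr otherwise.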

If the `HADOOP_CONF_DIR` points to the YARN and HDFS cluster correctly, you should be able to run the `Hive SQL` example on YARN.

```bash
$ $HIVE_HOME/bin/hiveserver2
# In another terminal
$ $HIVE_HOME/bin/beeline -u 'jdbc:hive2://localhost:10000/default'
0: jdbc:hive2://localhost:10000/default> CREATE TABLE pokes (foo INT, bar STRING);
0: jdbc:hive2://localhost:10000/default> INSERT INTO TABLE pokes VALUES (1, 'hello');
```

If the `Hive SQL` statements succeed and a job appears in the YARN web UI, the Hive environment is working normally.
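
To look for the job without opening the web UI, one option is to list applications with the `yarn` CLI and pull out the application IDs. The helper name `extract_app_ids` is hypothetical, and the sketch assumes each listed application begins its line with an ID of the form `application_<digits>_<digits>`.

```bash
# Sketch: read `yarn application -list` output on stdin and print application IDs.
# Assumption: each application row starts with its ID; other lines are ignored.
extract_app_ids() {
  grep -Eo '^application_[0-9]+_[0-9]+'
}
```

A possible invocation is `yarn application -list -appStates ALL | extract_app_ids`, which prints one ID per line, or nothing if no application was submitted.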

#### Required Environment Variable

The `HIVE_HADOOP_CLASSPATH` environment variable is also required. It should contain `commons-collections-*.jar`,
`hadoop-client-runtime-*.jar`, `hadoop-client-api-*.jar` and `htrace-core4-*.jar`.
All four jars can be found under `HADOOP_HOME`.

For example, in Hadoop 3.1.0 they are located at:
- `${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.2.jar`
- `${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.1.0.jar`
- `${HADOOP_HOME}/share/hadoop/client/hadoop-client-api-3.1.0.jar`
- `${HADOOP_HOME}/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar`

Configure them in `$KYUUBI_HOME/conf/kyuubi-env.sh` or `$HIVE_HOME/conf/hive-env.sh`, e.g.

```bash
# Append the Hadoop client configuration directory to kyuubi-env.sh
$ echo "export HADOOP_CONF_DIR=/path/to/hadoop/conf" >> $KYUUBI_HOME/conf/kyuubi-env.sh
# ${HADOOP_HOME} is expanded by the current shell when echo runs, so it must be set here
$ echo "export HIVE_HADOOP_CLASSPATH=${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.2.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.1.0.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-api-3.1.0.jar:${HADOOP_HOME}/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar" >> $KYUUBI_HOME/conf/kyuubi-env.sh
```
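
Because the jar file names embed version numbers, the paths in this example only match Hadoop 3.1.0. As a hedged alternative, the following sketch builds the value by globbing for the four jar name patterns. The function name `build_hive_hadoop_classpath` is hypothetical, and the directory layout is assumed to match the Hadoop 3.1.0 example (`share/hadoop/common/lib` and `share/hadoop/client`); other Hadoop versions may place the jars elsewhere.

```bash
# Sketch: assemble HIVE_HADOOP_CLASSPATH without hard-coding jar versions.
# Assumption: jars live where they do in the Hadoop 3.1.0 layout.
build_hive_hadoop_classpath() {
  local hadoop_home="$1" classpath="" pattern jar found
  for pattern in \
    "share/hadoop/common/lib/commons-collections-*.jar" \
    "share/hadoop/client/hadoop-client-runtime-*.jar" \
    "share/hadoop/client/hadoop-client-api-*.jar" \
    "share/hadoop/common/lib/htrace-core4-*.jar"; do
    found=0
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for jar in "$hadoop_home"/$pattern; do
      [ -f "$jar" ] || continue
      classpath="${classpath:+$classpath:}$jar"
      found=1
    done
    if [ "$found" -eq 0 ]; then
      echo "No jar matches $hadoop_home/$pattern" >&2
      return 1
    fi
  done
  echo "$classpath"
}
```

A line such as `export HIVE_HADOOP_CLASSPATH="$(build_hive_hadoop_classpath "$HADOOP_HOME")"` could then stand in for the hard-coded value, provided the function is defined in the same script.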