kyuubi/docs/configuration
zhouyifan279 a196ace284
[KYUUBI #6199] Support to run HiveSQLEngine on kerberized YARN
# 🔍 Description
## Issue References 🔗

This pull request implement a feature -  Run HiveSQLEngine on kerberized YARN

## Describe Your Solution 🔧
Introduced two configs:
- kyuubi.engine.principal
- kyuubi.engine.keytab

When do submit to a kerberized YARN, submitter uploads `kyuubi.engine.keytab` to application's staging dir.
YARN NodeManager downloads keytab to AM's working directory. AM logins to Kerberos using the principal and keytab

**Note**
I've tried to run HiveSQLEngine with only DelegationTokens but failed.

Take SQL `SELECT * FROM a` as an example:
Hive handles this simple TableScan SQL by reading directly from table's hdfs file.
When Hive invokes `FileInputFormat.getSplits` during reading, `java.io.IOException: Delegation Token can be issued only with kerberos or web authentication` will be thrown.
The simplified stacktrace from IDEA is as below:
```
getDelegationToken:734, DFSClient (org.apache.hadoop.hdfs)
getDelegationToken:2072, DistributedFileSystem (org.apache.hadoop.hdfs)
collectDelegationTokens:108, DelegationTokenIssuer (org.apache.hadoop.security.token)
addDelegationTokens:83, DelegationTokenIssuer (org.apache.hadoop.security.token)
obtainTokensForNamenodesInternal:143, TokenCache (org.apache.hadoop.mapreduce.security)
obtainTokensForNamenodesInternal:102, TokenCache (org.apache.hadoop.mapreduce.security)
obtainTokensForNamenodes:81, TokenCache (org.apache.hadoop.mapreduce.security)
listStatus:221, FileInputFormat (org.apache.hadoop.mapred)
getSplits:332, FileInputFormat (org.apache.hadoop.mapred)
getNextSplits:372, FetchOperator (org.apache.hadoop.hive.ql.exec)
getRecordReader:304, FetchOperator (org.apache.hadoop.hive.ql.exec)
getNextRow:459, FetchOperator (org.apache.hadoop.hive.ql.exec)
pushRow:428, FetchOperator (org.apache.hadoop.hive.ql.exec)
fetch:147, FetchTask (org.apache.hadoop.hive.ql.exec)
getResults:2208, Driver (org.apache.hadoop.hive.ql)
getNextRowSet:494, SQLOperation (org.apache.hive.service.cli.operation)
getNextRowSetInternal:105, HiveOperation (org.apache.kyuubi.engine.hive.operation)
```

Theoretically, it can be solved by add AM DelegationTokens into
 `org.apache.hadoop.hive.ql.exec.FetchOperator.job.credentials`.
But actually, it is impossible without modifying Hive's source code.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
HiveSQLEngine can not run on a kerberized YARN

#### Behavior With This Pull Request 🎉
HiveSQLEngine can run on a kerberized YARN

#### Related Unit Tests

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6199 from zhouyifan279/kerberized-hive-engine-on-yarn.

Closes #6199

383d1cdcb [zhouyifan279] Fix tests
458493a91 [zhouyifan279] Warn if run Hive on YARN without principal and keytab
118afe280 [zhouyifan279] Warn if run Hive on YARN without principal and keytab
41fed0c44 [zhouyifan279] Ignore Principal&Keytab when hadoop security is no enabled.
9e2d86237 [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/hive/HiveProcessBuilder.scala
5ae0a3eac [zhouyifan279] Remove redundant checks
5d3013aaf [zhouyifan279] Use principal & keytab in Local mode
5733dfdcb [zhouyifan279] Use principal & keytab in Local mode
85ce9bb7a [zhouyifan279] Use principal & keytab in Local mode
061223dbe [zhouyifan279] Resolve comments
e706936e7 [zhouyifan279] Resolve comments
f84c7bccc [zhouyifan279] Support run HiveSQLEngine on kerberized YARN
4d262c847 [zhouyifan279] Support run HiveSQLEngine on kerberized YARN

Lead-authored-by: zhouyifan279 <zhouyifan279@gmail.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-03-22 16:18:43 +08:00
..
settings.md [KYUUBI #6199] Support to run HiveSQLEngine on kerberized YARN 2024-03-22 16:18:43 +08:00