kyuubi/docs/client/python/pyhive.md
liangbowen d8e75aa9da
[KYUUBI #3395] [DOCS] [Subtask] Add PyHive client docs

Authored-by: liangbowen <liangbowen@gf.com.cn>
Signed-off-by: Kent Yao <yao@apache.org>
2022-09-05 13:55:58 +08:00

PyHive

PyHive is a collection of Python DB-API and SQLAlchemy interfaces for Hive. Because the Kyuubi server speaks the same Thrift protocol as HiveServer2, PyHive can connect to it directly.

Requirements

PyHive works with Python 2.7 and Python 3. Install it via pip with the extra dependencies for the Hive interface.

pip install 'pyhive[hive]'

Usage

Use the Kyuubi server's host and Thrift protocol port (10009 by default) to connect.

For more information on usage and features, such as DB-API asynchronous fetching and SQLAlchemy integration, refer to the PyHive project homepage.

DB-API

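from pyhive import hive

# placeholder: replace with your Kyuubi server's hostname or IP address
kyuubi_host = 'localhost'

# connect to the Kyuubi Thrift port and open a cursor
cursor = hive.connect(host=kyuubi_host, port=10009).cursor()
cursor.execute('SELECT * FROM my_awesome_data LIMIT 10')
print(cursor.fetchone())  # the first row
print(cursor.fetchall())  # all remaining rows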
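For larger result sets, or when the connection should be closed reliably, the DB-API calls above can be wrapped as in the following minimal sketch. The helper names `iter_rows` and `print_rows` are hypothetical and not part of PyHive; the table name is the same placeholder used above. It relies only on standard DB-API methods (`execute`, `fetchmany`, `close`) and on PyHive's `pyformat` parameter style (`%s` placeholders).

```python
from contextlib import closing


def iter_rows(cursor, batch_size=1000):
    """Yield rows from an already-executed DB-API cursor, fetching in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield from batch


def print_rows(host, port=10009, limit=10):
    """Hypothetical helper: query a placeholder table on Kyuubi and print each row."""
    # Imported lazily so that iter_rows can be used without PyHive installed.
    from pyhive import hive

    # closing() calls close() on exit, even if the query raises.
    with closing(hive.connect(host=host, port=port)) as conn:
        with closing(conn.cursor()) as cursor:
            # Parameters are substituted client-side into the %s placeholder.
            cursor.execute('SELECT * FROM my_awesome_data LIMIT %s', (limit,))
            for row in iter_rows(cursor):
                print(row)
```

Calling `print_rows('localhost')` would then run the query against a Kyuubi server assumed to be listening on the local machine. Note that when parameters are passed, a literal `%` in the SQL text has to be written as `%%`.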

Use PyHive with Pandas

A PyHive connection can be passed to pandas.read_sql to execute SQL and load the result directly into a Pandas DataFrame.

from pyhive import hive
import pandas as pd

# placeholder: replace with your Kyuubi server's hostname or IP address
kyuubi_host = 'localhost'

# open connection
conn = hive.Connection(host=kyuubi_host, port=10009)

# query the table into a new dataframe
dataframe = pd.read_sql("SELECT id, name FROM test.example_table", conn)
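Recent pandas versions may emit a UserWarning when `read_sql` receives a plain DB-API connection rather than a SQLAlchemy connectable. One possible alternative, sketched below, is to build the DataFrame from a cursor directly. The helper names `cursor_to_records` and `cursor_to_dataframe` are hypothetical; the sketch assumes only the standard DB-API `description` and `fetchall` members.

```python
def cursor_to_records(cursor):
    """Return (column_names, rows) from an already-executed DB-API cursor."""
    # Each entry of cursor.description is a sequence whose first item is the column name.
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    return columns, rows


def cursor_to_dataframe(cursor):
    """Build a pandas DataFrame from an already-executed DB-API cursor."""
    # Imported lazily so that cursor_to_records works without pandas installed.
    import pandas as pd

    columns, rows = cursor_to_records(cursor)
    return pd.DataFrame(rows, columns=columns)
```

With the connection from the example above, this would be used as `cursor = conn.cursor()`, then `cursor.execute("SELECT id, name FROM test.example_table")`, then `dataframe = cursor_to_dataframe(cursor)`.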

Authentication

If a password is provided for the connection, make sure the auth parameter is set to either CUSTOM or LDAP.

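# open connection; a password requires auth='CUSTOM' or auth='LDAP'
# (hive.Connection takes the user name via the `username` keyword)
conn = hive.Connection(host=kyuubi_host, port=10009,
                       username='user', password='password', auth='CUSTOM')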
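The rule above can be checked before a connection is attempted. The following is a minimal sketch, not part of PyHive: `build_connection_kwargs` and `open_connection` are hypothetical helpers that validate the password and auth combination described in this section and then pass the result to `hive.Connection`.

```python
# Auth modes that accept a password, as stated in this section.
PASSWORD_AUTH_MODES = ("CUSTOM", "LDAP")


def build_connection_kwargs(host, port=10009, username=None, password=None, auth=None):
    """Hypothetical helper: validate and assemble keyword arguments for hive.Connection."""
    if password is not None and auth not in PASSWORD_AUTH_MODES:
        raise ValueError(
            "a password requires auth to be one of %s" % (PASSWORD_AUTH_MODES,)
        )
    kwargs = {"host": host, "port": port}
    # Only include optional settings that were actually provided.
    for key, value in (("username", username), ("password", password), ("auth", auth)):
        if value is not None:
            kwargs[key] = value
    return kwargs


def open_connection(**settings):
    """Hypothetical helper: open a PyHive connection from validated settings."""
    # Imported lazily so the validation can run without PyHive installed.
    from pyhive import hive

    return hive.Connection(**build_connection_kwargs(**settings))
```

For example, `open_connection(host='localhost', username='user', password='password', auth='LDAP')` would connect with LDAP authentication, while omitting `auth` in that call fails early with a ValueError instead of at connection time.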