### _Why are the changes needed?_ close #3395. Add client docs for PyHive. ### _How was this patch tested?_ - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible - [ ] Add screenshots for manual tests if appropriate - [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request Closes #3394 from bowenliang123/3309-pyhive-doc. Closes #3395 31a37b5c [liangbowen] add pyhive docs Authored-by: liangbowen <liangbowen@gf.com.cn> Signed-off-by: Kent Yao <yao@apache.org>
PyHive
PyHive is a collection of Python DB-API and SQLAlchemy interfaces for Hive. PyHive can connect to a Kyuubi server, which serves the Thrift protocol in the same way as HiveServer2.
Requirements
PyHive works with Python 2.7 / Python 3. Install PyHive via pip for the Hive interface.
pip install 'pyhive[hive]'
Usage
Use the Kyuubi server's host and thrift protocol port to connect.
For more information about usage and features, such as DB-API asynchronous fetching and SQLAlchemy integration, please refer to the project homepage.
DB-API
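from pyhive import hive
# placeholder host: replace with your Kyuubi server's host
kyuubi_host = 'localhost'
# connect to the Kyuubi server's Thrift port and open a cursor
cursor = hive.connect(host=kyuubi_host, port=10009).cursor()
cursor.execute('SELECT * FROM my_awesome_data LIMIT 10')
# fetch the first row, then all remaining rows
print(cursor.fetchone())
print(cursor.fetchall())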
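For larger result sets, loading everything with fetchall may be undesirable. The sketch below shows one possible way to read rows in batches with the standard DB-API fetchmany method. The helper name iter_rows and the batch size are hypothetical, and sqlite3 is used only as a stand-in driver so that the example runs without a Kyuubi server. With PyHive, the cursor would instead come from hive.connect(...).cursor().

```python
import sqlite3  # stand-in DB-API driver; not part of PyHive or Kyuubi


def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time, fetching them from the cursor in batches.

    Hypothetical helper: relies only on the DB-API fetchmany method.
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            # an empty batch means the result set is exhausted
            break
        for row in batch:
            yield row


# Build a small in-memory table to stand in for a Kyuubi query result.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE my_awesome_data (id INTEGER, name TEXT)")
cursor.executemany(
    "INSERT INTO my_awesome_data VALUES (?, ?)",
    [(i, "row-%d" % i) for i in range(5)],
)
cursor.execute("SELECT id, name FROM my_awesome_data ORDER BY id")

# Read the five rows in batches of two.
rows = list(iter_rows(cursor, batch_size=2))
conn.close()
```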
Use PyHive with Pandas
A PyHive connection also works with Pandas: pass it to pandas.read_sql to execute SQL and read the result into a Pandas DataFrame.
from pyhive import hive
import pandas as pd
# open connection
conn = hive.Connection(host=kyuubi_host,port=10009)
# query the table to a new dataframe
dataframe = pd.read_sql("SELECT id, name FROM test.example_table", conn)
Authentication
If a password is provided for the connection, make sure the auth parameter is set to either CUSTOM or LDAP.
# open connection
conn = hive.Connection(host=kyuubi_host, port=10009,
                       username='user', password='password', auth='CUSTOM')
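The password rule above can be checked before a connection is attempted. The following is a minimal sketch, not part of PyHive: build_connect_kwargs and PASSWORD_AUTH_MODES are hypothetical names, and the returned keys (host, port, username, password, auth) are assumed to match the keyword arguments of hive.Connection.

```python
# Auth modes that accept a password, per the rule stated above.
PASSWORD_AUTH_MODES = ("CUSTOM", "LDAP")


def build_connect_kwargs(host, port=10009, username=None, password=None, auth=None):
    """Return keyword arguments for a connection, enforcing the password/auth rule.

    Hypothetical helper: raises ValueError when a password is given
    without auth set to CUSTOM or LDAP.
    """
    if password is not None and auth not in PASSWORD_AUTH_MODES:
        raise ValueError(
            "a password requires auth='CUSTOM' or auth='LDAP', got %r" % (auth,)
        )
    kwargs = {"host": host, "port": port}
    # Only include optional settings that were actually provided.
    if username is not None:
        kwargs["username"] = username
    if password is not None:
        kwargs["password"] = password
    if auth is not None:
        kwargs["auth"] = auth
    return kwargs


# Example: valid combination of password and auth mode.
ok_kwargs = build_connect_kwargs(
    "kyuubi.example.com", username="user", password="password", auth="CUSTOM"
)

# Example: a password without a matching auth mode is rejected.
try:
    build_connect_kwargs("kyuubi.example.com", username="user", password="password")
    rejected = False
except ValueError:
    rejected = True
```

The resulting dictionary could then be unpacked into the connection call, for example hive.Connection(**ok_kwargs), assuming the keyword names match the installed PyHive version.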