[KYUUBI #6489] [PYTHON] PyKyuubi get_table_names also supports Spark SQL dialect

# 🔍 Description
## Issue References 🔗

This pull request fixes #6489

## Describe Your Solution 🔧

When Superset connects to Kyuubi with Spark SQL, the function `get_table_names` returns incorrect values. After investigating, I found the cause and a fix.
[get_table_names](https://github.com/apache/kyuubi/blob/master/python/pyhive/sqlalchemy_hive.py#L380)

The existing code is correct when connecting to Hive directly:
`return [row[0] for row in connection.execute(text(query))]`

This is because Hive returns one column per row:

show tables in default :
[('student',), ('student_scores',)]

When connecting to Kyuubi, the code needs to be:
`return [row[1] for row in connection.execute(text(query))]`

This is because Kyuubi returns three columns per row, with the table name in the second column:

show tables in default :
[('default', 'employees', False), ('default', 'student', False), ('default', 'student_scores', False)]

To handle both result shapes, I modified the code to pick the table-name column based on the number of columns in each row.
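The column selection can be sketched in isolation as follows. This is a minimal illustration rather than the patched method itself: `extract_table_names` is a hypothetical standalone helper, and the sample rows are the ones quoted above instead of a live `connection.execute(...)` result.

```python
import logging

_logger = logging.getLogger(__name__)


def extract_table_names(rows):
    """Hypothetical helper: pick the table name out of each SHOW TABLES row."""
    table_names = []
    for row in rows:
        if len(row) == 1:
            # Hive: one column, the table name
            table_names.append(row[0])
        elif len(row) == 3:
            # Spark SQL through Kyuubi: three columns, table name is the second
            table_names.append(row[1])
        else:
            # Mirrors the fallback in the patch: log and append a placeholder
            _logger.warning(
                "Unexpected number of columns in SHOW TABLES result: {}".format(len(row))
            )
            table_names.append('UNKNOWN')
    return table_names


# Sample rows copied from the outputs shown above
hive_rows = [('student',), ('student_scores',)]
kyuubi_rows = [
    ('default', 'employees', False),
    ('default', 'student', False),
    ('default', 'student_scores', False),
]

print(extract_table_names(hive_rows))
print(extract_table_names(kyuubi_rows))
```

Dispatching on the row width keeps a single code path for both engines, at the cost of returning a placeholder name when a result has neither one nor three columns.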

I tested both cases in Superset, and the code works.

Hive
<img width="1214" alt="image" src="https://github.com/apache/kyuubi/assets/29974394/9048b21d-053e-4b5d-be35-ba29d3bd6848">

Kyuubi
<img width="1085" alt="image" src="https://github.com/apache/kyuubi/assets/29974394/d600dfed-1127-41ea-a0bf-ca662a5487df">

Spark SQL also works properly.
<img width="1199" alt="image" src="https://github.com/apache/kyuubi/assets/29974394/7026e39e-6d63-473d-9e43-eeab580719ea">

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

#### Behavior With This Pull Request 🎉

#### Related Unit Tests

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6490 from BruceWong96/branch-kyuubi-6489.

Closes #6489

94a52c0e5 [wenjie.wang01] add else branch.
8ab20becf [wenjie.wang01] fix bug for function get_table_names.
136c7b795 [wenjie.wang01] fix bug for function get_table_names.

Authored-by: wenjie.wang01 <wenjie.wang01@liulishuo.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
wenjie.wang01 2024-06-21 19:03:43 +08:00 committed by Cheng Pan
parent 0a53415d92
commit a0b9873f81


@@ -10,6 +10,7 @@ from __future__ import unicode_literals
 import datetime
 import decimal
+import logging
 import re
 from sqlalchemy import exc
@@ -39,6 +40,7 @@ from pyhive.common import UniversalSet
 from dateutil.parser import parse
 from decimal import Decimal
+_logger = logging.getLogger(__name__)
 class HiveStringTypeBase(types.TypeDecorator):
     """Translates strings returned by Thrift into something else"""
@@ -377,7 +379,21 @@ class HiveDialect(default.DefaultDialect):
         query = 'SHOW TABLES'
         if schema:
             query += ' IN ' + self.identifier_preparer.quote_identifier(schema)
-        return [row[0] for row in connection.execute(text(query))]
+        table_names = []
+        for row in connection.execute(text(query)):
+            # Hive returns 1 columns
+            if len(row) == 1:
+                table_names.append(row[0])
+            # Spark SQL returns 3 columns
+            elif len(row) == 3:
+                table_names.append(row[1])
+            else:
+                _logger.warning("Unexpected number of columns in SHOW TABLES result: {}".format(len(row)))
+                table_names.append('UNKNOWN')
+        return table_names
     def do_rollback(self, dbapi_connection):
         # No transactions for Hive