4 Commits

Author SHA1 Message Date
Frank Bertsch
b49ed02f16
[KYUUBI #7106] Make response.results.columns optional
### Why are the changes needed?
Bugfix. Spark 3.5 is returning `None` for `response.results.columns`, while Spark 3.3 returned actual values.

The response here: https://github.com/apache/kyuubi/blob/master/python/pyhive/hive.py#L507

For a query that does nothing (mine was an `add jar s3://a/b/c.jar`), here are the responses I received.

Spark 3.3:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=[TColumn(boolVal=None, byteVal=None, i16Val=None, i32Val=None, i64Val=None, doubleVal=None, stringVal=TStringColumn(values=[], nulls=b'\x00'), binaryVal=None)], binaryColumns=None, columnCount=None))
```

Spark 3.5:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=None, binaryColumns=None, columnCount=None))
```
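
A minimal sketch of the kind of guard this implies, not the actual patch: `FakeRowSet` and `columns_or_empty` are hypothetical stand-ins for the Thrift `TRowSet` and for whatever code reads `response.results.columns`. The only point shown is that `None` is treated the same as an empty column list.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeRowSet:
    """Hypothetical stand-in for the Thrift TRowSet; only `columns` is modeled."""
    columns: Optional[List[Any]] = None


def columns_or_empty(results: FakeRowSet) -> List[Any]:
    # Spark 3.5 may send columns=None for a statement that returns nothing,
    # so fall back to an empty list instead of iterating over None.
    return results.columns or []


spark_33_like = FakeRowSet(columns=[[]])  # one column holding no values
spark_35_like = FakeRowSet(columns=None)  # no column list at all

for label, row_set in (("3.3", spark_33_like), ("3.5", spark_35_like)):
    print(label, len(columns_or_empty(row_set)))
```

With a guard like this, downstream code that loops over the columns sees zero iterations for the Spark 3.5 response rather than failing on `None`.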

### How was this patch tested?
I tested by applying it locally and running my query against Spark 3.5. I was not able to get any unit tests running, sorry!

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #7107 from fbertsch/spark_3_5_fix.

Closes #7106

13d1440a8 [Frank Bertsch] Make response.results.columns optional

Authored-by: Frank Bertsch <fbertsch@netflix.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-06-23 23:28:18 +08:00
John Zhang
c19d923b85
[KYUUBI #7048] Fix KeyError when parsing unknown Hive type_id in schema inspection
This patch adds a `try/except` block to prevent a `KeyError` when mapping an unknown `type_id` during Hive schema parsing. Now, if a `type_id` is not recognized, `type_code` is set to `None` instead of raising an exception.

### Why are the changes needed?

Previously, when parsing Hive table schemas, the code mapped each `type_id` to a human-readable type name via `ttypes.TTypeId._VALUES_TO_NAMES[type_id]`. If Hive returned an unknown or custom type (e.g. when running a non-standard Hive version, or when data is pumped into *Hive* from a completely different source such as *Oracle*), a `KeyError` was raised, interrupting the entire SQL query. This patch adds a `try/except` block so that an unrecognized `type_id` sets `type_code` to `None` instead of raising, which lets the downstream user decide what to do rather than just receiving an exception. This makes schema inspection more robust and compatible with evolving Hive data types.
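
The behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patched code: `VALUES_TO_NAMES` is a small stand-in for `ttypes.TTypeId._VALUES_TO_NAMES` with illustrative entries, and `type_code_for` is a hypothetical helper name.

```python
from typing import Dict, Optional

# Stand-in for ttypes.TTypeId._VALUES_TO_NAMES; entries are illustrative only.
VALUES_TO_NAMES: Dict[int, str] = {
    0: "BOOLEAN_TYPE",
    3: "INT_TYPE",
    7: "STRING_TYPE",
}


def type_code_for(type_id: int) -> Optional[str]:
    """Return the type name for type_id, or None when the id is unknown."""
    try:
        return VALUES_TO_NAMES[type_id]
    except KeyError:
        # Unknown or custom type: report None and let the caller decide.
        return None


print(type_code_for(7))
print(type_code_for(9999))
```

A caller can then check for `None` and, for example, fall back to treating the column as a string or skip it, instead of the whole query failing.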

### How was this patch tested?

The patch was tested by running schema inspection on tables containing both standard and unknown/custom Hive column types. For known types, parsing behaves as before. For unknown types, the parser sets `type_code` to `None` without raising an exception, and the rest of the process completes successfully. No unit test was added, since this is an edge case that depends on unreachable or custom Hive types, but the change was tested on typical use cases.

### Was this patch authored or co-authored using generative AI tooling?

No. 😂 It's a minor patch.

Closes #7048 from ZsgsDesign/patch-1.

Closes #7048

4d246d0ec [John Zhang] fix: handle KeyError when parsing Hive type_id mapping

Authored-by: John Zhang <zsgsdesign@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
2025-04-29 10:41:16 +08:00
Alex Wojtowicz
9daf74d9c3
[KYUUBI #6908] Connection class ssl context object parameter
**Why are the changes needed:**
I am currently looking to connect to a HiveServer2 instance behind an NGINX proxy that requires mTLS. PyHive seems to lack the ability to establish an mTLS connection for applications such as Airflow that communicate directly with the HiveServer2 instance.

The change needed is the ability to pass in the parameters for establishing a proper mTLS SSL context. I believe that letting callers create their own `ssl_context` object is the quickest and cleanest way to do so, leaving the responsibility of configuring it to further implementations and users. It also cuts down on code length.
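
As a rough sketch of what a caller-built context might look like, using only the standard-library `ssl` module: `build_mtls_context` and its parameters are hypothetical names, and how the resulting object is handed to the `Connection` class (for example as an `ssl_context` argument) is an assumption based on the description above, not a confirmed signature.

```python
import ssl
from typing import Optional


def build_mtls_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side context that verifies the server and,
    when a certificate is given, presents it for mutual TLS."""
    # Verifies the server certificate and hostname by default.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # Client certificate and key presented to the proxy for mTLS.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


# Without file paths this only builds the default verifying context.
ctx = build_mtls_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Keeping the context construction outside the connection code means that any TLS option supported by `ssl.SSLContext` is available to the caller without adding a new connection parameter for each one.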

**How was this patch tested:**
Corresponding pytest fixtures have been added, using the mock module to check whether the `ssl_context` object was properly accessed, or whether the default one created during Connection initialization was properly configured.

I was not able to run the pytest fixtures specifically, because I was lacking a JDBC driver. This is my first time contributing to open source, and I am happy to run the tests if provided guidance. The change passed a clean build and test of the entire Kyuubi project in my local dev environment.

**Was this patch authored or co-authored using generative AI tooling:**
Yes, Generated-by Cursor-AI with Claude Sonnet 3.5 agent

Closes #6935 from alexio215/connection-class-ssl-context-param.

Closes #6908

539b29962 [Cheng Pan] Update python/pyhive/tests/test_hive.py
14c607489 [Alex Wojtowicz] Simplified testing, following pattern of other tests, need proper SSL setup with nginx to test ssl_context fully
b947f2454 [Alex Wojtowicz] Added exception handling since JDBC driver will not run in python tests
11f9002bf [Alex Wojtowicz] Passing in fully configured mock object before creating connection
009c5cf24 [Alex Wojtowicz] Added back doc string documentation
e3280bcd8 [Alex Wojtowicz] Python testing
529de8a12 [Alex Wojtowicz] Added ssl_context object. If no obj is provided, then it continues to use default provided parameters

Lead-authored-by: Alex Wojtowicz <awojtowi@akamai.com>
Co-authored-by: Cheng Pan <pan3793@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2025-02-25 22:22:14 +08:00
Cheng Pan
f8c7b93f55
[KYUUBI #5686][FOLLOWUP] Rename pyhive to python
# 🔍 Description

This is the follow-up of #5686, renaming `./pyhive` to `./python`, and also adding `**/python/*` to RAT exclusion list temporarily.

"PyHive" may not be a suitable name now that the code is part of Apache Kyuubi, so let's use the generic directory name `python` and discuss the official name later (we will probably keep the code at `./python` eventually).

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Recover RAT checked.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6279 from pan3793/pyhive-1.

Closes #5686

42d338e71 [Cheng Pan] [KYUUBI #5686][FOLLOWUP] Rename pyhive to python

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
2024-04-09 20:30:02 +08:00