[KYUUBI #7106] Make response.results.columns optional

### Why are the changes needed?
Bugfix. Spark 3.5 returns `None` for `response.results.columns` when a statement produces no rows, whereas Spark 3.3 returned an empty `TColumn` list.

The response is handled here: https://github.com/apache/kyuubi/blob/master/python/pyhive/hive.py#L507

For a statement that returns no rows (mine was an `add jar s3://a/b/c.jar`), here are the responses I received.

Spark 3.3:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=[TColumn(boolVal=None, byteVal=None, i16Val=None, i32Val=None, i64Val=None, doubleVal=None, stringVal=TStringColumn(values=[], nulls=b'\x00'), binaryVal=None)], binaryColumns=None, columnCount=None))
```

Spark 3.5:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=None, binaryColumns=None, columnCount=None))
```
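The Spark 3.5 response above is what breaks the old code path: `zip()` over `columns=None` raises immediately. A minimal reproduction, using hypothetical stand-ins for the Thrift response fields:

```python
# Before this patch, pyhive passed response.results.columns straight into
# zip(); with Spark 3.5 that value is None, and zip() raises TypeError
# as soon as it is constructed.
schema = [("col_a", "string")]   # stand-in for self.description
columns = None                   # what Spark 3.5 returns for a row-less statement

try:
    paired = list(zip(columns, schema))  # old code path
except TypeError:
    paired = None
    print("zip() over columns=None raises TypeError")
```

With Spark 3.3 the same expression simply yields an empty pairing, which is why the bug only surfaced on upgrade.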

### How was this patch tested?
I tested by applying the patch locally and running my query against Spark 3.5. I was not able to get any unit tests running, sorry!

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #7107 from fbertsch/spark_3_5_fix.

Closes #7106

13d1440a8 [Frank Bertsch] Make response.results.columns optional

Authored-by: Frank Bertsch <fbertsch@netflix.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
Commit b49ed02f16 (parent e769f42398)
Frank Bertsch 2025-06-23 23:28:18 +08:00, committed by Cheng Pan

```diff
@@ -508,14 +508,17 @@ class Cursor(common.DBAPICursor):
         _check_status(response)
         schema = self.description
         assert not response.results.rows, 'expected data in columnar format'
-        columns = [_unwrap_column(col, col_schema[1]) for col, col_schema in
-                   zip(response.results.columns, schema)]
-        new_data = list(zip(*columns))
-        self._data += new_data
+        has_new_data = False
+        if response.results.columns:
+            columns = [_unwrap_column(col, col_schema[1]) for col, col_schema in
+                       zip(response.results.columns, schema)]
+            new_data = list(zip(*columns))
+            self._data += new_data
+            has_new_data = (True if new_data else False)
         # response.hasMoreRows seems to always be False, so we instead check the number of rows
         # https://github.com/apache/hive/blob/release-1.2.1/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L678
         # if not response.hasMoreRows:
-        if not new_data:
+        if not has_new_data:
             self._state = self._STATE_FINISHED

     def poll(self, get_progress_update=True):
```