[KYUUBI #7106] Make response.results.columns optional

### Why are the changes needed?
Bugfix. Spark 3.5 returns `None` for `response.results.columns` when a statement produces no rows, whereas Spark 3.3 returned an empty `TColumn` list.

The response is handled here: https://github.com/apache/kyuubi/blob/master/python/pyhive/hive.py#L507

For a statement that returns no rows (mine was an `add jar s3://a/b/c.jar`), here are the responses I received.

Spark 3.3:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=[TColumn(boolVal=None, byteVal=None, i16Val=None, i32Val=None, i64Val=None, doubleVal=None, stringVal=TStringColumn(values=[], nulls=b'\x00'), binaryVal=None)], binaryColumns=None, columnCount=None))
```

Spark 3.5:
```
TFetchResultsResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=False, results=TRowSet(startRowOffset=0, rows=[], columns=None, binaryColumns=None, columnCount=None))
```
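The Spark 3.5 response above is what breaks the old code path: `zip()` over `columns=None` raises immediately. A minimal reproduction, using hypothetical stand-ins for the Thrift response fields:

```python
# Before this patch, pyhive passed response.results.columns straight into
# zip(); with Spark 3.5 that value is None, and zip() raises TypeError
# as soon as it is constructed.
schema = [("col_a", "string")]   # stand-in for self.description
columns = None                   # what Spark 3.5 returns for a row-less statement

try:
    paired = list(zip(columns, schema))  # old code path
except TypeError:
    paired = None
    print("zip() over columns=None raises TypeError")
```

With Spark 3.3 the same expression simply yields an empty pairing, which is why the bug only surfaced on upgrade.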

### How was this patch tested?
I tested by applying the patch locally and running my query against Spark 3.5. I was not able to get any unit tests running, sorry!

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #7107 from fbertsch/spark_3_5_fix.

Closes #7106

13d1440a8 [Frank Bertsch] Make response.results.columns optional

Authored-by: Frank Bertsch <fbertsch@netflix.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
Commit b49ed02f16 (parent e769f42398)
Frank Bertsch 2025-06-23 23:28:18 +08:00, committed by Cheng Pan

```diff
@@ -508,14 +508,17 @@ class Cursor(common.DBAPICursor):
         _check_status(response)
         schema = self.description
         assert not response.results.rows, 'expected data in columnar format'
-        columns = [_unwrap_column(col, col_schema[1]) for col, col_schema in
-                   zip(response.results.columns, schema)]
-        new_data = list(zip(*columns))
-        self._data += new_data
+        has_new_data = False
+        if response.results.columns:
+            columns = [_unwrap_column(col, col_schema[1]) for col, col_schema in
+                       zip(response.results.columns, schema)]
+            new_data = list(zip(*columns))
+            self._data += new_data
+            has_new_data = (True if new_data else False)
         # response.hasMoreRows seems to always be False, so we instead check the number of rows
         # https://github.com/apache/hive/blob/release-1.2.1/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java#L678
         # if not response.hasMoreRows:
-        if not new_data:
+        if not has_new_data:
             self._state = self._STATE_FINISHED

     def poll(self, get_progress_update=True):
```