[KYUUBI #6720] K8s pod OOM Killed should be identified as Application failed state

# 🔍 Description
## Issue References 🔗

This pull request fixes #6720

## Describe Your Solution 🔧

If pod goes into OOMKilled state, application should be marked as KILLED, which is eventually identified as isFailed

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Tested locally, was able to launch new session
<img width="922" alt="kyuubi_new_session" src="https://github.com/user-attachments/assets/b003c86f-484d-40c5-b173-847374a45b1d">

---

**Be nice. Be informative.**

Closes #6721 from Madhukar525722/OOM.

Closes #6720

cd0bdf633 [madlnu] [KYUUBI #6720] K8s pod OOM Killed should be identified as Application failed state

Authored-by: madlnu <madlnu@visa.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
This commit is contained in:
madlnu 2024-10-02 19:12:43 +08:00 committed by Cheng Pan
parent 372f770526
commit 2d64255874
No known key found for this signature in database
GPG Key ID: 8001952629BCC75D

View File

@ -34,7 +34,7 @@ import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.config.KyuubiConf.{KubernetesApplicationStateSource, KubernetesCleanupDriverPodStrategy}
import org.apache.kyuubi.config.KyuubiConf.KubernetesApplicationStateSource.KubernetesApplicationStateSource
import org.apache.kyuubi.config.KyuubiConf.KubernetesCleanupDriverPodStrategy.{ALL, COMPLETED, NONE}
import org.apache.kyuubi.engine.ApplicationState.{isTerminated, ApplicationState, FAILED, FINISHED, NOT_FOUND, PENDING, RUNNING, UNKNOWN}
import org.apache.kyuubi.engine.ApplicationState.{isTerminated, ApplicationState, FAILED, FINISHED, KILLED, NOT_FOUND, PENDING, RUNNING, UNKNOWN}
import org.apache.kyuubi.operation.OperationState
import org.apache.kyuubi.server.KyuubiServer
import org.apache.kyuubi.session.KyuubiSessionManager
@ -535,6 +535,7 @@ object KubernetesApplicationOperation extends Logging {
case "Running" => RUNNING
case "Succeeded" => FINISHED
case "Failed" | "Error" => FAILED
case "OOMKilled" => KILLED
case "Unknown" => UNKNOWN
case _ =>
warn(s"The spark driver pod state: $podState is not supported, " +