[KYUUBI #6463] Release semaphore immediately after startup process exit
# 🔍 Description
## Issue References 🔗

The concurrency limit for the engine startup process is mainly used to avoid overloading the machine (or container) of the Kyuubi server. The current implementation holds `startupProcessSemaphore` until the session is established successfully. For Spark on YARN cluster mode, however, insufficient resources in one YARN queue may block subsequent Spark application submissions to other queues, significantly affecting the Kyuubi server's resource utilization.

## Describe Your Solution 🔧

We should release the `startupProcessSemaphore` immediately after the engine startup process exits (i.e., after the `spark-submit` process exits), as the load has already disappeared.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

I tested it on a cluster of 50 Kyuubi servers, and Kyuubi server resource utilization increased by 70%.

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6463 from ic4y/master-p003.

Closes #6463

f7de68ce3 [ic4y] Improve code quality
d8b0248df [ic4y] [Improve][EngineRef] Optimize Engine Startup Concurrency Limit

Authored-by: ic4y <ic4y@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
parent 7011d90246
commit 95ed74821c
@@ -239,7 +239,10 @@ private[kyuubi] class EngineRef(
       while (engineRef.isEmpty) {
         if (exitValue.isEmpty && process.waitFor(1, TimeUnit.SECONDS)) {
           exitValue = Some(process.exitValue())
-          if (!exitValue.contains(0)) {
+          if (exitValue.contains(0)) {
+            acquiredPermit = false
+            startupProcessSemaphore.foreach(_.release())
+          } else {
             val error = builder.getError
             MetricsSystem.tracing { ms =>
               ms.incCount(MetricRegistry.name(ENGINE_FAIL, appUser))
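The permit handling in this hunk can be illustrated in isolation. The following is a minimal sketch, not the actual `EngineRef` code: `StartupPermitSketch`, `waitForEngine`, `pollExit`, `engineReady`, and `maxPolls` are hypothetical names introduced here, and the startup process and engine lookup are modeled as plain functions so the example stays self-contained. It assumes the surrounding code releases the permit in a `finally` block whenever `acquiredPermit` is still true, which the hunk itself does not show.

```scala
import java.util.concurrent.Semaphore

object StartupPermitSketch {

  /**
   * Hypothetical stand-in for the engine wait loop.
   *
   * @param semaphore   limits how many startup processes run concurrently
   * @param pollExit    returns Some(exitCode) once the startup process has exited
   * @param engineReady returns true once the engine can be found
   * @param maxPolls    upper bound on loop iterations (stands in for a timeout)
   * @return true if the engine became ready, false on failure or timeout
   */
  def waitForEngine(
      semaphore: Semaphore,
      pollExit: () => Option[Int],
      engineReady: () => Boolean,
      maxPolls: Int): Boolean = {
    semaphore.acquire()
    var acquiredPermit = true
    var exitValue: Option[Int] = None
    var failed = false
    var ready = false
    var polls = 0
    try {
      while (!failed && !ready && polls < maxPolls) {
        polls += 1
        if (exitValue.isEmpty) {
          exitValue = pollExit()
          if (exitValue.contains(0)) {
            // The process exited cleanly, so its load on this host is gone:
            // give the permit back now instead of waiting for the engine.
            acquiredPermit = false
            semaphore.release()
          } else if (exitValue.nonEmpty) {
            // Non-zero exit: startup failed, stop waiting.
            failed = true
          }
        }
        if (!failed) ready = engineReady()
      }
      ready
    } finally {
      // Release only if the early release above did not already happen,
      // so the permit is never returned twice.
      if (acquiredPermit) semaphore.release()
    }
  }
}
```

The `acquiredPermit` flag is what prevents a double release: once the permit is returned early, the cleanup path skips it. On a failed or timed-out startup the flag is still true, so the permit is returned on the way out.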