Doctor of Philosophy (PhD)


Division of Computer Science and Engineering

Document Type



An important feature of cloud computing is scalability, which allows web service providers to dynamically add or remove computing resources for their applications as demand changes. The ability to efficiently scale a cloud application to match its workload is critical, since the application must meet stringent Service-Level Objectives (SLOs), such as bounded response time, while handling naturally bursty workloads. Although various approaches have been proposed to elastically scale multiple hardware resources and build efficient autoscaling solutions for bursty workloads, they rarely discuss how to reallocate soft resources (e.g., threads or database connections) to match the changed hardware resource allocation after the system scales. Soft resources are key system components that control the concurrency level of a server and facilitate the sharing of hardware resources; they play an important role in hardware resource efficiency and system performance.

In this dissertation, we focus on designing and optimizing soft resource adaptation to complement existing hardware-only resource scaling approaches and mitigate the performance degradation caused by the imbalance between hardware and soft resources in modern cloud systems, including traditional n-tier systems, large-scale microservices-based systems, and emerging data streaming IoT systems. We first present a Concurrency-aware system Scaling (ConScale) framework, which uses a novel Scatter-Concurrency-Throughput (SCT) model to determine the optimal soft resource allocations of key servers and quickly applies them during the scaling phase of traditional n-tier systems. To address the stringent latency requirements of user-facing large-scale microservices, we present Sora, a latency-sensitive approach that uses a Scatter-Concurrency-Goodput (SCG) model to quickly adjust critical microservices to their optimal concurrency settings. We also design an intelligent fast concurrency adaptation framework, μConAdapter, that leverages reinforcement learning to quickly and adaptively adjust soft resource allocations, complementing hardware-only resource management to reduce SLO violations. Finally, we observe that evolving stateful objects in persistent data streaming IoT applications can lead to performance degradation when stream processing engines adopt region-based garbage collectors (GCs) such as G1, Shenandoah, and ZGC. We present the BoarderGuard framework, which combines effective online detection with mitigation based on the adaptation of concurrent GC threads, helping system administrators detect the stateful object evolution problem early and mitigate it quickly. We hope our study helps the community design efficient resource management that achieves good performance and high resource efficiency across various cloud systems in data centers.



Committee Chair

Wang, Qingyang

Available for download on Thursday, April 01, 2027