A Study of Response Time Instability of Microservices at High Resource Utilization in the Cloud
Document Type
Conference Proceeding
Publication Date
1-1-2024
Abstract
Maintaining consistently low response times is a critical performance requirement for mission-critical, web-facing applications (e.g., e-commerce) that typically use microservices architectures. Through extensive benchmarking of a microservices application in a cloud environment, we demonstrate that its response time stability is fragile, with significant variations (ranging from milliseconds to seconds) when the CPU utilization of component servers reaches moderate to high levels (e.g., 60%). Our detailed timeline analysis identifies a leading cause of response time instability at these utilization levels: a ripple effect caused by the long chain of dependencies inherent in microservices applications. In such an architecture, a millibottleneck (a bottleneck lasting sub-seconds) can trigger a queuing effect from a downstream server that propagates to upstream servers, resulting in dropped requests and TCP retransmissions lasting several seconds at the weakest link in the chain. We demonstrate two common factors that contribute to millibottlenecks in microservices runtime environments: interference from collocated containers and bursty workloads, both of which are prevalent in real-world cloud environments.
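The ripple effect described in the abstract can be illustrated with a toy model. The following is a minimal sketch and not the authors' benchmark or analysis method: it assumes a chain of tiers with bounded queues, advanced in discrete ticks, where only the most downstream tier stalls for a short window. The function name `simulate` and every parameter value (queue capacity, arrival and service rates, stall window) are hypothetical and chosen only for illustration. The sketch counts dropped requests at the front tier and does not model the TCP retransmission delay that follows a drop.

```python
def simulate(ticks=1000, tiers=3, capacity=50, arrivals_per_tick=6,
             service_per_tick=10, stall_start=200, stall_len=300):
    """Toy chain of bounded queues (all parameters are illustrative assumptions).

    queues[0] is the front (most upstream) tier; queues[-1] is the most
    downstream tier, which processes nothing during the stall window.
    Returns (dropped_requests, peak_queue_length_per_tier).
    """
    queues = [0] * tiers
    peak = [0] * tiers
    dropped = 0
    for t in range(ticks):
        stalled = stall_start <= t < stall_start + stall_len
        # Drain from the most downstream tier backwards so that slots freed
        # downstream are visible to the tier above within the same tick.
        for i in reversed(range(tiers)):
            is_last = i == tiers - 1
            budget = 0 if (stalled and is_last) else service_per_tick
            if is_last:
                moved = min(budget, queues[i])      # requests complete
            else:
                room = capacity - queues[i + 1]     # blocked if downstream is full
                moved = min(budget, queues[i], room)
                queues[i + 1] += moved
            queues[i] -= moved
        # New requests arrive at the front tier; overflow is dropped.
        accepted = min(arrivals_per_tick, capacity - queues[0])
        dropped += arrivals_per_tick - accepted
        queues[0] += accepted
        for i in range(tiers):
            peak[i] = max(peak[i], queues[i])
    return dropped, peak


if __name__ == "__main__":
    # Arrival rate is 60% of service capacity in both runs.
    print("no stall:  ", simulate(stall_len=0))
    print("with stall:", simulate())
```

Under these assumed parameters, the run without a stall keeps every queue short and drops nothing, while the run with a sub-second stall at the last tier fills each queue in turn from downstream to upstream until the front tier starts dropping requests. In a real deployment each such drop would then incur a retransmission timeout, which is how a sub-second stall can turn into multi-second response times.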
Publication Source (Journal or Book title)
Proceedings - 2024 IEEE 6th International Conference on Cognitive Machine Intelligence, CogMI 2024
First Page
111
Last Page
116
Recommended Citation
Wang, Q., Gu, X., & Pu, C. (2024). A Study of Response Time Instability of Microservices at High Resource Utilization in the Cloud. Proceedings - 2024 IEEE 6th International Conference on Cognitive Machine Intelligence, CogMI 2024, 111-116. https://doi.org/10.1109/CogMI62246.2024.00024