Document Type
Conference Proceeding
Publication Date
7-13-2017
Abstract
Modern distributed systems are often treated as black boxes, which greatly limits the ability to understand their behavior at the level of detail needed to diagnose some of the most important types of performance problems. Researchers have recently found abnormal response time delays, one to two orders of magnitude longer than the average response time, that last only for short periods and cause economic loss for service providers. These very short bottlenecks are hard to detect because of their short life spans and the variety of their possible causes. In this paper, we propose milliScope (mScope), the first millisecond-granularity, software-based resource and event monitoring framework for distributed systems that achieves both performance (low overhead at high frequency) and high accuracy matching that of firmware monitoring tools. More specifically, milliScope is a fine-grained monitoring framework that coordinates multiple mScopeMonitors for event and resource monitoring in order to reconstruct the flow of each client request and profile execution performance in a distributed system. We use resource mScopeMonitors for system resource monitoring, and we develop our own event mScopeMonitors to identify execution boundaries in a lightweight, precise, and systematic way. The semantics and syntax of these monitoring logs, which arrive in arbitrary formats, are enriched by our multistage data transformation tool, mScopeDataTransformer, which unifies the diverse monitoring logs into a dynamic data warehouse, mScopeDB, for advanced analysis. We present several illustrative scenarios in which milliScope successfully diagnoses response time anomalies caused by very short bottlenecks, using a representative web application benchmark (RUBBoS).
Publication Source (Journal or Book title)
Proceedings - International Conference on Distributed Computing Systems
First Page
92
Last Page
102
Recommended Citation
Lai, C., Kimball, J., Zhu, T., Wang, Q., & Pu, C. (2017). MilliScope: A Fine-Grained Monitoring Framework for Performance Debugging of n-Tier Web Services. Proceedings - International Conference on Distributed Computing Systems, 92-102. https://doi.org/10.1109/ICDCS.2017.228