A Study of Long-Tail Latency in n-Tier Systems: RPC vs. Asynchronous Invocations

Document Type

Conference Proceeding

Publication Date

7-13-2017

Abstract

Long-tail latency of web-facing applications continues to be a serious problem. Most previously published research addresses two classes of long-latency problems: uneven workloads, such as web search, and resource saturation in single nodes. We describe an experimental study of a third class of long-tail latency problems that are specific to distributed systems: Cross-Tier Queue Overflow (CTQO), caused by a combination of millibottlenecks (with sub-second duration) and tightly coupled servers in n-tier systems (e.g., Apache, Tomcat, and MySQL) that use RPC-style request-response communications. Our experiments show that the appearance of millibottlenecks (e.g., created by short workload bursts) in one server often causes another server in the synchronous invocation chain, one with no saturated resources of its own, to fill up its queues (CTQO) and drop packets, producing queries with very long response times. CTQO can be reduced or avoided by replacing the server that drops packets with an asynchronous server. In synchronous n-tier system experiments, long-tail latency due to CTQO can be reproduced consistently at utilization as low as 43%. In contrast, when all n-tier servers are replaced by asynchronous versions, CTQO and the consequent dropped packets remain absent at utilization levels as high as 83%, despite the same millibottlenecks.
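
The queue-overflow mechanism described in the abstract can be illustrated with a toy discrete-time model. This is a minimal sketch under stated assumptions, not the paper's experimental setup: the function name simulate, the thread and backlog limits, the arrival rate, and the stall window are all hypothetical values chosen for illustration. The upstream server either holds one thread per outstanding request (synchronous, RPC-style) or imposes no per-request limit (asynchronous), while the downstream server stalls briefly to mimic a millibottleneck.

```python
def simulate(max_in_flight, backlog_limit, ticks=1000, arrivals_per_tick=4,
             downstream_capacity=10, stall=(300, 400)):
    """Count requests dropped by the upstream server in a toy two-tier model.

    max_in_flight: upstream threads blocked on downstream replies
                   (None models an asynchronous server with no such limit).
    backlog_limit: size of the upstream accept queue (hypothetical value).
    stall:         [start, end) ticks during which downstream serves nothing.
    """
    in_flight = 0   # requests handed to downstream, each holding a thread
    backlog = 0     # requests waiting in the upstream accept queue
    dropped = 0     # requests rejected because the accept queue was full

    for t in range(ticks):
        # Downstream completes work unless it is inside the stall window.
        if not (stall[0] <= t < stall[1]):
            in_flight -= min(in_flight, downstream_capacity)

        # Freed threads pick up requests waiting in the accept queue.
        if max_in_flight is None:
            admitted = backlog
        else:
            admitted = min(backlog, max_in_flight - in_flight)
        in_flight += admitted
        backlog -= admitted

        # New arrivals take a thread, else wait in the queue, else are dropped.
        for _ in range(arrivals_per_tick):
            if max_in_flight is None or in_flight < max_in_flight:
                in_flight += 1
            elif backlog < backlog_limit:
                backlog += 1
            else:
                dropped += 1
    return dropped


if __name__ == "__main__":
    # Average load is 4/10 of downstream capacity in both runs.
    print("synchronous drops: ", simulate(max_in_flight=150, backlog_limit=128))
    print("asynchronous drops:", simulate(max_in_flight=None, backlog_limit=128))
```

With these illustrative parameters the upstream server itself is never the bottleneck, yet in the synchronous configuration its threads and accept queue fill during the downstream stall and it begins dropping requests; the asynchronous configuration absorbs the same stall without drops. The model ignores timeouts, TCP retransmission, and memory limits, so it shows only the direction of the effect, not its magnitude.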

Publication Source (Journal or Book title)

Proceedings - International Conference on Distributed Computing Systems

First Page

207

Last Page

217
