Large-Scale Trace Analysis for Microservice Anomaly Detection and Root Cause Localization
Xin Peng
Goals and objectives
Distributed tracing tracks requests as they flow between services. It is widely adopted in industry as an important means of achieving observability in microservice architectures, supporting purposes such as anomaly detection and root cause localization. However, trace analysis in industrial microservice systems is often challenging because of the huge number of traces the systems produce and the difficulty of combining traces with other types of operational data such as logs and metrics. In this talk, I will first review the background and describe the industrial practice of distributed tracing and trace analysis. I will then introduce our explorations of large-scale trace analysis for microservice anomaly detection and root cause localization.