Federated Africa and Middle East Conference on Software Engineering 7-8 June, 2022 Egypt-Uganda

Large-Scale Trace Analysis for Microservice Anomaly Detection and Root Cause Localization

08 Jun 2022
10:30 - 11:15
Online

Xin Peng

Goals and objectives

Distributed tracing tracks requests as they flow between services. It is widely adopted in industry as an important means of achieving observability in microservice architectures, supporting tasks such as anomaly detection and root cause localization. However, trace analysis in industrial microservice systems is often challenging because of the huge number of traces these systems produce and the difficulty of combining traces with other types of operational data such as logs and metrics. In this talk, I will first review the background and describe the industrial practice of distributed tracing and trace analysis. Then I will introduce our explorations of large-scale trace analysis for microservice anomaly detection and root cause localization.