What Is the RED Method?

The RED method is a monitoring methodology for request-driven microservices that focuses on three key metrics: Rate (requests per second), Errors (failed requests per second), and Duration (the distribution of request latency). Created by Tom Wilkie, the RED method narrows the golden signals into a practical framework designed specifically for services that handle requests in cloud-native architectures.

Why the RED Method Matters

While the four golden signals provide a comprehensive framework, not every signal is equally relevant for every service. The RED method focuses specifically on the metrics that matter most for request-driven services like APIs, web servers, and microservices. By tracking just three metrics per service, teams can quickly assess health, detect regressions, and identify bottlenecks without drowning in unnecessary data.

Teams that adopt the RED method get the same three-metric view of every service, which reduces manual effort during triage and helps improve the reliability and scalability of their infrastructure. As cloud-native adoption accelerates, familiarity with the RED method has become a core competency for DevOps engineers, platform teams, and site reliability engineers working in production Kubernetes and cloud environments.

How the RED Method Works

For each service, instrument and track three metrics. Rate is the number of requests per second. Errors is the number of failed requests per second. Duration is how long requests take, usually reported at the 50th, 95th, and 99th percentiles (p50, p95, and p99). These metrics can be collected using Prometheus with standard HTTP middleware instrumentation, or through service mesh telemetry from Istio or Linkerd without application code changes.
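The three measurements can be sketched in a few lines of plain Python. This is a minimal illustration only, not how Prometheus or a service mesh implements collection: the `RedWindow` class, its method names, and the sample traffic are all hypothetical, and a fixed observation window is assumed.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RedWindow:
    """RED metrics for one service over a fixed window (hypothetical helper)."""
    window_seconds: float
    durations: list = field(default_factory=list)  # one latency per request, in seconds
    errors: int = 0

    def observe(self, duration_seconds: float, failed: bool) -> None:
        """Record one finished request."""
        self.durations.append(duration_seconds)
        if failed:
            self.errors += 1

    def rate(self) -> float:
        """Rate: requests per second over the window."""
        return len(self.durations) / self.window_seconds

    def error_rate(self) -> float:
        """Errors: failed requests per second over the window."""
        return self.errors / self.window_seconds

    def percentile(self, p: float) -> float:
        """Duration: nearest-rank percentile of the recorded latencies."""
        if not self.durations:
            raise ValueError("no requests recorded")
        ordered = sorted(self.durations)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]


window = RedWindow(window_seconds=10.0)
for i in range(100):
    # Assumed sample traffic: 100 requests in 10 s, every 20th one fails,
    # latencies spread evenly from 10 ms to 1 s.
    window.observe(duration_seconds=(i + 1) / 100, failed=(i % 20 == 0))

print("rate (req/s):  ", window.rate())
print("errors (err/s):", window.error_rate())
print("p50 (s):       ", window.percentile(50))
print("p99 (s):       ", window.percentile(99))
```

In practice the raw latencies would not be kept in a list; metrics systems typically store them in histogram buckets and estimate percentiles at query time.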

Understanding how the RED method fits into the broader cloud-native ecosystem is important for making informed architecture decisions. It works alongside other tools and practices in the DevOps and platform engineering space, and choosing the right combination depends on your team's specific requirements, scale, and operational maturity.

Key Features

Rate

Requests per second shows the demand on the service, helping teams understand traffic patterns and detect anomalies.

Errors

Failed requests per second, expressed either as a count or as a percentage of all requests, reveal reliability problems quickly.

Duration

Latency percentiles show how fast the service responds, with p99 capturing the slowest 1% of requests, which is close to the worst-case user experience.
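A small worked example shows why percentiles are preferred over an average for Duration. The sketch below assumes an invented sample of 100 latencies and uses the nearest-rank definition of a percentile; `nearest_rank` is a hypothetical helper, and real monitoring systems usually estimate percentiles from histograms instead.

```python
import math
import statistics


def nearest_rank(samples, p):
    """Return the p-th percentile of samples using the nearest-rank method."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Assumed latencies in milliseconds: 98 fast requests and 2 very slow ones.
latencies_ms = [20] * 98 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = nearest_rank(latencies_ms, 50)
p95_ms = nearest_rank(latencies_ms, 95)
p99_ms = nearest_rank(latencies_ms, 99)

print("mean:", mean_ms)  # 59.6, which matches no real request
print("p50: ", p50_ms)   # 20, the typical request
print("p95: ", p95_ms)   # 20, still fast
print("p99: ", p99_ms)   # 2000, the slow tail that some users hit
```

The mean blends the two populations into a value no user experienced, while p50 and p99 together describe both the typical request and the slow tail.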

Simplicity

Three metrics per service create a manageable monitoring framework that scales across dozens of microservices.

Common Use Cases

Creating standardized Grafana dashboards that show RED metrics for every microservice in the architecture.

Using RED metrics as the basis for SLO definitions that track request success rate and latency targets.

Detecting performance regressions after deployments by comparing RED metrics before and after the release.

Building alerting rules that fire when error rate exceeds a threshold or p99 latency spikes above acceptable levels.
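The alerting and release-comparison use cases above can be sketched as simple checks over a per-service snapshot. Everything here is illustrative: the `RedSnapshot` type, the `alerts` and `regressed` functions, the 1% error-ratio and 0.5 s p99 thresholds, and the 1.2x tolerance are assumptions chosen for the example, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedSnapshot:
    """RED summary for one service over one time window (hypothetical type)."""
    requests: int
    errors: int
    p99_seconds: float

    @property
    def error_ratio(self) -> float:
        """Fraction of requests that failed; 0.0 when there was no traffic."""
        return self.errors / self.requests if self.requests else 0.0


def alerts(snapshot, max_error_ratio=0.01, max_p99_seconds=0.5):
    """Return the names of the assumed thresholds that the snapshot exceeds."""
    fired = []
    if snapshot.error_ratio > max_error_ratio:
        fired.append("error_ratio")
    if snapshot.p99_seconds > max_p99_seconds:
        fired.append("p99_latency")
    return fired


def regressed(before, after, tolerance=1.2):
    """True when error ratio or p99 grew by more than the tolerance factor."""
    return (
        after.error_ratio > before.error_ratio * tolerance
        or after.p99_seconds > before.p99_seconds * tolerance
    )


# Assumed numbers for the same service before and after a release.
before = RedSnapshot(requests=10_000, errors=20, p99_seconds=0.30)
after = RedSnapshot(requests=10_000, errors=150, p99_seconds=0.32)

print(alerts(before))             # []
print(alerts(after))              # ['error_ratio']
print(regressed(before, after))   # True
```

Because every service exposes the same three numbers, the same checks can be applied to each one without service-specific logic.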

How Obsium Helps

Obsium's managed observability team helps organizations implement and optimize the RED method as part of production-grade infrastructure. Whether you are adopting the RED method for the first time or looking to improve an existing implementation, our engineers bring hands-on experience across cloud platforms and Kubernetes environments. Learn more about our managed observability services →
