Instrumenting For Distributed Tracing

Understand distributed tracing with OpenTelemetry, including docker-compose and a Jaeger visualization.

Traces track the progression of a single activity in an application. For example, an activity can be a user making a request to our application. If a trace only tracks the progression of that activity within a single process, or within a single component of a system composed of many components, its value is limited. However, if a trace can be propagated across multiple components in a system, it becomes much more useful. Traces that can propagate across components in a system are called distributed traces. Distributed tracing, and the correlation of activities it enables, is a powerful tool for determining causality within a complex system.

A trace is composed of spans, which represent units of work within an application. Each trace and each span can be uniquely identified, and each span carries a context that ties it to its trace (in OpenTelemetry, the trace ID, span ID, trace flags, and trace state). Request, Error, and Duration metrics can be derived from span data. A trace contains a tree of spans with a single root span. For example, imagine a user clicking on the checkout button on our company's commerce site. The root span would encompass the entire request/response cycle as perceived by the user clicking on the checkout button. There would likely be many child spans for that single root span, such as a query for product data, charging a credit card, and updating a database. Perhaps there would also be an error associated with one of the underlying spans within that root span. Each span has metadata associated with it, such as a name, start and end timestamps, events, and status. By creating a tree of spans with this metadata, we are able to deeply inspect the state of complex applications.
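The checkout example above can be sketched as a small data structure. This is a simplified, hypothetical model built on the standard library only. The `Span` type and its methods are assumptions made for illustration and are not the OpenTelemetry API. It shows how child spans share the root's trace ID and point back to their parent.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Span is a hypothetical, simplified unit of work within a trace.
type Span struct {
	TraceID  string // shared by every span in the same trace
	SpanID   string // unique per span
	ParentID string // empty for the root span
	Name     string
	Start    time.Time
	End      time.Time
	Err      error // non-nil when the unit of work failed
	Children []*Span
}

// newID returns n random bytes, hex encoded.
func newID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// NewRoot starts a new trace and returns its root span.
func NewRoot(name string) *Span {
	return &Span{TraceID: newID(16), SpanID: newID(8), Name: name, Start: time.Now()}
}

// StartChild starts a span in the same trace, parented to s.
func (s *Span) StartChild(name string) *Span {
	c := &Span{
		TraceID:  s.TraceID,
		SpanID:   newID(8),
		ParentID: s.SpanID,
		Name:     name,
		Start:    time.Now(),
	}
	s.Children = append(s.Children, c)
	return c
}

// Finish records the end timestamp and the outcome of the span.
func (s *Span) Finish(err error) {
	s.End = time.Now()
	s.Err = err
}

// Print writes the span tree, indenting each level of children.
func (s *Span) Print(depth int) {
	status := "ok"
	if s.Err != nil {
		status = "error: " + s.Err.Error()
	}
	fmt.Printf("%s%s (%s) [%s]\n", strings.Repeat("  ", depth), s.Name, s.End.Sub(s.Start), status)
	for _, c := range s.Children {
		c.Print(depth + 1)
	}
}

func main() {
	root := NewRoot("checkout")

	query := root.StartChild("query product data")
	query.Finish(nil)

	charge := root.StartChild("charge credit card")
	charge.Finish(errors.New("card declined"))

	update := root.StartChild("update database")
	update.Finish(nil)

	root.Finish(nil)
	root.Print(0)
}
```

Running the sketch prints the root span followed by its three indented children, with the failed credit card charge marked as an error. A real tracing library records the same kind of tree, along with events and richer status information.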

In this lesson, we will learn to instrument Go applications with OpenTelemetry to emit distributed tracing telemetry, which we will inspect using Jaeger, an open-source tool for visualizing and querying distributed traces.

The life cycle of a distributed trace

Before we get into the code, let's first discuss how distributed tracing works. Let's imagine we have two services, A and B. ...