Best practices for distributed tracing in Flink


I'm considering distributed tracing in the context of Flink. I have the following questions:

  1. How do I implement tracing internally, within the Flink pipeline itself? That is, how do I propagate the tracing context between the different operators, from the sources to the sinks (In-Process Context Propagation)?

  2. How do I glue things together at the edges? That is, how do I extract the context when reading from the sources and inject it when writing to the sinks (Inter-Process Context Propagation)?

For 2 in particular, I just need to support Kafka sources and sinks. I guess the typical approach would be to use Kafka record headers for that, as described here. This also has the advantage that it does not require changes to the payload schemas, for example.
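To make that concrete, here is a minimal sketch of header-based extraction/injection with the OpenTelemetry Java API. TextMapGetter, TextMapPropagator, and GlobalOpenTelemetry are the actual OpenTelemetry classes; the KafkaTraceHeaders helper and how it gets wired into Flink are my own invention:

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.context.Context;
    import io.opentelemetry.context.propagation.TextMapGetter;
    import io.opentelemetry.context.propagation.TextMapPropagator;
    import org.apache.kafka.common.header.Header;
    import org.apache.kafka.common.header.Headers;

    /** Hypothetical helper (my own naming): trace context <-> Kafka record headers. */
    public final class KafkaTraceHeaders {

        // Adapter that lets the propagator read Kafka record headers.
        private static final TextMapGetter<Headers> GETTER = new TextMapGetter<Headers>() {
            @Override
            public Iterable<String> keys(Headers headers) {
                List<String> keys = new ArrayList<>();
                for (Header h : headers) {
                    keys.add(h.key());
                }
                return keys;
            }

            @Override
            public String get(Headers headers, String key) {
                Header h = headers == null ? null : headers.lastHeader(key);
                return h == null ? null : new String(h.value(), StandardCharsets.UTF_8);
            }
        };

        private static TextMapPropagator propagator() {
            return GlobalOpenTelemetry.getPropagators().getTextMapPropagator();
        }

        /** Source side: extract the upstream context from the consumed record's headers. */
        public static Context extract(Headers headers) {
            return propagator().extract(Context.current(), headers, GETTER);
        }

        /** Sink side: inject the given context into the outgoing record's headers. */
        public static void inject(Context context, Headers headers) {
            propagator().inject(context, headers,
                    (carrier, key, value) ->
                            carrier.add(key, value.getBytes(StandardCharsets.UTF_8)));
        }
    }

The idea would be to call extract(...) from a KafkaRecordDeserializationSchema in the source and inject(...) from a KafkaRecordSerializationSchema in the sink, so the traceparent/tracestate headers travel with the record while the payload schema stays untouched.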

More generally, are there any (Flink-specific) libraries/integrations available that facilitate the task at hand? E.g., by decorating transformations with tracing capabilities, as done here for Kafka Streams. See also this related question, or this interceptor-like wrapper, which could be another option for effectively enlarging the context for tracing purposes.
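To illustrate the kind of decoration I have in mind, a hand-rolled wrapper might look roughly like this (a sketch only; TracedMapFunction is a name I made up, not a class from any library):

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.trace.Span;
    import io.opentelemetry.api.trace.Tracer;
    import io.opentelemetry.context.Scope;
    import org.apache.flink.api.common.functions.MapFunction;

    /** Hypothetical decorator that opens a span around the wrapped map function. */
    public class TracedMapFunction<IN, OUT> implements MapFunction<IN, OUT> {

        private final String operationName;
        private final MapFunction<IN, OUT> delegate;

        public TracedMapFunction(String operationName, MapFunction<IN, OUT> delegate) {
            this.operationName = operationName;
            this.delegate = delegate;
        }

        @Override
        public OUT map(IN value) throws Exception {
            // Looked up per call for brevity; a RichFunction could cache this in open().
            Tracer tracer = GlobalOpenTelemetry.getTracer("flink-pipeline");
            Span span = tracer.spanBuilder(operationName).startSpan();
            try (Scope ignored = span.makeCurrent()) {
                return delegate.map(value);
            } finally {
                span.end();
            }
        }
    }

Usage would be something like stream.map(new TracedMapFunction<>("enrich", new EnrichFunction())). The caveat is exactly question 1: the new span only gets the right parent if the extracted context has been made current on the operator's thread, which Flink does not do by itself.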

For what it's worth, I'm mostly interested in solutions based on OpenTracing and/or OpenTelemetry.


There is 1 answer below.

Answer from Devean:
  1. Within the process

    • In a plain Java process, use a ThreadLocal to pass the tracing context along.
    • Where you cannot instrument the queue, framework, or component itself, wrap the objects that flow through it and carry the context inside the wrapper object (see the sketch after this list).
  2. Between processes

    • Pass the context in the headers of the REST, RPC, or MQ protocol (for Kafka, the record headers mentioned in the question).
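Note that a Flink pipeline runs across many threads and task managers, so a ThreadLocal alone will not survive operator boundaries; the wrapped-object variant amounts to shipping a textual form of the context (the W3C traceparent/tracestate headers) alongside each element. A sketch of such an envelope, with all names being my own:

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator;
    import io.opentelemetry.context.Context;
    import io.opentelemetry.context.propagation.TextMapGetter;

    /** Hypothetical envelope type: the payload plus its trace context in W3C text form. */
    public class Traced<T> implements Serializable {

        public Map<String, String> traceHeaders = new HashMap<>(); // traceparent / tracestate
        public T payload;

        public Traced() {} // Flink POJO rules: public fields + no-arg constructor

        /** Capture a context (e.g., the one extracted from Kafka headers at the source). */
        public static <T> Traced<T> wrap(Context context, T payload) {
            Traced<T> t = new Traced<>();
            t.payload = payload;
            W3CTraceContextPropagator.getInstance().inject(context, t.traceHeaders, Map::put);
            return t;
        }

        /** Restore the context in a downstream operator, e.g., to parent a new span. */
        public Context restore() {
            return W3CTraceContextPropagator.getInstance()
                    .extract(Context.root(), traceHeaders, MAP_GETTER);
        }

        private static final TextMapGetter<Map<String, String>> MAP_GETTER =
                new TextMapGetter<Map<String, String>>() {
                    @Override
                    public Iterable<String> keys(Map<String, String> carrier) {
                        return carrier.keySet();
                    }

                    @Override
                    public String get(Map<String, String> carrier, String key) {
                        return carrier == null ? null : carrier.get(key);
                    }
                };
    }

Each operator then consumes Traced<T> instead of T, calls restore() to parent its own span, and passes the headers along; the price is that the wrapper shows up in every operator's type signature, which is the trade-off of this approach.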

    See the trace implementation for more information.