Best practices for distributed tracing in Flink


I'm considering distributed tracing in the context of Flink. I have the following questions:

  1. How do I implement tracing internally, within the Flink pipeline itself? That is, how do I propagate the tracing context between the different operators, from the sources to the sinks (In-Process Context Propagation)?

  2. How do I glue things together at the edges? That is, how do I extract the context when reading from the sources and inject it when writing to the sinks (Inter-Process Context Propagation)?

For 2 in particular, I just need to support Kafka sources and sinks. I guess the typical approach would be to use Kafka record headers for that, as described here. This also has the advantage that it does not require changes to the payload schemas, for example.
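To make that concrete, here is a minimal sketch of header-based extraction/injection with the OpenTelemetry Java API. TextMapGetter, TextMapPropagator, and GlobalOpenTelemetry are the actual OpenTelemetry classes; the KafkaTraceHeaders helper and how it gets wired into Flink are my own invention:

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.context.Context;
    import io.opentelemetry.context.propagation.TextMapGetter;
    import io.opentelemetry.context.propagation.TextMapPropagator;
    import org.apache.kafka.common.header.Header;
    import org.apache.kafka.common.header.Headers;

    /** Hypothetical helper (my own naming): trace context <-> Kafka record headers. */
    public final class KafkaTraceHeaders {

        // Adapter that lets the propagator read Kafka record headers.
        private static final TextMapGetter<Headers> GETTER = new TextMapGetter<Headers>() {
            @Override
            public Iterable<String> keys(Headers headers) {
                List<String> keys = new ArrayList<>();
                for (Header h : headers) {
                    keys.add(h.key());
                }
                return keys;
            }

            @Override
            public String get(Headers headers, String key) {
                Header h = headers == null ? null : headers.lastHeader(key);
                return h == null ? null : new String(h.value(), StandardCharsets.UTF_8);
            }
        };

        private static TextMapPropagator propagator() {
            return GlobalOpenTelemetry.getPropagators().getTextMapPropagator();
        }

        /** Source side: extract the upstream context from the consumed record's headers. */
        public static Context extract(Headers headers) {
            return propagator().extract(Context.current(), headers, GETTER);
        }

        /** Sink side: inject the given context into the outgoing record's headers. */
        public static void inject(Context context, Headers headers) {
            propagator().inject(context, headers,
                    (carrier, key, value) ->
                            carrier.add(key, value.getBytes(StandardCharsets.UTF_8)));
        }
    }

The idea would be to call extract(...) from a KafkaRecordDeserializationSchema in the source and inject(...) from a KafkaRecordSerializationSchema in the sink, so the traceparent/tracestate headers travel with the record while the payload schema stays untouched.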

More generally, are there any (Flink-specific) libraries/integrations available that facilitate the task at hand? E.g., by decorating transformations with tracing capabilities, as done here for Kafka Streams. See also this related question, or this interceptor-like wrapper, which could be another option for effectively enlarging the context for tracing purposes.
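To illustrate the kind of decoration I have in mind, a hand-rolled wrapper might look roughly like this (a sketch only; TracedMapFunction is a name I made up, not a class from any library):

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.trace.Span;
    import io.opentelemetry.api.trace.Tracer;
    import io.opentelemetry.context.Scope;
    import org.apache.flink.api.common.functions.MapFunction;

    /** Hypothetical decorator that opens a span around the wrapped map function. */
    public class TracedMapFunction<IN, OUT> implements MapFunction<IN, OUT> {

        private final String operationName;
        private final MapFunction<IN, OUT> delegate;

        public TracedMapFunction(String operationName, MapFunction<IN, OUT> delegate) {
            this.operationName = operationName;
            this.delegate = delegate;
        }

        @Override
        public OUT map(IN value) throws Exception {
            // Looked up per call for brevity; a RichFunction could cache this in open().
            Tracer tracer = GlobalOpenTelemetry.getTracer("flink-pipeline");
            Span span = tracer.spanBuilder(operationName).startSpan();
            try (Scope ignored = span.makeCurrent()) {
                return delegate.map(value);
            } finally {
                span.end();
            }
        }
    }

Usage would be something like stream.map(new TracedMapFunction<>("enrich", new EnrichFunction())). The caveat is exactly question 1: the new span only gets the right parent if the extracted context has been made current on the operator's thread, which Flink does not do by itself.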

For what it's worth, I'm mostly interested in solutions based on OpenTracing and/or OpenTelemetry.


There is 1 answer below.

Answer from Devean:
  1. Within the process

    • In a plain Java process, use a ThreadLocal to pass the tracing context along.
    • Where you cannot instrument the queue, framework, or component itself, wrap the objects that flow through it and carry the context inside the wrapper object (see the sketch after this list).
  2. Between processes

    • Pass the context in the headers of the REST, RPC, or MQ protocol (for Kafka, the record headers mentioned in the question).
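Note that a Flink pipeline runs across many threads and task managers, so a ThreadLocal alone will not survive operator boundaries; the wrapped-object variant amounts to shipping a textual form of the context (the W3C traceparent/tracestate headers) alongside each element. A sketch of such an envelope, with all names being my own:

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator;
    import io.opentelemetry.context.Context;
    import io.opentelemetry.context.propagation.TextMapGetter;

    /** Hypothetical envelope type: the payload plus its trace context in W3C text form. */
    public class Traced<T> implements Serializable {

        public Map<String, String> traceHeaders = new HashMap<>(); // traceparent / tracestate
        public T payload;

        public Traced() {} // Flink POJO rules: public fields + no-arg constructor

        /** Capture a context (e.g., the one extracted from Kafka headers at the source). */
        public static <T> Traced<T> wrap(Context context, T payload) {
            Traced<T> t = new Traced<>();
            t.payload = payload;
            W3CTraceContextPropagator.getInstance().inject(context, t.traceHeaders, Map::put);
            return t;
        }

        /** Restore the context in a downstream operator, e.g., to parent a new span. */
        public Context restore() {
            return W3CTraceContextPropagator.getInstance()
                    .extract(Context.root(), traceHeaders, MAP_GETTER);
        }

        private static final TextMapGetter<Map<String, String>> MAP_GETTER =
                new TextMapGetter<Map<String, String>>() {
                    @Override
                    public Iterable<String> keys(Map<String, String> carrier) {
                        return carrier.keySet();
                    }

                    @Override
                    public String get(Map<String, String> carrier, String key) {
                        return carrier == null ? null : carrier.get(key);
                    }
                };
    }

Each operator then consumes Traced<T> instead of T, calls restore() to parent its own span, and passes the headers along; the price is that the wrapper shows up in every operator's type signature, which is the trade-off of this approach.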

    See the trace implementation for more information.