Kafka streaming service with pull model needs improvement


I'm trying to improve, or even rewrite, an existing system design, but I have some open questions. The goal is to capture data changes (CDC) from a main DB, perform some JOIN operations, and store the results in a separate DB (currently KSQL). Meanwhile, a downstream service also listens to the CDC events and queries KSQL for the latest JOIN result.

I'm using Kafka to store the data, and you might have already noticed that there could be a race condition between the KSQL JOIN and the downstream service, but let me include a diagram first.

[Diagram: image not available]

Because the JOIN can take 5 to 8 seconds to complete, the application has to issue pull queries with retries until the query result becomes available in KSQL.
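To make the retry step concrete, here is a minimal sketch of the polling loop, not the actual application code. The names `pull_with_retry` and `fetch` are hypothetical: `fetch` stands in for whatever function issues the pull query against KSQL and returns `None` while the joined row is not yet available. The timeout and backoff values are assumptions chosen to cover the 5 to 8 second JOIN delay.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def pull_with_retry(
    fetch: Callable[[], Optional[T]],  # hypothetical: wraps the pull query
    timeout_s: float = 10.0,           # assumed budget, above the 5-8 s JOIN delay
    initial_delay_s: float = 0.2,
    max_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[T]:
    """Call fetch() until it returns a non-None result or the timeout elapses."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        result = fetch()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            return None  # gave up: the JOIN result never showed up in time
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

Note that this loop only detects whether a row exists. If an older JOIN result is already stored, `fetch` would also have to compare some version or timestamp against the CDC event to tell a stale row from a fresh one, which is exactly where the race condition shows up.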

My question is: even if we replace KSQL with another DB, such as PostgreSQL with Kafka sink connectors, the delay will still be there because of the JOIN in the current design, right? The application server always wants the latest JOIN result.

I feel this is not a good design and want to hear suggestions. Thank you.
