Stack Overflow community,
I am currently working on a data processing pipeline where I receive data from Kafka in the following format:
{
  "id": "XYZ",
  "index": "original_data_index",
  "updated_data": {
    "id": "XYZ1",
    "index": "updated_data_index"
  }
}
My goal is to store the data in two separate Elasticsearch indices:
The original data should be stored in index1 as:
{
  "id": "XYZ",
  "index": "original_data_index"
}
The updated data should be stored in index2 as:
{
  "id": "XYZ1",
  "index": "updated_data_index"
}
I'm using Logstash in this pipeline. How can I configure it to perform this transformation and route each document to the correct index?
I would also appreciate any best practices or considerations for transforming Kafka data with Logstash before indexing it into Elasticsearch.
Thank you in advance for your help!
You can use pipeline-to-pipeline communication with the forked path pattern to process each event in two different ways. In each downstream path, use a prune filter or mutate + remove_field to drop the fields you do not want in the index that path writes to.
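As a minimal sketch, assuming Logstash 7.x or later, a Kafka broker at localhost:9092, a topic named my-topic, and Elasticsearch at localhost:9200 (all placeholders you would replace with your own values), a pipelines.yml like the following implements the forked path: one intake pipeline reads from Kafka and fans each event out to two downstream pipelines, and each downstream pipeline shapes the document before writing to its own index.

# pipelines.yml -- forked path: one intake pipeline fans out to two output pipelines
- pipeline.id: kafka-intake
  config.string: |
    input {
      kafka {
        bootstrap_servers => "localhost:9092"   # placeholder broker address
        topics => ["my-topic"]                  # placeholder topic name
        codec => json
      }
    }
    output {
      pipeline { send_to => ["original-docs", "updated-docs"] }
    }

- pipeline.id: original-docs
  config.string: |
    input { pipeline { address => "original-docs" } }
    filter {
      # keep only the original top-level fields
      mutate { remove_field => ["updated_data"] }
    }
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]      # placeholder cluster address
        index => "index1"
      }
    }

- pipeline.id: updated-docs
  config.string: |
    input { pipeline { address => "updated-docs" } }
    filter {
      # drop the original top-level values, then promote the nested ones
      mutate { remove_field => ["id", "index"] }
      mutate {
        rename => {
          "[updated_data][id]"    => "id"
          "[updated_data][index]" => "index"
        }
        remove_field => ["updated_data"]
      }
    }
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "index2"
      }
    }

Note that Logstash still adds its own fields such as @timestamp and @version to each event. If you want the stored documents to contain only id and index, replace the mutate filters in each downstream path with a prune filter using whitelist_names, which is the stricter of the two options mentioned above.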