I am doing performance testing of Kafka and need to test different large schemas. At the moment, I am working on Avro-based load testing.
Usually, when working with Kafka, you have data and generate a schema from it. In this scenario, I must test several schemas for which I have no data, so I need to generate sample Avro data from the existing schemas.
What are the possible solutions?
What I have tried:
- Creating the data manually, but it is too tedious.
- Searching for auto-generators, with no luck.
- Searching the official Kafka documentation for a possible solution.
- Writing my own generator, but I have little experience with Avro, so it seems too custom and unmaintainable to continue.
How can I generate sample data based on an existing Avro schema?
If you are comfortable with Python, the fastavro library has utilities to generate data from a schema: https://fastavro.readthedocs.io/en/latest/utils.html
As an example: