How can I create an Avro schema from a Python class?


How can I transform a simple Python class like the following into an Avro schema?

class Testo(SQLModel):
    name: str
    mea: int

This is the output of Testo.schema():

{
    "title": "Testo",
    "type": "object",
    "properties": {
        "name": {
            "title": "Name",
            "type": "string"
        },
        "mea": {
            "title": "Mea",
            "type": "integer"
        }
    },
    "required": [
        "name",
        "mea"
    ]
}

From here I would like to create an Avro record. The JSON above can be converted online on konbert.com (select "JSON to AVRO Schema"), which produces the Avro schema below. (It is all valid except for the name field, which should be "Testo" instead of "Record".)

{
  "type": "record",
  "name": "Record",
  "fields": [
    {
      "name": "title",
      "type": "string"
    },
    {
      "name": "type",
      "type": "string"
    },
    {
      "name": "properties.name.title",
      "type": "string"
    },
    {
      "name": "properties.name.type",
      "type": "string"
    },
    {
      "name": "properties.mea.title",
      "type": "string"
    },
    {
      "name": "properties.mea.type",
      "type": "string"
    },
    {
      "name": "required",
      "type": {
        "type": "array",
        "items": "string"
      }
    }
  ]
}

Anyhow, if they can do it, there must be a way to do the conversion with current Python libraries. Which library can produce a valid conversion (including for complex Python models/classes)?

If you think this is the wrong approach, that opinion is also welcome, provided it points out a better way to do this translation.

There are 2 best solutions below

Jimothy On BEST ANSWER

It looks like there are a few libraries that aim to provide this kind of functionality:

  1. py-avro-schema has support for generic Python classes
  2. dataclasses-avroschema has support for dataclasses, pydantic models, and faust records
  3. pydantic-avro requires your Python class to inherit from pydantic.BaseModel
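
Independently of these libraries, the mapping for a flat model like Testo is small enough to sketch by hand. The snippet below is only an illustration, not how any of the libraries above work: it takes the JSON Schema dict printed by Testo.schema() and builds an Avro record from it. The function name json_schema_to_avro and the type table are my own assumptions (in particular, mapping "integer" to Avro "long" rather than "int"), and nested objects, arrays, and enums are not handled.

```python
import json

# Assumed mapping from JSON Schema primitive types to Avro primitive types.
# "integer" -> "long" is a choice made here, since Python ints are unbounded.
JSON_TO_AVRO = {
    "string": "string",
    "integer": "long",
    "number": "double",
    "boolean": "boolean",
}


def json_schema_to_avro(schema: dict) -> dict:
    """Convert a flat JSON Schema object into an Avro record schema (hypothetical helper)."""
    required = set(schema.get("required", []))
    fields = []
    for name, prop in schema["properties"].items():
        avro_type = JSON_TO_AVRO[prop["type"]]  # KeyError for unsupported types
        if name in required:
            fields.append({"name": name, "type": avro_type})
        else:
            # Optional fields become a nullable union with a null default.
            fields.append({"name": name, "type": ["null", avro_type], "default": None})
    return {"type": "record", "name": schema["title"], "fields": fields}


# The Testo.schema() output from the question.
testo_json_schema = {
    "title": "Testo",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "mea": {"title": "Mea", "type": "integer"},
    },
    "required": ["name", "mea"],
}

avro_schema = json_schema_to_avro(testo_json_schema)
print(json.dumps(avro_schema, indent=2))
```

The result is a record named "Testo" with the two fields name and mea, which is what the question is after, rather than a schema describing the JSON Schema document itself.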
0
feder On

I didn't find a Python library that does this, so I wrote it myself.

I loop over all the fields and translate their types one by one, recursing wherever a field refers to another model class.

For example, here is the start of the method:

fields = []
for field in model_class.__fields__.values():
    if issubclass(field.type_, SQLModel):
        # Recursively generate the schema for nested models
        fields.append({"name": field.name, "type": self.create_field_array(self, field.type_)})
    elif field.type_ is str:
        fields.append({"name": field.name, "type": "string"})

... etc

For details on this method, check out our fa-models repository for trading on GitHub. The class can translate regular Python classes as well as pydantic, SQLAlchemy, and SQLModel classes. Help with adding test cases to the library is welcome; we accept pull requests.
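
To make the recursive idea concrete without pulling in SQLModel, here is a minimal, self-contained sketch using standard-library dataclasses. It is not the fa-models code: the names PRIMITIVES, avro_type, and to_avro_record and the example classes Inner and Outer are made up for illustration, and only a handful of primitive types are covered.

```python
import dataclasses
from dataclasses import dataclass

# Assumed mapping from Python primitives to Avro primitive type names.
PRIMITIVES = {
    str: "string",
    int: "long",
    float: "double",
    bool: "boolean",
    bytes: "bytes",
}


def avro_type(py_type):
    """Translate one field type, recursing into nested dataclasses."""
    if py_type in PRIMITIVES:
        return PRIMITIVES[py_type]
    if dataclasses.is_dataclass(py_type):
        # A class reference in a field becomes a nested record.
        return to_avro_record(py_type)
    raise TypeError(f"unsupported field type: {py_type!r}")


def to_avro_record(cls):
    """Build an Avro record schema (as a dict) from a dataclass."""
    fields = [
        {"name": f.name, "type": avro_type(f.type)}
        for f in dataclasses.fields(cls)
    ]
    return {"type": "record", "name": cls.__name__, "fields": fields}


@dataclass
class Inner:
    label: str


@dataclass
class Outer:
    name: str
    mea: int
    inner: Inner


outer_schema = to_avro_record(Outer)
```

One limitation to be aware of: if the same nested class appears in two fields, this sketch emits its record definition twice, which Avro does not allow. A fuller version would track the names already emitted and refer to them by name on later uses.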