Abstract Data Type definition in Python

254 Views Asked by At

Consider the following Abstract Data Type (using Haskell syntax):

data Expr = Literal String | Symbol String | And [Expr] | Or [Expr]

In Python, one can make use of dataclasses and inheritance to obtain a similar type construction:

@dataclass
class Expr:
    # "union"
    foo: str | None = None
    bar: list["Expr"] | None = None


@dataclass
class Literal(Expr):
    pass


@dataclass
class Symbol(Expr):
    pass


@dataclass
class And(Expr):
    pass


@dataclass
class Or(Expr):
    pass

As an interesting exercise, I am wondering whether it is possible to obtain a similar effect, but with a different definition which avoids duplication. I came up with the following theoretical notation:

# simulate Haskell's
# data Expr = Literal String | Symbol String | And [Expr] | Or [Expr]
# the following will bring names into scope
(
    make_adt(
        type_cons="Expr",
        possible_fields=[
            ("foo", str),
            ("bar", list["Expr"]),
        ],
    )
    .add_data_cons("Literal", fields=["foo"])
    .add_data_cons("Symbol", fields=["foo"])
    .add_data_cons("And", fields=["bar"])
    .add_data_cons("Or", fields=["bar"])
)

Here I am saying that there's a base type (the Expr type constructor) with 4 data constructors: Literal, Symbol, And, Or.

Each data constructor takes an additional argument (either str or list[Expr]), which is referred in the fields argument above (must be a subset of the possible_fields).

So:

  • Literal("foo"): sets the foo field for the instance
  • And([Literal("foo"), Symbol("baz")]): sets the bar field for the instance

The constraint here, as opposed to plain inheritance, is that that Literal and Symbol don't have the bar field, and similarly, And, Or, don't have the foo field. Or, to relax this a bit, we at least have to enforce that only non-null attributes are the ones defined in the fields list above.

My questions are:

  • Can something like this be implemented?
    • I'm thinking along the lines of attrs and dynamic class creation using type(...).
  • How brittle it would be?

P.S. I know it does not necessarily make sense to over-engineer this, especially in a dynamically typed language like Python, but I consider it to be an interesting exercise nonetheless.

2

There are 2 best solutions below

0
Alexandru Dinu On

Here's an initial attempt which can pave the way to further refinements.

Let's start with a simpler ADT:

data List = Null | Cons Integer List

We can support this in Python by implementing the following:

from dataclasses import dataclass, make_dataclass


def make_adt(
    type_cons: str, 
    data_cons: dict[str, list[tuple[str, type]]]
) -> None:
    """
    data <type_cons> = <data_cons>

    Nothing is returned, instead, the global namespace is updated.
    """
    base = dataclass(type(type_cons, (object,), {}))
    globals()[type_cons] = base

    for name, ts in data_cons.items():
        derived = make_dataclass(name, ts, bases=(base,))
        globals()[name] = derived

So we can use this as:

make_adt(
    type_cons="List",
    data_cons={
        "Null": [],
        "Cons": [("x", int), ("xs", "List")],
    },
)

Furthermore:

def show(xs: List) -> str:
    match xs:
        case Null():
            return "<END>"
        case Cons(head, tail):
            return f"{head}:{show(tail)}"

lst = Cons(1, Cons(2, Cons(3, Null())))
print(lst)
# Cons(x=1, xs=Cons(x=2, xs=Cons(x=3, xs=Null())))

show(lst)
# '1:2:3:<END>'

My original ADT can be defined like:

make_adt(
    type_cons="Expr",
    data_cons={
        "Literal": [('x', str)],
        "Symbol": [('x', str)],
        "And": [('xs', list["Expr"])],
        "Or": [('xs', list["Expr"])],
    },
)

x = Literal('42')
y = Symbol('<foo>')
z = And([x, y])
q = Or([x, y])

print(z)
# And(xs=[Literal(x='42'), Symbol(x='<foo>')])
4
Dillon Davis On

Your idea inspired me to take my own twist on the problem. My solution in its current state is somewhat hacky, and has its own limitations (you cannot name parameters), but I think it feels much nicer to write.

The machinery to set it up:

from typing import Callable, ParamSpec, List, get_args
from collections import namedtuple

class ADT(type):
    def __new__(metacls, name, bases, namespace):
        new_type = super().__new__(metacls, name, bases, namespace)
        for cons_name, cons_type in new_type.__annotations__.items():
            param_types, ret_type = get_args(cons_type)
            type_const = type(cons_name, (new_type, namedtuple(cons_name, [""]*len(param_types), rename=True)), {})
            type_const.__annotations__.update({cons_name: Callable[param_types, new_type]})
            setattr(new_type, cons_name, type_const)
        return new_type
    
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        namespace = {"TypeCons": Callable[ParamSpec("P"), name]}
        return namespace

Afterwards, you can use it like so:

class Expr(metaclass=ADT):
    Literal: TypeCons[str]
    Symbol:  TypeCons[str]
    And:     TypeCons[List["Expr"]]
    Or:      TypeCons[List["Expr"]]

Note the quoting of "Expr" is typical for self-referential type annotations in Python. I'm fairly certain I could hack something together to remove that limitation using reflection, but it would be way hackier than what I have above even.

Moving on, here's what it would look like in use, using Python's new match-case statements:

expr = Expr.Or([Expr.Literal("1"), Expr.Symbol("x")])

match expr:
    case Expr.Or([]):
        print("Won't match")
    case Expr.Or([Expr.Literal("1"), Expr.Symbol(sym)]):
        print("Matching with symbol:", sym)
    case _:
        print("Catch-all")

Which of course prints "Matching with symbol: x".

I had also considered using method stubs instead of type annotations, like below:

class Expr2(metaclass=ADT2):
    def Literal(foo: str):
        ...
    def Symbol(foo: str):
        ...
    def And(bar: List["Expr"]):
        ...
    def Or(bar: List["Expr"]):
        ...

-but ultimately decided on the annotation based approach (partially for being a bit cleaner, partially for the implementation challenge).