I would like to parse a predicate such as: "3 > 2" or "MyVar = 0".
Ideally, I would use a small GADT to represent the predicate:
data Expr a where
I :: Int -> Expr Int
B :: Bool -> Expr Bool
Var :: String -> Expr Int
Add :: Expr Int -> Expr Int -> Expr Int
Eq :: Eq a => Expr a -> Expr a -> Expr Bool
Mt :: Eq a => Expr a -> Expr a -> Expr Bool
The expression 3 > 2 would parse as Mt (I 3) (I 2).
I tried to approach the problem with Parsec.
However, the module Text.Parsec.Expr deals only with expressions, with type a -> a -> a.
Any suggestions?
Parsing directly into a GADT is actually kind of tricky. In my experience, it's usually better to parse into an untyped ADT first (where the
a -> a -> atypes are a natural fit), and then separately "type check" the ADT by transforming it into the desired GADT. The main disadvantage is that you have to define two parallel types for the untyped and typed abstract syntax trees. (You can technically get around this with some type level tricks, but it's not worth it for a small language.) However, the resulting design is easier to work with and generally more flexible.In other words, I'd suggest using Parsec to parse into the untyped ADT:
and then write a type checker:
Actually, you won't be able to write a
tclike that. Instead you'll need to break it up into mutually recursive type checkers for different expression types:and you'll probably want to run them in a
Readermonad that provides a list of valid variables. (Type checking normally involves checking the types of variabels. In your case, you only have integer variables, but it still makes sense to ensure the variables are defined at the type checking stage.)If you get stuck, a full solution follows...
SPOILERS
.
.
.
As I say, I'd first write a Parsec parser for an untyped
UExprADT. Note that the Text.Parsec.Expr machinery works fine forUExpr -> UExpr -> UExproperators:Then, I'd write a type checker, probably something like the following. Note that for type checking, we only need to verify that the variable names exist; we don't need their values. However, I've used a single
Ctxtype for both type checking and evaluation.You may be surprised to learn that the most natural way to write a type checker doesn't actually "use" the GADT. It could have been equally easily written using two separate types for boolean and integer expressions. You would have found the same thing if you'd actually tried to parse directly into the GADT. The parser code would need to be pretty cleanly divided between a parser for boolean expressions of type
Parser (Expr Bool)and a parser for integer expressions of typeParser (Expr Int), and there'd be no straightforward way to write a singleParser (Expr a).Really, the advantage of the GADT representation only comes at the evaluation stage where you can write a simple, type-safe evaluator that triggers no "non-exhaustive pattern" warnings, like so: