Traversing and Pattern Matching an Abstract Syntax Tree


I've built a lexer and parser with Alex and Happy which produce an abstract syntax tree of the language I'm parsing (Solidity). My problem now is how to properly traverse the tree and match certain aspects of the language. The aim is to create a rule engine which performs code analysis on the resulting AST, checking for specific issues such as improper uses of functions, dangerous calls, or the absence of certain elements.

This is the layout of my data, which Happy outputs as the AST. (This isn't the full AST, just a snapshot.)

  data SourceUnit = SourceUnit PragmaDirective
                  | ImportUnit ImportDirective
                  | ContractDef ContractDefinition
                  deriving (Show, Eq, Data, Typeable, Ord)

  -- Version Information
  data PragmaDirective = PragmaDirective PragmaName Version Int
                        deriving(Show, Eq, Data, Typeable, Ord)

  data Version = Version String 
                deriving (Show, Eq, Data, Typeable, Ord)

  data PragmaName = PragmaName Ident
                    deriving(Show, Eq, Typeable, Data, Ord)

  data PragmaValue = PragmaValue Dnum
                    deriving(Show, Eq, Data, Typeable, Ord)
  -- File imports/Contract Imports
  data ImportDirective = ImportDir String
                      | ImportMulti Identifier Identifier Identifier String 
                        deriving (Show, Eq, Data, Typeable, Ord)

  -- The definition of an actual Contract Code Block
  data ContractDefinition = Contract Identifier [InheritanceSpec] [ContractConts]
                            deriving (Show, Eq, Data, Typeable, Ord)

  data ContractConts = StateVarDec StateVarDeclaration
               | FunctionDefinition FunctionDef
               | UsingFor UsingForDec
               deriving (Show, Eq, Data, Typeable, Ord)

My current train of thought is to pass the [SourceUnit] to a function and pattern match on specific cases. For instance, the following function matches the parsed code and returns the state variable declaration it finds.

  getStateVar :: [SourceUnit] -> Maybe StateVarDeclaration
  getStateVar [SourceUnit _ , ContractDef (Contract _ _ [StateVarDec x]) ] = Just x
  getStateVar _ = Nothing

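This outputs the following, which is in part what I need. Unfortunately, a source file can contain multiple contract declarations, each with multiple state variable declarations, and the pattern above only matches a list of exactly one pragma followed by one contract whose body is a single state variable declaration. So I don't think it's possible to cover the general case by matching in this fashion.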

  Main> getStateVar $ runTest "pragma solidity ^0.5.0; contract test { address owner = msg.send;}"
  Just (StateVariableDeclaration (ElementaryTypeName (AddrType "address")) [] (Identifier "owner") [MemberAccess (IdentExpression "msg") "." (Identifier "send")])
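
To make the limitation described above concrete, here is a minimal sketch of how plain pattern matching might be extended to several contracts with several state variables each, using a list comprehension. The types are simplified, hypothetical stand-ins for the real AST (for example, `Contract` has no inheritance list here and `SourceUnit` carries a plain `String`), and `getStateVars` is an illustrative name, not part of the parser.

```haskell
-- Simplified stand-ins for the AST above (assumptions, not the real definitions).
newtype Identifier = Identifier String
  deriving (Show, Eq)

data StateVarDeclaration = StateVariableDeclaration String Identifier
  deriving (Show, Eq)

data ContractConts
  = StateVarDec StateVarDeclaration
  | FunctionDefinition String
  deriving (Show, Eq)

data ContractDefinition = Contract Identifier [ContractConts]
  deriving (Show, Eq)

data SourceUnit
  = SourceUnit String
  | ContractDef ContractDefinition
  deriving (Show, Eq)

-- Collect every state variable declaration from every contract.
-- A generator whose pattern fails to match simply skips that element,
-- so pragmas and function definitions are ignored.
getStateVars :: [SourceUnit] -> [StateVarDeclaration]
getStateVars units =
  [ decl
  | ContractDef (Contract _ members) <- units
  , StateVarDec decl <- members
  ]
```

This works for one level of nesting, but every new rule that looks deeper into the tree (expressions inside function bodies, say) would need its own hand-written descent.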

I've read a little about generic programming and "scrap your boilerplate", but I don't understand exactly how it works or what the best way to implement it would be.
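
For context, the core idea behind "scrap your boilerplate" can be sketched with nothing but `Data.Data` from base, since the AST already derives `Data`. The sketch below is an assumption-laden illustration: the types are simplified, hypothetical stand-ins for the real AST, and `everyNode` is a made-up name. As far as I can tell, the `listify` function in the syb package's `Data.Generics` provides a similar query with a predicate.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, Typeable, cast, gmapQ)
import Data.Maybe (maybeToList)

-- Simplified stand-ins for the AST above (assumptions, not the real definitions).
newtype Identifier = Identifier String
  deriving (Show, Eq, Data, Typeable)

data StateVarDeclaration = StateVariableDeclaration String Identifier
  deriving (Show, Eq, Data, Typeable)

data ContractConts
  = StateVarDec StateVarDeclaration
  | FunctionDefinition String
  deriving (Show, Eq, Data, Typeable)

data ContractDefinition = Contract Identifier [ContractConts]
  deriving (Show, Eq, Data, Typeable)

data SourceUnit
  = SourceUnit String
  | ContractDef ContractDefinition
  deriving (Show, Eq, Data, Typeable)

-- Collect every subterm of the requested type r, at any depth, in pre-order.
-- 'cast' succeeds only when the current node has type r;
-- 'gmapQ' applies the same query to each immediate child.
everyNode :: (Data a, Typeable r) => a -> [r]
everyNode x = maybeToList (cast x) ++ concat (gmapQ everyNode x)
```

The result type selects what is collected, so `everyNode units :: [StateVarDeclaration]` would gather all state variable declarations and `everyNode units :: [Identifier]` all identifiers, without writing a traversal per node type.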

The question is, am I on the right track in terms of pattern matching this way or is there a better alternative?
