My grammar contains expressions that consist of an identifier optionally followed by a parenthesized list of expressions.
My problem is that FParsec reports an "unintuitive" position for syntax errors that occur in a nested expression.
For example, consider this simple parser:
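
    open FParsec

    // Node is not shown in the original post; this definition is an
    // assumption inferred from the printed output below.
    type Node =
        | X
        | Y
        | Expr of Node * Node list

    // Leaf parsers; each one also skips trailing whitespace.
    let x = skipChar 'x' .>> spaces >>% Node.X
    let y = skipChar 'y' .>> spaces >>% Node.Y
    let c = skipChar ',' .>> spaces
    let left = pchar '(' .>> spaces
    let right = pchar ')' .>> spaces

    // Forward references, because expr and paramList are mutually recursive.
    let expr, exprRef = createParserForwardedToRef()
    let paramList, paramListRef = createParserForwardedToRef()

    let paramTuple = left >>. paramList .>> right
    let xOrY = choice [x ; y]
    let exprWithArgs = xOrY .>>. paramTuple |>> Node.Expr

    // attempt makes exprWithArgs backtrack to the start of the expression on failure.
    exprRef.Value <- choice [ attempt exprWithArgs ; xOrY ]
    paramListRef.Value <- sepBy1 expr c

    let parser = expr .>> eof

    let result = run parser "x(y, y, y(x, y, y(x,y)) )"
    printf "\nParsing correct:\n%O" result

    let resultWithError = run parser "x(y, y, y(x, y, y(x,z)) )"
    printf "\n\nParsing error:\n%O" resultWithError
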
The output of the program is:

    Parsing correct:
    Success: Expr (X, [Y; Y; Expr (Y, [X; Y; Expr (Y, [X; Y])])])

    Parsing error:
    Failure:
    Error in Ln: 1 Col: 2
    x(y, y, y(x, y, y(x,z)) )
     ^
    Expecting: end of input

The more deeply nested the expressions are, the harder it becomes to find the actual position of the error.
Is there a way to change my grammar (or to use or configure FParsec) so that the error is easier to locate? I would like to get an intuitive error message like this:

    Parsing correct:
    Success: Expr (X, [Y; Y; Expr (Y, [X; Y; Expr (Y, [X; Y])])])

    Parsing error:
    Failure:
    Error in Ln: 1 Col: 21
    x(y, y, y(x, y, y(x,z)) )
                        ^
    Expecting: 'x' or 'y'

Let's use `"y(x,z)"` as an example input. The problem occurs in the `expr`/`exprRef` parser. First, it attempts to parse `exprWithArgs`, which fails because of the `z`. The parser backtracks (because of `attempt`) and tries to parse `xOrY` instead, which succeeds on the `y`. But the next char is `(` instead of `eof`, so `parser` then fails at that position with an unhelpful message.
Since `exprWithArgs` starts by parsing `xOrY`, we can factor that out: parse `xOrY` once, then optionally parse `paramTuple`, and build a `Node.Expr` only when the argument list is present. This has the advantage of not backtracking, so the parser always moves forward until it can't proceed further, resulting in a clearer error message. Note that `exprWithArgs` is no longer used.
This can be written with the computation builder or, if you don't like the computation builder, with plain combinators.
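
A minimal sketch of that factoring, in both styles. This is a reconstruction under stated assumptions rather than the exact code: the `Node` type is inferred from the output printed in the question, and the names `exprViaBuilder` and `exprViaPipe` are made up for illustration. It relies only on FParsec's `parse` builder, `opt`, and `pipe2`.

```fsharp
open FParsec

// Assumed shape of Node, inferred from the output shown in the question.
type Node =
    | X
    | Y
    | Expr of Node * Node list

let x = skipChar 'x' .>> spaces >>% Node.X
let y = skipChar 'y' .>> spaces >>% Node.Y
let c = skipChar ',' .>> spaces
let left = pchar '(' .>> spaces
let right = pchar ')' .>> spaces

let expr, exprRef = createParserForwardedToRef<Node, unit>()
let paramList = sepBy1 expr c
let paramTuple = left >>. paramList .>> right
let xOrY = choice [ x; y ]

// Variant 1 (hypothetical name): computation builder.
// Parse xOrY once, then look for an optional argument list.
let exprViaBuilder =
    parse {
        let! first = xOrY
        let! args = opt paramTuple
        return
            match args with
            | Some rest -> Node.Expr (first, rest)
            | None -> first
    }

// Variant 2 (hypothetical name): the same logic with plain combinators.
let exprViaPipe =
    pipe2 xOrY (opt paramTuple) (fun first args ->
        match args with
        | Some rest -> Node.Expr (first, rest)
        | None -> first)

// Either variant can be plugged in; no attempt is needed.
exprRef.Value <- exprViaPipe

let parser = expr .>> eof

printfn "%O" (run parser "x(y, y, y(x, y, y(x,y)) )")
printfn "%O" (run parser "x(y, y, y(x, y, y(x,z)) )")
```

The design hinges on how `opt` behaves: it yields `None` only when `paramTuple` fails without consuming input. Once the opening `(` has been consumed, a failure inside the argument list propagates as is, so the error stays at the innermost position instead of being discarded by backtracking.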
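With this change, the failing input is reported at the `z` (Ln: 1 Col: 21, expecting 'x' or 'y'), which is the message requested in the question.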
There might be simpler solutions as well, but I think this does what you want.