I am writing a parser in Python with ANTLR4.
In short, the input line is:
concept foo bar
The grammar rules that parse the above input line are:
start_rule: 'concept' identifier ;
identifier: ID+ ;
To get the column number and line number of all IDs in the input line, I am adding code in the enterIdentifier(self, ctx) function:
def enterIdentifier(self, ctx):
    context = ctx.start                # first token matched by the rule
    line_number = context.line
    column_number = context.column
The above snippet returns the column number of the first ID only, i.e. foo. If multiple IDs are present on the same line (belonging to the same rule), i.e. foo bar, how can I get the column number for both IDs?
In the enter callback you will not have the entire (sub) parse tree ready for evaluation. Instead use the exit variant or (better yet) do a post-parse stage (often called the semantic phase) to extract this kind of information.

Once you have a complete (sub) tree, you can access the ID member in the identifier context, which is an array if more than one occurrence of a specific rule or token can appear. Iterate over that to get the individual child elements.

To get the column information for an ID, use the fact that for lexer tokens the getChild() call returns a TerminalNode instance (you have to cast the result to that). From there call getSymbol(), which gives you a Token instance, which in turn has all the info for a specific token such as text, channel, column, line, type etc.
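
For the Python target, the steps above might look roughly like the sketch below. It assumes the rule is named identifier (so the generated callback is exitIdentifier) and that ctx.ID() called without an argument returns the list of TerminalNode objects; no cast is needed in Python. The class name IdentifierPositionCollector is made up for illustration, and in a real project it would subclass the listener class that ANTLR generates for your grammar.

```python
class IdentifierPositionCollector:
    """Listener-style sketch (hypothetical name).

    In a real project this would subclass the generated listener; that
    base class is omitted because its name depends on the grammar.
    """

    def __init__(self):
        # One (text, line, column) tuple per ID token seen so far.
        self.positions = []

    def exitIdentifier(self, ctx):
        # Use the exit callback so the identifier subtree is complete.
        # ctx.ID() with no argument yields every ID terminal node in the rule.
        for node in ctx.ID():
            token = node.getSymbol()  # the underlying Token
            # line is 1-based, column is 0-based in ANTLR tokens.
            self.positions.append((token.text, token.line, token.column))
```

After walking the parse tree with this listener (for example via ParseTreeWalker from the antlr4 runtime), positions should hold one entry per ID, so foo and bar are each reported with their own column.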