I’m a bit new to ANTLR4 (and parsing in general), and I’m wondering if I’ve bitten off a little more than I can chew.
I’m trying to parse this line into a series of tokens:
This is some text @1 @2 @/searchstring/
Looks simple enough, but the problem is that I need to separate them into:
This is some text
@1
@2
@/searchstring/
I’ve come up with this for the grammar:
grammar Callout;
callout : phrase parameterlist ;
phrase : ~'@'+? ;
parameterlist : param (WHITESPACE+ param)* ;
param : numericparam | searchparam ;
numericparam : '@' DIGITS ;
DIGITS : [0-9]+ ;
searchparam : '@/' SEARCH '/' ;
SEARCH : ~[/]+ ;
WHITESPACE : [\t ] ;
I try it out and it seems to match the whole string without the @ symbol (I imagine that’s because the longest match seems to be the whole string, though I’m not sure what happened to the @ characters).

Maybe. By looking at your grammar, I think it'd be good to start by learning/understanding the basics.
The part ~'@'+ in the rule phrase : ~'@'+? ; does not do what you think it does. Using literal tokens (like '@') inside parser rules will cause ANTLR to create tokens (lexer rules) for you. Your rule phrase : ~'@'+? ; is translated into the following:
phrase : ~T__0+? ;
T__0 : '@' ;
And the ~ inside a parser rule does not cause characters to be excluded, but tokens to be excluded. In other words: ~T__0 does not mean "match any character other than an @", but rather "match any token other than the T__0 token". Best not to use these literal tokens inside parser rules unless you know what you're doing (which you don't ;))
Also, ANTLR's lexer rules will consume as many characters as possible, and the rule that matches the most characters will "win". The lexer runs independently of the parser: ANTLR will not try to match a different token just because the parser expects a certain token at that point. So the rule SEARCH : ~[/]+ ; is too greedy: it will consume the input This is some text @1 @2 @ into a single token.
Try something like this instead:
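A minimal sketch of one way to do that, assuming whitespace between parameters can be discarded and the phrase never contains an @. The rule names NUMERIC_PARAM, SEARCH_PARAM and TEXT are illustrative choices, not names taken from the question. The idea is to let the lexer recognize each parameter as a whole token, so that no single greedy rule can swallow the entire line:

```antlr
grammar Callout;

// Parser rules refer only to named tokens, so ANTLR creates no implicit T__n tokens.
callout       : phrase parameterlist EOF ;
phrase        : TEXT ;
parameterlist : param+ ;
param         : NUMERIC_PARAM | SEARCH_PARAM ;

// Each parameter is a single token that must start with '@'.
NUMERIC_PARAM : '@' [0-9]+ ;
SEARCH_PARAM  : '@/' ~[/\r\n]+ '/' ;

// WS is listed before TEXT on purpose: when both match the same number of
// characters (e.g. the blank between "@1" and "@2"), the rule defined first wins.
WS            : [ \t\r\n]+ -> skip ;

// Everything up to the next '@' or line break.
TEXT          : ~[@\r\n]+ ;
```

With the input This is some text @1 @2 @/searchstring/, the lexer should produce TEXT, NUMERIC_PARAM, NUMERIC_PARAM and SEARCH_PARAM. Note that the TEXT token keeps the trailing space before the first @, so you may want to trim it in your own code.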