I am using jparsec to parse strings like:
[1,2, 3]
[ 3, 4]
[3 ,4,56, 7 ]
[]
I have implemented a few classes (implementing my Token interface) to represent the tokens:
final class OpenListToken
final class CommaToken
final class CloseListToken
final class NumberToken // Has a public final property "value" that contains the int
I have also implemented tokenizers for each:
static final Parser<OpenListToken> openListTokenParser
static final Parser<CommaToken> commaTokenParser
static final Parser<CloseListToken> closeListTokenParser
static final Parser<NumberToken> numberTokenParser
These all work at a character level. For example:
final NumberToken t = numberTokenParser.parse("123");
// t.value == 123
final OpenListToken u = openListTokenParser.parse("[");
// Succeeds
Now I would like to combine them to make a parser of ListExpression, which is a class that represents a list of numbers. I have tried something like:
openListTokenParser
.next(numberTokenParser.sepBy(commaTokenParser))
.followedBy(closeListTokenParser)
This works for strings like [1,2,3] but obviously not for strings like [ 1, 2 ].
Is there an operator that takes some parsers and puts whitespace* between them?
Or is it possible to make my ListExpression parser work on a stream of my Token interface instances instead of characters?
You can build a tokenizer directly using the functions from the Terminals class. In your case, this would look like the following.
First, define the set of terminals, e.g. operators, keywords, words...
The tokens are then produced either by these terminals or by the IntegerLiteral tokenizer.
The final result comes from a syntactic parser for integers (built from tokens tagged as INTEGER), separated by the comma token, between the bracket tokens. Any whitespace between tokens is ignored (this is the second argument to from).
Et voilà.
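To make the two-phase idea concrete, here is a minimal, library-free sketch in plain Java. It deliberately does not use jparsec, and every name in it (ListParserSketch, Kind, Tok, tokenize, parseList, expect) is hypothetical. It only mirrors the steps described above: a tokenizer that turns characters into tokens and drops whitespace between them, and a second parser that works on the token stream and so never has to mention whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Library-free illustration of "tokenize first, then parse tokens".
// All names here are hypothetical; none of this is jparsec API.
final class ListParserSketch {

    // Token kinds, loosely mirroring the token classes from the question.
    enum Kind { OPEN, CLOSE, COMMA, NUMBER }

    // A token is its kind plus, for NUMBER tokens, the integer value.
    static final class Tok {
        final Kind kind;
        final int value;
        Tok(Kind kind, int value) { this.kind = kind; this.value = value; }
    }

    // Phase 1: characters -> tokens. Whitespace between tokens is skipped
    // here, so the grammar in phase 2 does not need to handle it.
    static List<Tok> tokenize(String src) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '[') {
                out.add(new Tok(Kind.OPEN, 0)); i++;
            } else if (c == ']') {
                out.add(new Tok(Kind.CLOSE, 0)); i++;
            } else if (c == ',') {
                out.add(new Tok(Kind.COMMA, 0)); i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add(new Tok(Kind.NUMBER, Integer.parseInt(src.substring(start, i))));
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at index " + i);
            }
        }
        return out;
    }

    // Checks that the token at pos has the given kind and returns pos + 1.
    static int expect(List<Tok> toks, int pos, Kind kind) {
        if (pos >= toks.size() || toks.get(pos).kind != kind) {
            throw new IllegalArgumentException("Expected " + kind + " at token " + pos);
        }
        return pos + 1;
    }

    // Phase 2: tokens -> values, for the grammar '[' (NUMBER (',' NUMBER)*)? ']'.
    static List<Integer> parseList(String src) {
        List<Tok> toks = tokenize(src);
        List<Integer> result = new ArrayList<>();
        int pos = expect(toks, 0, Kind.OPEN);
        if (pos < toks.size() && toks.get(pos).kind == Kind.NUMBER) {
            result.add(toks.get(pos++).value);
            while (pos < toks.size() && toks.get(pos).kind == Kind.COMMA) {
                pos = expect(toks, pos + 1, Kind.NUMBER); // now one past the number
                result.add(toks.get(pos - 1).value);
            }
        }
        pos = expect(toks, pos, Kind.CLOSE);
        if (pos != toks.size()) {
            throw new IllegalArgumentException("Trailing tokens after ']'");
        }
        return result;
    }
}
```

With this sketch, ListParserSketch.parseList("[3 ,4,56, 7 ]") yields the list 3, 4, 56, 7, and "[]" yields an empty list. In jparsec the same split is what the tokenizer and the whitespace argument to from give you, without writing the loops by hand.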