PEG rule fails to reach EOI


I am trying to use a PEG grammar to parse a file. My grammar is:

WHITESPACE = _{" "}
level = {ASCII_DIGIT*}
verb = {ASCII_ALPHA{,4}}
value = {ASCII_ALPHANUMERIC*}
structure = { level ~ verb ~ value }
file = { SOI ~ (structure? ~ NEWLINE)* ~ EOI }

I parse this text:

0 HEAD
1 VERB test
2 STOP

The file rule parses the text successfully only if there is an extra \n at the end of the text. If I remove the \n, parsing fails with 'expected EOI'. I understand that this happens because of my rule for file: every line, including the last one, must end with NEWLINE. I tried different rules for file and got an infinite loop, so I don't know how to solve this. I am using Rust and the latest pest.


There are 3 best solutions below

2
Caesar On

I changed the rules to

level = {ASCII_DIGIT+}
verb = {ASCII_ALPHA{1,4}}
file = { SOI ~ (structure? ~ NEWLINE)* ~ structure? ~ EOI }

and that seemed to work just fine, regardless of the trailing newline. But maybe I overlooked something. If you could edit your question to show the rules and input that caused an infinite loop with this, that'd be great.
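To see why the extra `structure?` before EOI matters, here is a rough sketch in plain Rust (standard library only, not pest itself) that hand-mimics the original `file` rule and the revised one. The function names (`structure`, `lines`, `original_file`, `revised_file`) are made up for illustration, and the handling of implicit whitespace is only an approximation of what pest does.

```rust
/// Counts leading chars satisfying `pred`, up to `max` (all matches are ASCII, so chars == bytes).
fn take_while(s: &str, max: usize, pred: fn(char) -> bool) -> usize {
    s.chars().take(max).take_while(|c| pred(*c)).count()
}

/// Approximates the implicit WHITESPACE = _{ " " } skipping.
fn skip_spaces(s: &str) -> &str {
    s.trim_start_matches(' ')
}

/// structure = { level ~ verb ~ value } with level = ASCII_DIGIT+ and verb = ASCII_ALPHA{1,4}.
/// Returns the unconsumed rest on success.
fn structure(s: &str) -> Option<&str> {
    let n = take_while(s, usize::MAX, |c| c.is_ascii_digit());
    if n == 0 {
        return None;
    }
    let s = skip_spaces(&s[n..]);
    let n = take_while(s, 4, |c| c.is_ascii_alphabetic());
    if n == 0 {
        return None;
    }
    let s = skip_spaces(&s[n..]);
    let n = take_while(s, usize::MAX, |c| c.is_ascii_alphanumeric());
    Some(&s[n..])
}

/// NEWLINE: "\r\n", "\n" or "\r".
fn newline(s: &str) -> Option<&str> {
    s.strip_prefix("\r\n")
        .or_else(|| s.strip_prefix('\n'))
        .or_else(|| s.strip_prefix('\r'))
}

/// (structure? ~ NEWLINE)*: consumes newline-terminated lines only.
fn lines(mut s: &str) -> &str {
    loop {
        let line = skip_spaces(s);
        let after = skip_spaces(structure(line).unwrap_or(line));
        match newline(after) {
            Some(rest) => s = rest, // every iteration consumes a newline, so the loop terminates
            None => return s,       // no NEWLINE: backtrack to the start of this line
        }
    }
}

/// Original rule: SOI ~ (structure? ~ NEWLINE)* ~ EOI
fn original_file(s: &str) -> bool {
    skip_spaces(lines(s)).is_empty()
}

/// Revised rule: SOI ~ (structure? ~ NEWLINE)* ~ structure? ~ EOI
fn revised_file(s: &str) -> bool {
    let rest = skip_spaces(lines(s));
    skip_spaces(structure(rest).unwrap_or(rest)).is_empty()
}

fn main() {
    let text = "0 HEAD\n1 VERB test\n2 STOP";
    let with_newline = format!("{text}\n");
    assert!(!original_file(text)); // last line has no NEWLINE, so EOI is not reached
    assert!(original_file(&with_newline));
    assert!(revised_file(text));
    assert!(revised_file(&with_newline));
}
```

In the original rule the last line `2 STOP` can only be consumed together with a NEWLINE, so without one the repetition stops before it and EOI fails. The trailing `structure?` gives that last line somewhere to go.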

2
Maria On
WHITESPACE = _{ " " }
level = {ASCII_DIGIT+}
verb = {ASCII_ALPHA{,4}}
value = {ASCII_ALPHANUMERIC*}
stop = { level ~ "STOP" }
structure = { level ~ verb ~ value }
line = {structure | stop}
file = { SOI ~ (line ~ NEWLINE?)* ~ EOI }

Checked on https://pest.rs/

0
Josh Voigts On

This seems to work. It can also handle an arbitrary number of newlines at the beginning or end:

file = { SOI ~ NEWLINE* ~ structure ~ (NEWLINE ~ structure)* ~ NEWLINE* ~ EOI }
WHITESPACE = _{" "}
level = {ASCII_DIGIT+}
verb = {ASCII_ALPHA{1,4}}
value = {ASCII_ALPHANUMERIC*}
structure = { level ~ verb ~ value }