I recently tried to create a lexer, and it isn't working.
Building it fails with the error message "Can't build lexer". Here's the traceback:
ERROR: Rule 't_TIMES' defined for an unspecified token TIMES
ERROR: Rule 't_DIVIDE' defined for an unspecified token DIVIDE
Traceback (most recent call last):
  File "...\Lexer.py", line 24, in <module>
    lexer = lex.lex()
            ^^^^^^^^^
  File "...\lex.py", line 910, in lex
    raise SyntaxError("Can't build lexer")
SyntaxError: Can't build lexer
I think it's caused by my t_error() function, and I also suspect there may be a problem with the tokens I've defined. Please help me with this. I know it's probably a basic mistake, but I'm new to this, so please be nice to me.
By the way, here's the source code:
import ply.lex as lex
import ply.yacc as yacc
import sys

tokens = [
    "INT",
    "ID",
    "PLUS",
    "MINUS",
    "EOF",
]

t_INT = r"\d+"
t_ID = r"[a-zA-Z_][a-zA-Z0-9_]*"
t_PLUS = r"+"
t_MINUS = r"-"
t_TIMES = r"*"
t_DIVIDE = r"/"

def t_error(t):
    print("Illegal character '%s'" % t.lexer.lexeme, file=sys.stderr)

lexer = lex.lex()

def p_expression(p):
    """expression : INT
                  | ID
                  | expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression"""
    if len(p) == 2:
        if isinstance(p[1], int):
            p[0] = p[1]
        elif isinstance(p[1], str):
            p[0] = p[1]
    else:
        if p[2] == "+":
            p[0] = p[1] + p[3]
        elif p[2] == "-":
            p[0] = p[1] - p[3]
        elif p[2] == "*":
            p[0] = p[1] * p[3]
        elif p[2] == "/":
            p[0] = p[1] / p[3]

parser = yacc.yacc()

def test(text):
    try:
        result = parser.parse(text)
        if result:
            print(result)
        else:
            print("Empty expression")
    except yacc.YaccError:
        print("Error parsing input")

if __name__ == "__main__":
    test("123")
    test("hello")
    test("123 + 456")
    test("123 - 456")
    test("123 * 456")
    test("123 / 456")
Maybe I'm just being stupid, but I can't get it to run.
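These errors...

ERROR: Rule 't_TIMES' defined for an unspecified token TIMES
ERROR: Rule 't_DIVIDE' defined for an unspecified token DIVIDE

...seem pretty clear. You haven't defined tokens named TIMES or DIVIDE in your tokens list. You need to add "TIMES" and "DIVIDE" to it.

Once you fix those errors, you will get a new error about invalid regular expressions in your rules.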
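That's because the characters + and * are both regex metacharacters (quantifiers), so you need to escape them if you want the literal character: t_PLUS = r"\+" and t_TIMES = r"\*".

Once you fix those errors, you'll ultimately get an AttributeError from your t_error function. There doesn't appear to be a lexeme attribute on the lexer, but you can use t.value in the message instead.

Having fixed that error, you will now get an illegal-character error for the test inputs that contain spaces.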
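As a side note on the escaping step, the regex failure can be reproduced with Python's standard re module alone, without PLY. This is only a minimal sketch; the helper name compiles is made up for illustration and is not part of PLY or re.

```python
import re

def compiles(pattern):
    """Return True if `pattern` is a valid regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A bare quantifier has nothing to repeat, so it is not a valid pattern.
print(compiles(r"+"))   # False
print(compiles(r"*"))   # False

# Escaped, the same characters match themselves literally.
print(compiles(r"\+"))  # True
print(compiles(r"\*"))  # True

# "-" and "/" are not special outside a character class, so they need no escape.
print(compiles(r"-"), compiles(r"/"))  # True True

# re.escape produces the escaped form if you would rather not write it by hand.
print(re.escape("+"), re.escape("*"))  # \+ \*
```

The same check is a quick way to vet any other operator pattern before handing it to the lexer.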
You have spaces in your expressions, but you haven't accounted for this in your rules. The quick fix is to remove the spaces in your test expressions, for example test("123+456") instead of test("123 + 456").
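Having fixed that error, you'll now get a TypeError from the arithmetic in your p_expression function.

That's because you're applying arithmetic operators to string values there. You need to convert them to numbers first. The easiest solution is to replace your definition of t_INT with a function whose docstring is the pattern r"\d+" and which sets t.value = int(t.value) before returning t.

And now running the code prints a result for each of the test expressions.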
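To illustrate why the conversion matters, here is a small sketch in plain Python, independent of PLY, of what happens when the matched text stays a string. The variable names are illustrative only and do not come from the original code.

```python
# Text matched by a regex-based lexer is a string unless it is converted.
left, right = "123", "456"

# "+" on two strings concatenates rather than adds.
concatenated = left + right

# "-" is not defined for strings, so it raises TypeError.
try:
    left - right
    subtraction_error = None
except TypeError as exc:
    subtraction_error = type(exc).__name__

# Converting to int first gives real arithmetic.
total = int(left) + int(right)
difference = int(left) - int(right)

print(concatenated)        # 123456
print(subtraction_error)   # TypeError
print(total, difference)   # 579 -333
```

Note that addition on strings fails silently by producing the wrong answer, which is why doing the conversion once in the lexer is safer than converting inside each grammar action.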