I am new to this and I am trying to understand what is going on here, where I get two shift reduce conflicts. I have grammar (If I am missing something I can add the needed rules):
%start translation_unit
%token EOF 0 "end of file"
%token LBRA "["
%token RBRA "]"
%token LPAR "("
%token RPAR ")"
%token DOT "."
%token ARROW "->"
%token INCDEC "++ or --"
%token COMMA ","
%token AMP "&"
%token STAR "*"
%token ADDOP "+ or -"
%token EMPH "!"
%token DIVOP "/ or %"
%token CMPO "<, >, <=, or >="
%token CMPE "== or !="
%token DAMP "&&"
%token DVERT "||"
%token ASGN "="
%token CASS "*=, /=, %=, +=, or -="
%token SEMIC ";"
%token LCUR "{"
%token RCUR "}"
%token COLON ":"
%token TYPEDEF "typedef"
%token VOID "void"
%token ETYPE "_Bool, char, or int"
%token STRUCT "struct"
%token ENUM "enum"
%token CONST "const"
%token IF "if"
%token ELSE "else"
%token DO "do"
%token WHILE "while"
%token FOR "for"
%token GOTO "goto"
%token CONTINUE "continue"
%token BREAK "break"
%token RETURN "return"
%token SIZEOF "sizeof"
%token<string> IDF "identifier"
%token<string> TYPEIDF "type identifier"
%token<int> INTLIT "integer literal"
%token<string> STRLIT "string literal"
/////////////////////////////////
%%
/////////////////////////////////
primary_expression:
IDF
| INTLIT
| STRLIT
| LPAR expression RPAR
;
postfix_expression:
primary_expression
| postfix_expression LBRA expression RBRA
| postfix_expression LPAR argument_expression_list_opt RPAR
| postfix_expression DOT IDF
| postfix_expression ARROW IDF
| postfix_expression INCDEC
;
argument_expression_list_opt:
%empty
| argument_expression_list
;
argument_expression_list:
assignment_expression
| argument_expression_list COMMA assignment_expression
;
unary_expression:
postfix_expression
| INCDEC unary_expression
| unary_operator cast_expression
| SIZEOF LPAR type_name RPAR
;
unary_operator:
AMP
| STAR
| ADDOP
| EMPH
;
cast_expression:
unary_expression
;
multiplicative_expression:
cast_expression
| multiplicative_expression STAR cast_expression
| multiplicative_expression DIVOP cast_expression
;
additive_expression:
multiplicative_expression
| additive_expression ADDOP multiplicative_expression
;
relational_expression:
additive_expression
| relational_expression CMPO additive_expression
;
equality_expression:
relational_expression
| equality_expression CMPE relational_expression
;
logical_AND_expression:
equality_expression
| logical_AND_expression DAMP equality_expression
;
logical_OR_expression:
logical_AND_expression
| logical_OR_expression DVERT logical_AND_expression
;
assignment_expression:
logical_OR_expression
| unary_expression assignment_operator assignment_expression
;
assignment_operator:
ASGN
| CASS
;
expression:
assignment_expression
;
constant_expression:
logical_OR_expression
;
declaration:
no_leading_attribute_declaration
;
no_leading_attribute_declaration:
declaration_specifiers init_declarator_list_opt SEMIC
;
init_declarator_list_opt:
%empty
| init_declarator_list
;
declaration_specifiers:
declaration_specifier
| declaration_specifier declaration_specifiers
;
declaration_specifier:
storage_class_specifier
| type_specifier_qualifier
;
init_declarator_list:
init_declarator
| init_declarator_list COMMA init_declarator
;
init_declarator:
declarator
;
storage_class_specifier:
TYPEDEF
;
type_specifier:
VOID
| ETYPE
| struct_or_union_specifier
| enum_specifier
| typedef_name
;
struct_or_union_specifier:
struct_or_union IDF LCUR member_declaration_list RCUR
| struct_or_union IDF
;
struct_or_union:
STRUCT
;
member_declaration_list:
member_declaration
| member_declaration_list member_declaration
;
member_declaration:
specifier_qualifier_list member_declarator_list_opt SEMIC
;
member_declarator_list_opt:
%empty
| member_declarator_list
;
specifier_qualifier_list:
type_specifier_qualifier
| type_specifier_qualifier specifier_qualifier_list
;
type_specifier_qualifier:
type_specifier
| type_qualifier
;
member_declarator_list:
member_declarator
| member_declarator_list COMMA member_declarator
;
member_declarator:
declarator
;
enum_specifier:
ENUM IDF LCUR enumerator_list RCUR
| ENUM IDF LCUR enumerator_list COMMA RCUR
| ENUM IDF
;
enumerator_list:
enumerator
| enumerator_list COMMA enumerator
;
enumerator:
ENUM
| ENUM ASGN constant_expression
;
type_qualifier:
CONST
;
pointer:
STAR type_qualifier_list_opt
| STAR type_qualifier_list_opt pointer
;
pointer_opt:
%empty
| pointer
;
declarator:
pointer_opt direct_declarator
;
direct_declarator:
IDF
| LPAR declarator RPAR
| array_declarator
| function_declarator
;
abstract_declarator:
pointer
| pointer_opt direct_abstract_declarator
;
direct_abstract_declarator:
LPAR abstract_declarator RPAR
| array_abstract_declarator
| function_abstract_declarator
;
direct_abstract_declarator_opt:
%empty
| direct_abstract_declarator
;
array_abstract_declarator:
direct_abstract_declarator_opt LBRA assignment_expression RBRA
;
function_abstract_declarator:
direct_abstract_declarator_opt LPAR parameter_type_list RPAR
;
array_declarator:
direct_declarator LBRA assignment_expression RBRA
;
function_declarator:
direct_declarator LPAR parameter_type_list RPAR
;
type_qualifier_list_opt:
%empty
| type_qualifier_list
;
type_qualifier_list:
type_qualifier
| type_qualifier_list type_qualifier
;
parameter_type_list:
parameter_list
;
parameter_list:
parameter_declaration
| parameter_list COMMA parameter_declaration
;
parameter_declaration:
declaration_specifiers declarator
| declaration_specifiers abstract_declarator_opt
;
abstract_declarator_opt:
%empty
| abstract_declarator
;
type_name:
specifier_qualifier_list abstract_declarator_opt
;
typedef_name:
TYPEIDF
;
statement:
matched
| unmatched
| other
;
other:
expression_statement
| compound_statement
| jump_statement
;
matched:
selection_statement_c
| iteration_statement_c
;
unmatched:
selection_statement_o
| iteration_statement_o
;
compound_statement:
LCUR block_item_list_opt RCUR
;
block_item_list_opt:
%empty
| block_item_list
;
block_item_list:
block_item
| block_item_list block_item
;
block_item:
declaration
| statement
;
expression_statement:
expression_opt SEMIC
;
expression_opt:
%empty
| expression
;
selection_statement_o:
IF LPAR expression RPAR statement
| IF LPAR expression RPAR matched ELSE unmatched
;
selection_statement_c:
IF LPAR expression RPAR matched ELSE matched
;
iteration_statement_o:
WHILE LPAR expression RPAR unmatched
| FOR LPAR expression_opt SEMIC expression_opt SEMIC expression_opt RPAR unmatched
;
iteration_statement_c:
WHILE LPAR expression RPAR matched
| DO statement WHILE LPAR expression RPAR SEMIC
| FOR LPAR expression_opt SEMIC expression_opt SEMIC expression_opt RPAR matched
;
jump_statement:
RETURN expression_opt SEMIC
;
translation_unit:
external_declaration
| translation_unit external_declaration
;
external_declaration:
function_definition
| declaration
;
function_definition:
declaration_specifiers declarator compound_statement
;
/////////////////////////////////
%%
And I am getting this shift/reduce conflicts:
State 156
85 declarator: pointer_opt . direct_declarator
91 abstract_declarator: pointer_opt . direct_abstract_declarator
"(" shift, and go to state 187
"identifier" shift, and go to state 44
"(" [reduce using rule 111 (direct_abstract_declarator_opt)]
^^conflict with previous "(" ^^
$default reduce using rule 111 (direct_abstract_declarator_opt)
...
State 197
91 abstract_declarator: pointer_opt . direct_abstract_declarator
"(" shift, and go to state 213
"(" [reduce using rule 111 (direct_abstract_declarator_opt)]
^^conflict^^
$default reduce using rule 111 (direct_abstract_declarator_opt)
So I am understanding this as : When I get "(" I can do two thigs. First, from direct_declarator I get LPAR declarator RPAR where LPAR is "("
... or ...
I can reduce direct_abstract_declarator->array_abstract_declarator->direct_abstract_declarator_opt->direct_abstract_declarator->LPAR abstract_declarator RPAR where LPAR is "(" again. So I can't decide which way to go right ?
How should I tackle this conflict ? Or am I seeing it completely wrong ?
One of the easiest ways to create a shift-reduce conflict is to introduce a new non-terminal representing an optional instance:
Sometimes that will work, but more commonly it will turn out that there is some context in which
somethingmight or might not be optional. In other words, you have two different productions which usesomething:That's not ambiguous, but it cannot be parsed with one token lookahead, because after
somethingis recognised, the parser has to decide whether to reduce it tosomething_opt, which would be necessary if the following text turns out to be ab.something_optis basically pointless semantically; there is asomethingor there is the absence of asomething. And if it were not for that detail, the parser would have no problem; it could continue to parsesomething_elseand then, depending on whether it encountersEND_AorEND_B, it can decide whether to reduce anaor ab.The fix is simple: instead of defining
something_opt, simply duplicate the production so that there is one production withsomethingand the other without.In your case, the offending optional non-terminal is
direct_abstract_declarator_opt:Those are the only places where it is used, so we can just get rid of it by duplicating the productions for the other two non-terminals:
If at all possible, you should use this model to write grammars with optional non-terminals. It does require duplicating semantic actions, but hopefully the semantic action is just a simple call to an AST node builder. And it will save you a lot of unnecessary grief.