Bison- 转移/减少冲突

发布于 2024-11-28 09:28:56 字数 10379 浏览 3 评论 0原文

我知道在 Bison 代码中,预计会出现一些移位/归约冲突,而正常的 C 语法会为 if/else 生成一个冲突。但是,我有一个语法可以产生 330 个其他移位/归约冲突。这是我的语法有严重问题的迹象吗?我应该花时间寻找每个冲突并尝试解决或验证其正确性吗?我正在使用常规的 LALR 解析器。

编辑:

这是我的语法,我删除了所有动作和其他内容。

%token STRING_LITERAL
%token INTEGER
%token FLOAT
%token CHARACTER
%token PTR_OP INC_OP DEC_OP LEFT_OP RIGHT_OP LE_OP GE_OP EQ_OP NE_OP
%token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN
%token SUB_ASSIGN LEFT_ASSIGN RIGHT_ASSIGN AND_ASSIGN
%token XOR_ASSIGN OR_ASSIGN STATIC CATCH DOUBLE_COLON ELLIPSIS FUNCTION VAR
%token SIZEOF
%token GOTO
%token AUTO  
%token THIS VAR_ASSIGN
%token NAMESPACE 
%token TRY 
%token TYPE
%token DECLTYPE
%token PUBLIC 
%token PRIVATE 
%token PROTECTED 
%token USING
%token THROW
%token FRIEND
%token COMPILETIME
%token RUNTIME

%skeleton "lalr1.cc"
%defines
%parse-param { Wide::LexedFile& l }
%parse-param { Wide::ParsedFile&  p }
%lex-param { Wide::LexedFile& l }
%lex-param { Wide::ParsedFile& p }
%define namespace Wide

%token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR CONTINUE BREAK RETURN
%code requires {
#include <string>
#include <vector>
#define YYDEBUG 1
#include "ParsedFile.h"
}
%code provides {
#include "yylex.h"
}
%start program
%locations

%%

program
    : namespace_scope_definitions;

var 
    : ;//VAR;

type_expression
    : expression
    { $$ = $1; }

variable_assignment
    : VAR_ASSIGN;

name_or_qualified_name
    : IDENTIFIER 
    | name_or_qualified_name '.' IDENTIFIER 

namespace_definition
    : NAMESPACE name_or_qualified_name '{' namespace_scope_definitions '}';

accessibility_definition
    : PUBLIC ':'
    | PRIVATE ':' 
    | PROTECTED ':'
    | FRIEND ':';

using_definition
    : USING IDENTIFIER '=' name_or_qualified_name ';' 
    | USING name_or_qualified_name ';';

type_definition
    : TYPE IDENTIFIER type_literal;

namespace_scope_definition
    : namespace_definition 
    | function_definition 
    | variable_definition 
    | accessibility_definition 
    | using_definition 
    | type_definition ;

namespace_scope_definitions
    : namespace_scope_definition
    | namespace_scope_definitions namespace_scope_definition ;

type_scope_definition
    : static_function_definition
    | static_variable_definition
    | destructor_definition
    | accessibility_definition
    | using_definition
    | constructor_definition 
    | type_definition;

type_scope_definitions
    : type_scope_definition
    | type_scope_definitions type_scope_definition ;

destructor_definition
    : '~' THIS '(' ')' compound_statement;

constructor_definition
    : THIS runtime_arguments statements_or_inits;

statement_or_init
    : ':' IDENTIFIER function_call_expression
    | compound_statement;

statements_or_inits
    : statement_or_init
    | statements_or_inits statement_or_init;

compiletime_arguments
    : runtime_arguments
    | ;

runtime_arguments
    : '(' ')'
    | '(' function_argument_list ')';

return_type_expression
    : PTR_OP type_expression 
    | ;

function_definition
    : name_or_qualified_name runtime_arguments compiletime_arguments return_type_expression compound_statement;

function_argument_definition
    : IDENTIFIER 
    | type_expression IDENTIFIER
    | IDENTIFIER variable_assignment expression
    | type_expression IDENTIFIER variable_assignment expression
    | IDENTIFIER variable_assignment '{' expressions '}'
    | type_expression IDENTIFIER variable_assignment '{' expressions '}';

function_argument_list
    : function_argument_definition 
    | function_argument_list ',' function_argument_definition;

static_function_definition
    : STATIC function_definition
    | function_definition;

static_variable_definition
    : STATIC variable_definition 
    | FRIEND variable_definition
    | STATIC FRIEND variable_definition
    | variable_definition;

variable_definition
    : var IDENTIFIER variable_assignment expression ';'
    | var type_expression IDENTIFIER variable_assignment expression ';'
    | var type_expression IDENTIFIER ';'
    | var type_expression IDENTIFIER function_call_expression ';';

base_class_list
    : ':' type_expression
    | base_class_list ',' type_expression;

type_literal
    : base_class_list '{' type_scope_definitions '}'
    | '{' type_scope_definitions '}';
    | '{' '}';

literal_expression
    : INTEGER
    | FLOAT
    | CHARACTER 
    | STRING_LITERAL
    | AUTO
    | THIS
    | TYPE type_literal;

primary_expression
    : literal_expression
    | IDENTIFIER
    | '(' expression ')';

expression
    : variadic_expression;

variadic_expression
    : assignment_expression 
    | assignment_expression ELLIPSIS ;

assignment_operator
    : '=' 
    | MUL_ASSIGN
    | DIV_ASSIGN 
    | MOD_ASSIGN
    | ADD_ASSIGN
    | SUB_ASSIGN
    | LEFT_ASSIGN
    | RIGHT_ASSIGN
    | AND_ASSIGN
    | XOR_ASSIGN
    | OR_ASSIGN;

assignment_expression
    : logical_or_expression
    | unary_expression assignment_operator assignment_expression;

logical_or_expression
    : logical_and_expression
    | logical_or_expression OR_OP logical_and_expression;

logical_and_expression
    : inclusive_or_expression
    | logical_and_expression AND_OP inclusive_or_expression;

inclusive_or_expression
    : exclusive_or_expression
    | inclusive_or_expression '|' exclusive_or_expression;

exclusive_or_expression
    : and_expression
    | exclusive_or_expression '^' and_expression;

and_expression
    : equality_expression
    | and_expression '&' equality_expression;

equality_expression
    : relational_expression
    | equality_expression EQ_OP relational_expression
    | equality_expression NE_OP relational_expression;

comparison_operator
    : '<'
    | '>'
    | LE_OP 
    | GE_OP;

relational_expression
    : shift_expression
    | relational_expression comparison_operator shift_expression;

shift_operator
    : LEFT_OP
    | RIGHT_OP;

shift_expression
    : additive_expression
    | shift_expression shift_operator additive_expression;

additive_operator
    : '+'
    | '-';

additive_expression
    : multiplicative_expression
    | additive_expression additive_operator multiplicative_expression;

multiplicative_operator
    : '*'
    | '/'
    | '%';

multiplicative_expression
    : unary_expression
    | multiplicative_expression multiplicative_operator unary_expression;

lambda_expression 
    : '[' capture_list ']' function_argument_list compound_statement
    | '[' capture_list ']' compound_statement;
    | '[' ']' function_argument_list compound_statement
    | '[' ']' compound_statement;


default_capture
    : '&' | '=' ;

capture_list
    : default_capture comma_capture_list
    | comma_capture_list;

comma_capture_list
    : variable_capture 
    | comma_capture_list ',' variable_capture;

variable_capture
    : '&' IDENTIFIER
    | '=' IDENTIFIER
    | AND_OP IDENTIFIER;

unary_operator
    : '&'
    | '*'
    | '+'
    | '-'
    | '~'
    | '!'
    | INC_OP
    | DEC_OP ;

unary_expression
    : unary_operator unary_expression
    | SIZEOF '(' expression ')'
    | DECLTYPE '(' expression ')'
    | lambda_expression
    | postfix_expression

postfix_expression
    : primary_expression
    | postfix_expression '[' expression ']'
    | postfix_expression function_call_expression 
    | postfix_expression '.' IDENTIFIER
    | postfix_expression PTR_OP IDENTIFIER
    | postfix_expression INC_OP 
    | postfix_expression DEC_OP 
    | postfix_expression FRIEND; 

expressions
    : expression
    | expressions ',' expression;

function_argument
    : expression
    | IDENTIFIER variable_assignment '{' expressions '}'

    | IDENTIFIER variable_assignment expression;

function_arguments
    : function_argument
    | function_arguments ',' function_argument;

function_call_expression
    : '(' function_arguments ')' 
    | '(' ')';

initializer_statement
    : expression
    | var IDENTIFIER variable_assignment expression
    | var type_expression IDENTIFIER variable_assignment expression;

return_statement
    : RETURN expression ';' 
    | RETURN ';';

try_statement
    : TRY compound_statement catch_statements;

catch_statement
    : CATCH '(' type_expression IDENTIFIER ')' compound_statement;

catch_statements
    : catch_statement 
    | catch_statements catch_statement
    | CATCH '(' ELLIPSIS ')' compound_statement
    | catch_statements CATCH '(' ELLIPSIS ')' compound_statement;

for_statement_initializer
    : initializer_statement ';'
    | ';';

for_statement_condition
    : expression ';'
    | ';';

for_statement_repeat
    : expression
    |;

for_statement
    : FOR '(' for_statement_initializer for_statement_condition for_statement_repeat ')' statement;

while_statement
    : WHILE '(' initializer_statement ')' statement;

do_while_statement
    : DO statement WHILE '(' expression ')';

switch_statement
    : SWITCH '(' initializer_statement ')' '{' case_statements '}';

default_statement
    : DEFAULT ':' statements;

case_statement
    : CASE expression DOUBLE_COLON statements;

case_statements
    : case_statement
    | case_statements case_statement
    | case_statements default_statement;

if_statement
    : IF '(' initializer_statement ')' statement
    | IF '(' initializer_statement ')' statement ELSE statement;

continue_statement
    : CONTINUE ';';

break_statement
    : BREAK ';';

label_statement
    : IDENTIFIER ':';

goto_statement
    : GOTO IDENTIFIER ';';

throw_statement
    : THROW ';'
    | THROW expression ';';

runtime_statement
    : RUNTIME compound_statement;

compiletime_statement
    : COMPILETIME compound_statement;

statement
    : compound_statement 
    | return_statement 
    | try_statement 
    | expression ';' 
    | static_variable_definition
    | for_statement
    | while_statement 
    | do_while_statement
    | switch_statement
    | if_statement
    | continue_statement
    | break_statement
    | goto_statement
    | label_statement
    | using_definition
    | throw_statement
    | compiletime_statement
    | runtime_statement;

statements
    : statement
    | statements statement;

compound_statement
    : '{' '}'
    | '{' statements '}';

我取得了一些进展。你是对的——同样的冲突一遍又一遍。事实证明,我试图通过注释掉规则的内容(特别是 var 规则)来从语法中删除关键字,因此每当 Bison 遇到曾经使用它的东西时,它不知道是否要减少空规则。现在我只有三个转变/减少...和两个减少/减少...冲突。

所以我想我对最初问题的回答是,这是一个非常糟糕的迹象,有 330 移位/归约冲突。

I know that in Bison code, there are some shift/reduce conflicts to be expected, and the normal C grammar produces one for if/else. However, I've got a grammar that produces 330 other shift/reduce conflicts. Is this a sign of a serious problem with my grammar? Should I spend the time hunting down each conflict and attempt to either resolve or verify it's correctness? I'm using a regular LALR parser.

Edit:

Here's my grammar, I stripped all the actions and other stuff.

%token STRING_LITERAL
%token INTEGER
%token FLOAT
%token CHARACTER
%token PTR_OP INC_OP DEC_OP LEFT_OP RIGHT_OP LE_OP GE_OP EQ_OP NE_OP
%token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN
%token SUB_ASSIGN LEFT_ASSIGN RIGHT_ASSIGN AND_ASSIGN
%token XOR_ASSIGN OR_ASSIGN STATIC CATCH DOUBLE_COLON ELLIPSIS FUNCTION VAR
%token SIZEOF
%token GOTO
%token AUTO  
%token THIS VAR_ASSIGN
%token NAMESPACE 
%token TRY 
%token TYPE
%token DECLTYPE
%token PUBLIC 
%token PRIVATE 
%token PROTECTED 
%token USING
%token THROW
%token FRIEND
%token COMPILETIME
%token RUNTIME

%skeleton "lalr1.cc"
%defines
%parse-param { Wide::LexedFile& l }
%parse-param { Wide::ParsedFile&  p }
%lex-param { Wide::LexedFile& l }
%lex-param { Wide::ParsedFile& p }
%define namespace Wide

%token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR CONTINUE BREAK RETURN
%code requires {
#include <string>
#include <vector>
#define YYDEBUG 1
#include "ParsedFile.h"
}
%code provides {
#include "yylex.h"
}
%start program
%locations

%%

program
    : namespace_scope_definitions;

var 
    : ;//VAR;

type_expression
    : expression
    { $ = $1; }

variable_assignment
    : VAR_ASSIGN;

name_or_qualified_name
    : IDENTIFIER 
    | name_or_qualified_name '.' IDENTIFIER 

namespace_definition
    : NAMESPACE name_or_qualified_name '{' namespace_scope_definitions '}';

accessibility_definition
    : PUBLIC ':'
    | PRIVATE ':' 
    | PROTECTED ':'
    | FRIEND ':';

using_definition
    : USING IDENTIFIER '=' name_or_qualified_name ';' 
    | USING name_or_qualified_name ';';

type_definition
    : TYPE IDENTIFIER type_literal;

namespace_scope_definition
    : namespace_definition 
    | function_definition 
    | variable_definition 
    | accessibility_definition 
    | using_definition 
    | type_definition ;

namespace_scope_definitions
    : namespace_scope_definition
    | namespace_scope_definitions namespace_scope_definition ;

type_scope_definition
    : static_function_definition
    | static_variable_definition
    | destructor_definition
    | accessibility_definition
    | using_definition
    | constructor_definition 
    | type_definition;

type_scope_definitions
    : type_scope_definition
    | type_scope_definitions type_scope_definition ;

destructor_definition
    : '~' THIS '(' ')' compound_statement;

constructor_definition
    : THIS runtime_arguments statements_or_inits;

statement_or_init
    : ':' IDENTIFIER function_call_expression
    | compound_statement;

statements_or_inits
    : statement_or_init
    | statements_or_inits statement_or_init;

compiletime_arguments
    : runtime_arguments
    | ;

runtime_arguments
    : '(' ')'
    | '(' function_argument_list ')';

return_type_expression
    : PTR_OP type_expression 
    | ;

function_definition
    : name_or_qualified_name runtime_arguments compiletime_arguments return_type_expression compound_statement;

function_argument_definition
    : IDENTIFIER 
    | type_expression IDENTIFIER
    | IDENTIFIER variable_assignment expression
    | type_expression IDENTIFIER variable_assignment expression
    | IDENTIFIER variable_assignment '{' expressions '}'
    | type_expression IDENTIFIER variable_assignment '{' expressions '}';

function_argument_list
    : function_argument_definition 
    | function_argument_list ',' function_argument_definition;

static_function_definition
    : STATIC function_definition
    | function_definition;

static_variable_definition
    : STATIC variable_definition 
    | FRIEND variable_definition
    | STATIC FRIEND variable_definition
    | variable_definition;

variable_definition
    : var IDENTIFIER variable_assignment expression ';'
    | var type_expression IDENTIFIER variable_assignment expression ';'
    | var type_expression IDENTIFIER ';'
    | var type_expression IDENTIFIER function_call_expression ';';

base_class_list
    : ':' type_expression
    | base_class_list ',' type_expression;

type_literal
    : base_class_list '{' type_scope_definitions '}'
    | '{' type_scope_definitions '}';
    | '{' '}';

literal_expression
    : INTEGER
    | FLOAT
    | CHARACTER 
    | STRING_LITERAL
    | AUTO
    | THIS
    | TYPE type_literal;

primary_expression
    : literal_expression
    | IDENTIFIER
    | '(' expression ')';

expression
    : variadic_expression;

variadic_expression
    : assignment_expression 
    | assignment_expression ELLIPSIS ;

assignment_operator
    : '=' 
    | MUL_ASSIGN
    | DIV_ASSIGN 
    | MOD_ASSIGN
    | ADD_ASSIGN
    | SUB_ASSIGN
    | LEFT_ASSIGN
    | RIGHT_ASSIGN
    | AND_ASSIGN
    | XOR_ASSIGN
    | OR_ASSIGN;

assignment_expression
    : logical_or_expression
    | unary_expression assignment_operator assignment_expression;

logical_or_expression
    : logical_and_expression
    | logical_or_expression OR_OP logical_and_expression;

logical_and_expression
    : inclusive_or_expression
    | logical_and_expression AND_OP inclusive_or_expression;

inclusive_or_expression
    : exclusive_or_expression
    | inclusive_or_expression '|' exclusive_or_expression;

exclusive_or_expression
    : and_expression
    | exclusive_or_expression '^' and_expression;

and_expression
    : equality_expression
    | and_expression '&' equality_expression;

equality_expression
    : relational_expression
    | equality_expression EQ_OP relational_expression
    | equality_expression NE_OP relational_expression;

comparison_operator
    : '<'
    | '>'
    | LE_OP 
    | GE_OP;

relational_expression
    : shift_expression
    | relational_expression comparison_operator shift_expression;

shift_operator
    : LEFT_OP
    | RIGHT_OP;

shift_expression
    : additive_expression
    | shift_expression shift_operator additive_expression;

additive_operator
    : '+'
    | '-';

additive_expression
    : multiplicative_expression
    | additive_expression additive_operator multiplicative_expression;

multiplicative_operator
    : '*'
    | '/'
    | '%';

multiplicative_expression
    : unary_expression
    | multiplicative_expression multiplicative_operator unary_expression;

lambda_expression 
    : '[' capture_list ']' function_argument_list compound_statement
    | '[' capture_list ']' compound_statement;
    | '[' ']' function_argument_list compound_statement
    | '[' ']' compound_statement;


default_capture
    : '&' | '=' ;

capture_list
    : default_capture comma_capture_list
    | comma_capture_list;

comma_capture_list
    : variable_capture 
    | comma_capture_list ',' variable_capture;

variable_capture
    : '&' IDENTIFIER
    | '=' IDENTIFIER
    | AND_OP IDENTIFIER;

unary_operator
    : '&'
    | '*'
    | '+'
    | '-'
    | '~'
    | '!'
    | INC_OP
    | DEC_OP ;

unary_expression
    : unary_operator unary_expression
    | SIZEOF '(' expression ')'
    | DECLTYPE '(' expression ')'
    | lambda_expression
    | postfix_expression

postfix_expression
    : primary_expression
    | postfix_expression '[' expression ']'
    | postfix_expression function_call_expression 
    | postfix_expression '.' IDENTIFIER
    | postfix_expression PTR_OP IDENTIFIER
    | postfix_expression INC_OP 
    | postfix_expression DEC_OP 
    | postfix_expression FRIEND; 

expressions
    : expression
    | expressions ',' expression;

function_argument
    : expression
    | IDENTIFIER variable_assignment '{' expressions '}'

    | IDENTIFIER variable_assignment expression;

function_arguments
    : function_argument
    | function_arguments ',' function_argument;

function_call_expression
    : '(' function_arguments ')' 
    | '(' ')';

initializer_statement
    : expression
    | var IDENTIFIER variable_assignment expression
    | var type_expression IDENTIFIER variable_assignment expression;

return_statement
    : RETURN expression ';' 
    | RETURN ';';

try_statement
    : TRY compound_statement catch_statements;

catch_statement
    : CATCH '(' type_expression IDENTIFIER ')' compound_statement;

catch_statements
    : catch_statement 
    | catch_statements catch_statement
    | CATCH '(' ELLIPSIS ')' compound_statement
    | catch_statements CATCH '(' ELLIPSIS ')' compound_statement;

for_statement_initializer
    : initializer_statement ';'
    | ';';

for_statement_condition
    : expression ';'
    | ';';

for_statement_repeat
    : expression
    |;

for_statement
    : FOR '(' for_statement_initializer for_statement_condition for_statement_repeat ')' statement;

while_statement
    : WHILE '(' initializer_statement ')' statement;

do_while_statement
    : DO statement WHILE '(' expression ')';

switch_statement
    : SWITCH '(' initializer_statement ')' '{' case_statements '}';

default_statement
    : DEFAULT ':' statements;

case_statement
    : CASE expression DOUBLE_COLON statements;

case_statements
    : case_statement
    | case_statements case_statement
    | case_statements default_statement;

if_statement
    : IF '(' initializer_statement ')' statement
    | IF '(' initializer_statement ')' statement ELSE statement;

continue_statement
    : CONTINUE ';';

break_statement
    : BREAK ';';

label_statement
    : IDENTIFIER ':';

goto_statement
    : GOTO IDENTIFIER ';';

throw_statement
    : THROW ';'
    | THROW expression ';';

runtime_statement
    : RUNTIME compound_statement;

compiletime_statement
    : COMPILETIME compound_statement;

statement
    : compound_statement 
    | return_statement 
    | try_statement 
    | expression ';' 
    | static_variable_definition
    | for_statement
    | while_statement 
    | do_while_statement
    | switch_statement
    | if_statement
    | continue_statement
    | break_statement
    | goto_statement
    | label_statement
    | using_definition
    | throw_statement
    | compiletime_statement
    | runtime_statement;

statements
    : statement
    | statements statement;

compound_statement
    : '{' '}'
    | '{' statements '}';

I made a little headway. You were right- it was the same conflict over and over. Turns out that I was attempting to remove a keyword from the grammar by commenting out the contents of the rule (the var rule, specifically) so whenever Bison encountered something that used to use it, it didn't know whether or not to reduce the empty rule. Now I only have three shift/reduce... and two reduce/reduce... conflicts.

So I guess my answer to the original question was yes, it is a really bad sign to have 330 shift/reduce conflicts.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

心安伴我暖 2024-12-05 09:28:56

如果没有看到你的语法文件,我无法判断这是否是一个问题。如果您遇到如此多的移位/归约冲突,则几乎可以肯定意味着编写的语法不明确。然而,很可能这里发生的情况是,至少如果您正在编写类似 C 的语言,那么您没有指定算术表达式的优先级。您是否为所有操作员分配了优先级?

顺便说一句,您是否让 bison 创建一个 y.output 文件,其中包含解析器状态和转换的人类可读版本?如果没有,我强烈建议这样做,因为如果没有额外的信息,几乎不可能调试转移/减少冲突。

修复这些类型的错误可能是值得的,因为有这么多冲突,我假设有一个问题被重复多次。或者,您可以切换到更通用的解析器,例如 GLR 解析器(bison 也支持),但由于存在如此多的冲突,我假设问题是模糊性的,更强大的解析器无法解决这个问题。

Wthout seeing your grammar file I can't say whether or not this is a problem. If you're getting this many shift/reduce conflicts, it almost certainly means that the grammar as written is ambiguous. However, it's very likely that what's going on here, at least if you're writing a C-like language, is that you haven't specified precedences for arithmetic expressions. Did you assign priorities to all your operators?

BTW, are you having bison create a y.output file containing a human-readable version of the parser states and transitions? If not, I'd strongly suggest doing so, as it's virtually impossible to debug shift/reduce conflicts without that extra info.

It is probably worth fixing these sorts of errors, since with this many conflicts I'm assuming that there is a single problem being replicated multiple times. Alternatively, you could switch to a more general parser like a GLR parser (also supported by bison), but with this many conflicts I'm assuming the issue is ambiguity, which the more powerful parsers can't resolve.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文