Usage: yooparse [options] file

Options:
    -h, -u, -?   print this message
    -c class     specify the C++ class name
    -d           generate token definition header file
    -l           generate a LR(1) table instead of the default LALR(1) table
    -v           generate a detailed LR states information
    -w           do not generate warning messages
All parser classes are child classes of yoogroup::YooParse<>, which in turn is a child class of yoogroup::YooLex<>.
For the input file, line comments (//) and block comments (/* ... */) are allowed in all sections. %% is used as the section separator. The overview of the sections is as follows:
    // section 1
    %%
    %{
    // section 2 prolog
    %}
    // section 2 grammars here
    %{
    // section 2 epilog.
    %}
    %%
    // section 3
Configurations are specified with %option at the beginning of a line in section 1. All options are case sensitive.
Configuration | Explanation |
ccext = "name" | specifies the C++ source file extension. |
ccfile = "name" | specifies the C++ source file name with extension. |
class = "name" | specifies the class name. |
compact | Use a more compact representation of the DFA table. This is done by changing error states to the default reduce even if the lookaheads do not match, so use this option with caution. Yacc/Bison probably use this mode by default. Default reduce can be turned on or off without this option by calling yySetDefaultReduce (true), but if you are using default reduces, this option can save some space. |
hhext = "name" | specifies the default header file extension. |
hhfile = "name" | specifies the header file name with extension. |
kernel | In the DFA state debug file, only prints kernel items for each DFA state item set. By default, closure items are printed as well. |
lalr | tells YooParse to generate an LALR(1) parser. This is the default. |
lr | tells YooParse to generate a full LR(1) parser. |
main | tells YooParse to generate a default main function. |
namespace = "name" | specifies the namespace for the class. |
nola | In the DFA state debug file, do not print the lookaheads for each LR(1) item. By default, the lookaheads for each item are printed. |
token | tells YooParse to generate a token definition file. By default, YooParse generates a file named class + "_tokens" + hhext. |
token = "name" | tells YooParse to generate a token definition file and specifies the output file name with extension. |
token_namespace = "name" | specifies the namespace for the token definitions. If not specified, it will be the same as the class namespace. To force the token definitions into the default namespace, specify this option at a later place. |
verbose | tells YooParse to generate a DFA state debug file. By default, YooParse generates a file named class + ".output". |
verbose = "name" | tells YooParse to generate a DFA state debug file and specifies the output file name with extension. |
yytext = "name" | specifies the _yyText data type. Same as the YooLex option. |
yyvalue = "name" | specifies the _yyValue data type. Note: this data type must have a default constructor. Since this data type is used inside containers, std::auto_ptr<> cannot be used. For automatic memory management, use smart pointers like boost::shared_ptr<>. |
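The constraints on the yyvalue type (default constructor, safe to copy inside containers) can be illustrated with a small stand-alone sketch. It does not use YooParse itself: SemanticValue, Node, makeLeaf and reduceBinary are hypothetical names chosen for this example, and std::shared_ptr (the C++11 standard counterpart of boost::shared_ptr) is assumed to be available.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical syntax tree node; the name and fields are illustrative only.
struct Node {
    std::string label;
    std::vector<std::shared_ptr<Node>> children;
};

// Hypothetical semantic value type of the kind that could be named in yyvalue.
// It is default-constructible and copyable, so it can be stored in containers.
struct SemanticValue {
    int number = 0;
    std::shared_ptr<Node> node;  // shared ownership: copies stay valid
};

// Builds a value holding a leaf node.
inline SemanticValue makeLeaf(const std::string& label) {
    SemanticValue v;
    v.node = std::make_shared<Node>();
    v.node->label = label;
    return v;
}

// Mimics what a reduction does on a value stack: pops two operands and
// pushes a combined value. The stack must hold at least two values.
inline SemanticValue reduceBinary(std::vector<SemanticValue>& stack,
                                  const std::string& op) {
    SemanticValue rhs = stack.back();
    stack.pop_back();
    SemanticValue lhs = stack.back();
    stack.pop_back();

    SemanticValue result;
    result.node = std::make_shared<Node>();
    result.node->label = op;
    result.node->children = {lhs.node, rhs.node};
    stack.push_back(result);
    return result;
}
```

Because the value is copied in and out of the container, a type with transfer-on-copy semantics such as std::auto_ptr<> would silently lose ownership here, while the shared pointer keeps every copy valid.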
%{ | starts a code block, which is terminated with %} |
%left | specifies left associativity on the terminal as well as the precedence level |
%nonassoc | specifies non-associativity on the terminal as well as the precedence level |
%option | YooParse configurations. See above. |
%right | specifies right associativity on the terminal as well as the precedence level |
%start | specifies the start non-terminal instead of the first one encountered. |
%token | specifies a terminal. The value of this terminal is automatically assigned. |
%left, %right and %nonassoc: Terminals specified on the same line have the same precedence level. Terminals specified later have higher precedence.
Section 2 contains 3 parts: prolog, grammar rules and epilog. This section has the same format as in yacc/bison.
%prec <terminal name> is supported.
The prolog and epilog subsections are used to insert code at the beginning and at the end of the yyParse () function, respectively.
    A : B
      | C
      ;
    B : a ;
    C : a ;

There is a reduce/reduce conflict with $ (EOF) as lookahead since both B and C can be reduced. By default, the rule specified earlier is reduced.
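The default rule stated above can be sketched as a one-line selection. This is only an illustration, not YooParse's code: the function name resolveReduceReduce is hypothetical, and rules are assumed to be numbered in the order they appear in the grammar file (here 0 = A : B, 1 = A : C, 2 = B : a, 3 = C : a).

```cpp
#include <algorithm>
#include <vector>

// Given the indices of all rules that could be reduced on the same
// lookahead, returns the one specified earliest in the grammar file.
// Assumption: rule indices follow declaration order, and the list of
// candidates is not empty.
inline int resolveReduceReduce(const std::vector<int>& candidateRules) {
    return *std::min_element(candidateRules.begin(), candidateRules.end());
}
```

For the grammar above, the candidates on $ are rules 2 and 3, so under these assumptions B : a is the one reduced.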
    statement : if statement
              | if statement else statement
              | ;
              ;

so there is a DFA state with the following item set:

    statement : if statement . , $/else
    statement : if statement . else statement , $/else

There is a shift/reduce conflict on the else lookahead terminal.
By default, shift takes precedence over reduce. Associativity and precedence can
be used to change this default rule.
    E : E + E
      | INTEGER
      ;

An expression 5 + 3 + 2 would be evaluated as 5 + (3 + 2) by the default right associativity rule. By specifying left associativity on the '+' terminal, the above expression would be evaluated as (5 + 3) + 2, which is more intuitive.
    E : E + E
      | E * E
      | INTEGER
      ;

5+3*2 and 5*3+2 are two expressions. By specifying * as having higher precedence than +, the first expression is evaluated as 5+(3*2), which favors shift on the * lookahead, and the second expression is evaluated as (5*3)+2, which favors reduce on the + lookahead.
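The way precedence and associativity settle a shift/reduce conflict can be sketched as a small decision function. This follows the conventional yacc/bison scheme and is an assumption about, not a copy of, what YooParse does internally; Assoc, Action, Prec and resolveShiftReduce are hypothetical names. The rule's precedence is assumed to come from one terminal (the operator in the rule, or the one named with %prec).

```cpp
#include <map>
#include <string>

enum class Assoc { Left, Right, NonAssoc };
enum class Action { Shift, Reduce, Error };

// Precedence level (higher binds tighter) and associativity of a terminal.
struct Prec {
    int level;
    Assoc assoc;
};

// Decides a shift/reduce conflict between a completed rule, whose
// precedence is taken from ruleToken, and the lookahead terminal.
inline Action resolveShiftReduce(const std::map<std::string, Prec>& table,
                                 const std::string& ruleToken,
                                 const std::string& lookahead) {
    auto rule = table.find(ruleToken);
    auto look = table.find(lookahead);

    // No precedence information: shift wins by default (dangling else).
    if (rule == table.end() || look == table.end()) return Action::Shift;

    // Different levels: the higher precedence side wins.
    if (look->second.level > rule->second.level) return Action::Shift;
    if (look->second.level < rule->second.level) return Action::Reduce;

    // Same level: associativity decides.
    switch (look->second.assoc) {
        case Assoc::Left:     return Action::Reduce;
        case Assoc::Right:    return Action::Shift;
        case Assoc::NonAssoc: return Action::Error;
    }
    return Action::Shift;
}
```

With + declared before * (both left associative), the function shifts for 5+3*2 when * is the lookahead and reduces for 5*3+2 when + is the lookahead, matching the two cases described above.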
FOLLOW (A). Reduce/reduce and shift/reduce conflicts can be raised if other non-terminals can be reduced or other items do a shift action on the same lookahead token.
LALR stands for Lookahead LR. It improves over SLR by doing a more careful lookahead analysis. In SLR, the lookaheads for each LR item for a non-terminal A are always FOLLOW (A). In LALR, the lookaheads are subsets of FOLLOW (A). Example:
    A : a a
      | B b b
      | b B a
      ;
    B : a ;

In the grammar above:

    FOLLOW (B) := { a, b }

    state 0:
        A : . a a   , $
        A : . B b b , $
        A : . b B a , $
        B : . a     , b
    state 1:
        A : a . a   , $
        B : a .     , b
    state 2:
        A : b . B a , $
        B : . a     , a
    ...

In state 1, for example, B : a . is reduced only on b, whereas SLR would also reduce on a (since a is in FOLLOW (B)) and conflict with the shift of a. The analysis is not easy to do by hand; this is what YooParse is used for. The advantage of LALR is that it eliminates many of the reduce/reduce and shift/reduce conflicts of SLR without additional space cost.
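The FOLLOW set quoted for this grammar can be checked with a short fixed-point computation. The sketch below is a simplified textbook version, not YooParse's implementation: it assumes single-character symbols (uppercase for non-terminals, anything else for terminals) and a grammar without empty productions, and the names Rule, firstSets and followSets are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A rule is (left-hand side, right-hand side symbols), e.g. {'A', "bBa"}.
using Rule = std::pair<char, std::string>;
using SetMap = std::map<char, std::set<char>>;

inline bool isNonTerminal(char c) {
    return std::isupper(static_cast<unsigned char>(c)) != 0;
}

// FIRST sets, assuming every right-hand side is non-empty.
inline SetMap firstSets(const std::vector<Rule>& rules) {
    SetMap first;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule& r : rules) {
            char head = r.second[0];
            std::set<char> add =
                isNonTerminal(head) ? first[head] : std::set<char>{head};
            std::set<char>& target = first[r.first];
            for (char t : add) changed |= target.insert(t).second;
        }
    }
    return first;
}

// FOLLOW sets; '$' marks end of input and is added to the start symbol.
inline SetMap followSets(const std::vector<Rule>& rules, char start) {
    SetMap first = firstSets(rules);
    SetMap follow;
    follow[start].insert('$');
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule& r : rules) {
            const std::string& rhs = r.second;
            for (std::size_t i = 0; i < rhs.size(); ++i) {
                if (!isNonTerminal(rhs[i])) continue;
                std::set<char> add;
                if (i + 1 == rhs.size()) {
                    add = follow[r.first];       // last symbol: inherit FOLLOW(lhs)
                } else if (isNonTerminal(rhs[i + 1])) {
                    add = first[rhs[i + 1]];     // followed by a non-terminal
                } else {
                    add.insert(rhs[i + 1]);      // followed by a terminal
                }
                std::set<char>& target = follow[rhs[i]];
                for (char t : add) changed |= target.insert(t).second;
            }
        }
    }
    return follow;
}
```

Feeding it the four rules ('A', "aa"), ('A', "Bbb"), ('A', "bBa") and ('B', "a") with start symbol 'A' gives FOLLOW (B) = { a, b }, the set SLR would use for every B item, while the LALR states above attach only b or only a to each one.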
The following are defined in yoogroup::YooParse<> unless mentioned otherwise.
YYParserState | Parser stack value data type. |
YYParserStateList | Parser stack type. Equivalent to std::list<YYParserState>. |
YYValueType | _yyValueType data type. |
YYTextType | _yyText data type. Defined in yoogroup::YooLex<>. |
The following macros are defined in the generated C++ source file. These macros can be accessed in section 3.
YYPARSE_DFA(outState,inState,lookahead) | DFA state lookup macro |
YYPARSE_GOTO(outState,inState,reducedSymbol) | GOTO state lookup macro |
YYPARSE_GOTO_BASEADD | Internal use |
YYPARSE_TRANSLATE(terminal) | Translates a terminal to its internal representation |
protected virtual bool | Error recovery function. Override this function if you don't like the default method. |
protected void | Pushes an ERROR token onto the parser state stack. |
public bool | Returns true if default reduce is enabled. |
public bool | Set to true to force default reduce. Returns the old value. |
public virtual int | The parser function. |