[Nickle] More FreeBSD and Mint

Sun Sep 19 23:29:24 PDT 2004

Keith and I are thinking hard about the excellent questions
you raise.  I need to understand the parsing technology a
little better, embarrassingly enough :-).  I'm kind of
thinking about the way that you rewrite parsing to be
ambiguous in the absence of explicit operator precedence and
associativity, and how it relates to the dangling else and
other such well-known problems.  It seems like one ought to
be able to say that one wants the parse in which all the
occurrences of certain subtrees are on the right (or left).
Does this make sense to anyone?

	Bart

In message <1095637081.18610.204696146 at webmail.messagingengine.com> you wrote:
[...]
> On another note, I'd like a little input on how the parser generator
> ought to work since I haven't done altogether that much of this sort of
> thing in the past.
> 
> I'm currently thinking about a way to express precedence.  However
> here's a little background, Mint currently is a parser generator that
> does not generate code nor does it read an external description of a
> language to generate a parser for -- it is more properly a perser
> generation library then.  When Bart and I were discussing its design
> last summer, we decided that would be the simplest and best approach to
> the problem and avoid some of the pitfalls that make YACC so...well...
> yaccy.
> 
> In essence, it takes a grammar (in this case -- this is the ubiquitous
> expression grammar) described like this:
> 
> string ADD = "+";
> string T = "Term";
> string F = "Factor";
> string id = "Identifier";
> string MULT = "*";
> string RPAREN = ")";
> string LPAREN = "(";
> string E = "Expression";
> 
> grammar expr_grammar = {
>     start = E,
>     rules = {
>         E => {
>             {E, ADD, T},
>             {T}
>         },
>         T => {
>             {T, MULT, F},
>             {F}
>         },
>         F => {
>             {LPAREN, E, RPAREN},
>             {id}
>         }
>     }
> };
> 
> And generates a table suitable for parsing using the table driven LR
> parsing algorithm.  I decided to use strings as the symbol type instead
> of integers since it provides more information to the user, especially
> when tracking down grammar ambiguities.  That all works very nicely.
> 
> Algorithmically, precedence and assocativity are simple -- but I think
> expressing them is complicated.  I originially envisioned a seperate
> precedence table, sort of analogous to YACC's %left %right and
> %nonassoc:
> 
> pred_table pred;
> pred_level(&pred, left(MULT, DIV));  /* MULT and DIV have higher
> priority */
> pred_level(&pred, left(ADD));
> ...
> 
> Where left, right, and nonassoc would be varadic functions.
> 
> However, the idea is not that tokens have priority at all, it's that
> rules have priority.  Bison just uses the priority of the last terminal
> symbol in the rule to determine the priority of the rule in most cases
> and it even has special syntax to confer seperate priorities on
> individual rules (the canonical example being unary minus)
> 
> All the ways I can think of to add priority to individual rules within
> the syntatic limits of Mint being a nickle library are terrible.  There
> might be something like --
> 
> rule_pred_table pred;
> rule_pred_level(&pred, left(E, (string[*]) {E, ADD, E}));
> ...
> 
> Which would be a table contaning rules and priority -- forcing the user
> to write several rules more than once -- which is asking for trouble.
> 
> The grammar type could also be augmented to support this information in
> the individual rules, which might clutter the rules as a rule would need
> to be a struct of some kind.
> 
> The symbol type could also be implemented as a disjoint union in which
> some symbols are interpreted as "options" -- I believe this is something
> like YACC does rule based precedence.  This is also somewhat messy.
> 
> The possibility also exists of reading in external representations of
> languages.  I'm starting to lean toward this option, if Mint was a
> program that merely read a description of a language and wrote out a
> description of a parse table, it could theoretically be used for parsing
> using any host language (supposing the parse table output file was
> simple enough to read.)  However, some flexibility is lost; it becomes
> much more difficult to do something like lexer feedback (unless the
> parse table inclued some generalized information about modifying lexer
> state)
> 
> I'd really like some input on this, please let me know about any random
> ideas?  I was under the impression that at some point some kind of tool
> like this could be used to bootstrap nickle - what sort of solution to
> the problem of expressing precedence and other per-rule options is the
> one that will help achieve this goal?
> -- 
>   James
>   giantrobot at f-m.fm
> 
> 
> _______________________________________________
> Nickle mailing list
> Nickle at nickle.org
> http://nickle.org/mailman/listinfo/nickle