[Commit] mint-2004b conclusions.tex, 1.2, 1.3 intro.tex, 1.2, 1.3 lola.tex, 1.1.1.1, 1.2 mint.tex, 1.3, 1.4 tool.tex, 1.5, 1.6 translators.tex, NONE, 1.1

Wed Dec 1 11:17:41 PST 2004

Committed by: bart

Update of /local/src/CVS/mint-2004b
In directory home.keithp.com:/tmp/cvs-serv24000

Modified Files:
	conclusions.tex intro.tex lola.tex mint.tex tool.tex 
Added Files:
	translators.tex 
Log Message:
Finished cleaning up intro.  Labeled sections.  Split
tool.tex from translators.tex .  Other misc fixes.



Index: conclusions.tex
===================================================================
RCS file: /local/src/CVS/mint-2004b/conclusions.tex,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -d -r1.2 -r1.3

--- conclusions.tex	1 Dec 2004 15:39:31 -0000	1.2
+++ conclusions.tex	1 Dec 2004 19:17:37 -0000	1.3
@@ -1,4 +1,4 @@
-\section{Conclusions and Future Work}
+\section{Conclusions and Future Work}\label{sec-conclusions}
 
 As experimental as it is, the current implementation of MINT versatile.
 The ease with which it has been able to bootstrap itself suggests that

Index: intro.tex
===================================================================
RCS file: /local/src/CVS/mint-2004b/intro.tex,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -d -r1.2 -r1.3
--- intro.tex	1 Dec 2004 08:59:17 -0000	1.2
+++ intro.tex	1 Dec 2004 19:17:37 -0000	1.3
@@ -1,7 +1,7 @@
-\section{Introduction: Another Parser Generator}
+\section{Introduction: Yet Another Parser Generator}
 
 {\em Parser generators are as old as the hills: there's nothing
-left to learn.}  At least, this is a common belief in the
+left to learn.}  At the least, this is a common belief in the
 programming languages community.  Any popular modern
 programming language has at least one, and sometimes more,
 lexing and parsing tools available for compiler
@@ -14,10 +14,13 @@
 
 In this environment, constructing yet another parser
 generator requires powerful motivation and careful
-consideration.  The development of the MINT parser
-generation tool is motivated by difficulties
+consideration.  Nonetheless, the authors have constructed
+such a tool: MINT.  The development of the MINT
+tool is motivated by difficulties
 presented by the implementation of the Nickle programming
-language~\cite{nickle}.  It is also motivated by a variety of perceived
+language~\cite{nickle}.  (The name comes originally from the fact
+that MINT is used to ``make Nickle''.)
+MINT is also motivated by a variety of perceived
 opportunities for improvements (of varying degrees of
 importance) in the state-of-the-art.  In particular, MINT is
 intended to be a friendly platform for future languages to
@@ -46,11 +49,40 @@
 layers C-like surface syntax and semantics atop the core of
 a modern mixed functional/imperative language.  The language
 has been carefully defined over a 20 year period to make
-algorithmic prototyping easier.  As such, Nickle should
-represent a good opportunity to achieve the state of the art.
+algorithm development easier.  As such, Nickle should
+represent a good vehicle for achieving the state of the art
+in parser generation.
 
-To achieve this goal, the MINT
-architecture has been factored in a fashion developed some
-time ago by the authors. This split is used to allow the
-generated translator to operate in a wide range of
-languages.  
+To achieve this goal, the MINT architecture has been
+factored in a fashion developed some time ago by the
+authors. This split is used to allow the generated
+translator to operate in a wide range of languages.  The
+MINT runtime parser and lexer, which are small and simple,
+can thus be implemented in a variety of programming
+languages.  For Nickle, this means that the C-based compiler
+can be used for the time being: the move to Nickle-based
+compilation can be made when appropriate.  For other
+projects, this means that the existing MINT tool can be
+reused, rather than re-implementing this wheel in each new
+language that comes along.  MINT thus also provides a
+platform for experimentation with lexer and parser
+generation techniques.
+
+MINT has already proven to be an effective tool for design
+and construction of language recognizers.  It has been used
+to bootstrap the regular expression grammar and BNF
+description grammar used by the tool itself.  MINT will be
+used in the near-term as part of the bootstrapping process
+for the Nickle language.  The potential utility of MINT is
+large, and more than justifies building yet another parser
+generator.
+
+The remainder of this paper proceeds as follows.
+Section~\ref{sec-tool} discusses the design and
+implementation of the MINT tool.  Section~\ref{sec-lola}
+describes Lola and Flo: an earlier effort by the authors
+that motivated and informed the MINT work.
+Section~\ref{sec-nickle} reviews the Nickle programming
+language.  Section~\ref{sec-exp} describes some experiments
+with MINT.  Finally, section~\ref{sec-conclusions} draws some
+conclusions and proposes some future work.

Index: lola.tex
===================================================================
RCS file: /local/src/CVS/mint-2004b/lola.tex,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -u -d -r1.1.1.1 -r1.2
--- lola.tex	1 Dec 2004 06:47:02 -0000	1.1.1.1
+++ lola.tex	1 Dec 2004 19:17:37 -0000	1.2
@@ -1,4 +1,4 @@
-\section{Lola and Flo}
+\section{Lola and Flo}\label{sec-lola}
 
 Lola was an LL parser generator written in 1987 in the Kalypso dialect of
 lisp (a research language focusing on functional lisp programming).  As with
@@ -6,6 +6,14 @@
 structure which could drive parser frameworks in multiple languages.
 Parsers for the Lola output were written in C, Pascal and Kalypso.
 
+The Flo tool was a grammar-to-grammar transformation tool to factor left
+common sub-expressions out of a grammar to permit more compact and easier to
+understand representations of LL(1) languages.  It was also written in
+Kalypso.  Grammars are represented in lisp s-expressions as in
+Figure~\ref{fig-lola-grammar}, eliminating the need for a custom grammar
+parser and permitting easier grammar-to-grammar transformation tool
+development.  
+
 \begin{figure}
 \footnotesize
 \begin{verbatim}
@@ -43,14 +51,6 @@
 \caption{A sample Lola grammar}\label{fig-lola-grammar}
 \end{figure}
 
-The Flo tool was a grammar-to-grammar transformation tool to factor left
-common sub-expressions out of a grammar to permit more compact and easier to
-understand representations of LL(1) languages.  It was also written in
-Kalypso.  Grammars are represented in lisp s-expressions as in
-Figure~\ref{fig-lola-grammar}, eliminating the need for a custom grammar
-parser and permitting easier grammar-to-grammar transformation tool
-development.  
-
 Semantic attributes were attached to the grammar in the form of special
 tokens in the right side of each grammar production.  The parser would then
 perform the semantic operation when the action appeared on the top of the

Index: mint.tex
===================================================================
RCS file: /local/src/CVS/mint-2004b/mint.tex,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- mint.tex	1 Dec 2004 08:59:17 -0000	1.3
+++ mint.tex	1 Dec 2004 19:17:37 -0000	1.4
@@ -37,6 +37,7 @@
 \input{tool}
 \input{lola}
 \input{nickle}
+\input{translators}
 \input{conclusions}
 
 \bibliographystyle{entcs}

Index: tool.tex
===================================================================
RCS file: /local/src/CVS/mint-2004b/tool.tex,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- tool.tex	1 Dec 2004 18:55:03 -0000	1.5
+++ tool.tex	1 Dec 2004 19:17:37 -0000	1.6
@@ -8,7 +8,7 @@
 %    + Code base + Bootstrapping
 %  + Achieving goals
 
-\section{MINT}
+\section{MINT}\label{sec-tool}
 
 \subsection{Design}
 
@@ -184,78 +184,3 @@
 alternative to the more traditional code-generating parser
 generator architecture.
 
-\section{Translators in MINT}
-
-A simple examples should be adequate to demonstrate the essentials
-of the MINT parser generator language.  More example grammars, including
-the grammar for the parser generator itself appear in the appendix.
-
-Each grammar file is divided into three sections -- tokens,
-precedence, and rules.  The tokens section specifies an identifier for
-each token, along with a regular expression specifying the pattern
-of input associated with that token.  In short, the token section
-compactly represents a lexer. 
-
-The precedence section allows ambiguous grammars to be specified and
-disambiguated by the user by specifying the precedence and assocativity
-of tokens in the grammar, similar to the YACC ``\%left'' ``\%right''
-and ``\%nonassoc'' directives.
-
-The rules section represents the productions of the context free grammar.
-The syntax of these rules may seem a little confusing at first.   A rule
-consists of a nonterminal which is tagged with some kind of descriptor, and
-zero or more symbols which represent a possible sentence the nonterminal
-can derive.  Each symbol may be ``tagged'' meaning such that its value 
-will appear in the syntax tree resulting from a successful parse tagged by a identifier.
-Untagged values are not included in the syntax tree.  This
-labelling allows the user to write syntax tree walkers which are
-readable and since the tags are not positional, allows more flexibility
-in modifying the grammar.  The omission of the untagged values reduces
-redundant syntactic clutter.
-
-A few examples may clarify things:
-
-\begin{verbatim}
-
-#
-# The canonical expression grammar (with precedence)
-#
-
-#
-# Token section
-#
-tokens:
-
-token plus       /\+/
-token mult       /\*/
-token lparen     /\(/
-token rparen     /\)/
-token id         /[A-Za-z0-9]/
-skip whitespace  /[ ]/                # Skip whitespace
-
-#
-# Operator precedence
-#
-precedence:
-
-left plus
-left mult
-
-#
-# Rules
-#
-rules:
-
-rule (E : plusexpr) -> (E : op1) plus (E : op2)
-rule (E : multexpr) -> (E : op1) mult (E : op2)
-rule (E : id) -> (id : name)
-rule (E : parenexpr) -> lparen (E : sub) rparen
-
-\end{verbatim}
-
-In this example, there is one nonterminal (E) with four productions.
-Addition and multiplication are binary operators in which the left
-operand is tagged ``op1'' and the right is tagged ``op2''.  Both the
-additon symbol and the multiplication symbol are untagged -- so they
-will not be included in the resulting syntax tree.
-

--- NEW FILE: translators.tex ---
\section{Translators in MINT}\label{sec-exp}

A simple examples should be adequate to demonstrate the essentials
of the MINT parser generator language.  More example grammars, including
the grammar for the parser generator itself appear in the appendix.

Each grammar file is divided into three sections -- tokens,
precedence, and rules.  The tokens section specifies an identifier for
each token, along with a regular expression specifying the pattern
of input associated with that token.  In short, the token section
compactly represents a lexer. 

The precedence section allows ambiguous grammars to be specified and
disambiguated by the user by specifying the precedence and assocativity
of tokens in the grammar, similar to the YACC ``\%left'' ``\%right''
and ``\%nonassoc'' directives.

The rules section represents the productions of the context free grammar.
The syntax of these rules may seem a little confusing at first.   A rule
consists of a nonterminal which is tagged with some kind of descriptor, and
zero or more symbols which represent a possible sentence the nonterminal
can derive.  Each symbol may be ``tagged'' meaning such that its value 
will appear in the syntax tree resulting from a successful parse tagged by a identifier.
Untagged values are not included in the syntax tree.  This
labelling allows the user to write syntax tree walkers which are
readable and since the tags are not positional, allows more flexibility
in modifying the grammar.  The omission of the untagged values reduces
redundant syntactic clutter.

A few examples may clarify things:

\begin{verbatim}

#
# The canonical expression grammar (with precedence)
#

#
# Token section
#
tokens:

token plus       /\+/
token mult       /\*/
token lparen     /\(/
token rparen     /\)/
token id         /[A-Za-z0-9]/
skip whitespace  /[ ]/                # Skip whitespace

#
# Operator precedence
#
precedence:

left plus
left mult

#
# Rules
#
rules:

rule (E : plusexpr) -> (E : op1) plus (E : op2)
rule (E : multexpr) -> (E : op1) mult (E : op2)
rule (E : id) -> (id : name)
rule (E : parenexpr) -> lparen (E : sub) rparen

\end{verbatim}

In this example, there is one nonterminal (E) with four productions.
Addition and multiplication are binary operators in which the left
operand is tagged ``op1'' and the right is tagged ``op2''.  Both the
additon symbol and the multiplication symbol are untagged -- so they
will not be included in the resulting syntax tree.