[Commit] nickle/doc/tutorial/builtins io.sgml,NONE,1.1 math.sgml,NONE,1.1 strings.sgml,NONE,1.1

Fri, 23 May 2003 16:49:02 -0700

Committed by: bart

Update of /local/src/CVS/nickle/doc/tutorial/builtins
In directory home.keithp.com:/tmp/cvs-serv15809/doc/tutorial/builtins

Added Files:
	io.sgml math.sgml strings.sgml 
Log Message:
docbook version of Nickle tutorial

--- NEW FILE: io.sgml ---
<sect1><title>Input and Output</title>
<para>
Input and output in Nickle are mostly accomplished through the File builtin namespace; some top-level builtins refer to those functions.
Nickle's input and output are modeled, as much of the language is, on C, but many changes have been made.
</para>

<sect2><title>Opening and closing files</title>
<para>
The functions in the File namespace use the <literal>file</literal> primitive type to describe filehandles.
Three are predefined, with their usual meanings: <literal>stdin</literal>, <literal>stdout</literal>, and <literal>stderr</literal>.
For many functions in File, there is a top-level builtin which assumes one of these standard streams.
Other files may be read and written by opening them:
</para>
<itemizedlist>
<listitem><para>file open(string path, string mode)</para></listitem>
</itemizedlist>
<para>
The first string gives the path to the file to be opened; the second is one of:
</para>
<itemizedlist>
<listitem><para>"r" to open read-only, starting at the beginning of the file.</para></listitem>
<listitem><para>"r+" to open read-write, starting at the beginning of the file.</para></listitem>
<listitem><para>"w" to create or truncate the file and open write-only.</para></listitem>
<listitem><para>"w+" to create or truncate the file and open read-write.</para></listitem>
<listitem><para>"a" to open write-only, appending to the end of the file.</para></listitem>
<listitem><para>"a+" to open read-write, appending to the end of the file.</para></listitem>
</itemizedlist>
<para>
If successful, <literal>open</literal> returns a filehandle, which can then be passed to the other functions in File.
</para>
<para>
Nickle can also open pipes to other programs, reading from their standard output or writing to their standard input; these are also treated as <literal>file</literal>s, and the difference is transparent to the functions that manipulate them.
Pipes are opened with <literal>pipe</literal> rather than <literal>open</literal>; otherwise they are treated identically to flat files.
</para>
<itemizedlist>
<listitem><para>file pipe(string path, string[*] argv, string mode)</para></listitem>
</itemizedlist>
<para>
The first string refers to the program to be run; <literal>argv</literal> is an array of the arguments to pass to it.
By convention, <literal>argv[0]</literal> should be the name of the program.
Finally, <literal>mode</literal> is one of those for <literal>open</literal>; reading from the pipe reads from the program's stdout, and writing to the pipe writes to the program's stdin.
For example,
<informalexample><screen>
$ nickle
> string[*] args = {"ls", "-a"};
> file ls = File::pipe ( "ls", args, "r" );
> do printf ( "%s\n", File::fgets ( ls ) );
+ while ( ! File::end ( ls ) );
.
..
bin
man
nickle
share
</screen></informalexample>
</para>
<para>
When a file is no longer needed, it should be closed.
</para>
<itemizedlist>
<listitem><para>void close(file f)</para></listitem>
</itemizedlist>
<informalexample><screen>
> File::close ( ls );
</screen></informalexample>
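<para>
As a sketch of the whole cycle for an ordinary file, the session below opens a file for writing, writes a line, closes it, and then reopens it for reading.
The path <literal>/tmp/nickle-demo.txt</literal> is only illustrative, <literal>File::fprintf</literal> is assumed here to be the File counterpart of <literal>printf</literal> that takes a filehandle as its first argument, and the session is a reconstruction rather than a captured transcript:
<informalexample><screen>
> file f = File::open ( "/tmp/nickle-demo.txt", "w" );	/* create or truncate */
> File::fprintf ( f, "hello, world\n" );		/* assumed: printf to a filehandle */
> File::close ( f );
> f = File::open ( "/tmp/nickle-demo.txt", "r" );	/* read from the beginning */
> File::fgets ( f )
"hello, world"
> File::close ( f );
</screen></informalexample>
</para>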
</sect2>

<sect2><title>Flush</title>
<para>
Output sent to a file is not written immediately, but is buffered until an appropriate time.
Ordinarily, this is not noticed; if, however, it is important to know that all buffers have been written to a file, they can be flushed:
</para>
<itemizedlist>
<listitem><para>void flush (file f)</para></listitem>
</itemizedlist>
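<para>
A typical case is a prompt that must appear before the program blocks waiting for input.
The fragment below is a minimal sketch of that pattern, written as it might appear in a script; the variable <literal>name</literal> is hypothetical:
<informalexample><screen>
printf ( "name? " );		/* may still be sitting in the buffer */
File::flush ( stdout );		/* force it out before waiting on stdin */
string name = gets ();
printf ( "hello, %s\n", name );
</screen></informalexample>
</para>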
</sect2>

<sect2><title>End</title>
<para>
<literal>end</literal> returns true if the file is at end-of-file, and false otherwise.
</para>
<itemizedlist>
<listitem><para>bool end (file f)</para></listitem>
</itemizedlist>
</sect2>

<sect2><title>Characters and strings</title>
<para>
Individual characters can be read and written using <literal>getc</literal>, <literal>getchar</literal>, <literal>putc</literal>, and <literal>putchar</literal>.
</para>
<itemizedlist>
<listitem><para>int getc(file f)</para></listitem>
<listitem><para>int getchar()</para></listitem>
<listitem><para>int putc(int c,file f)</para></listitem>
<listitem><para>void putchar(int c)</para></listitem>
</itemizedlist>
<para>
A character can be pushed back onto the stream with <literal>ungetc</literal> or <literal>ungetchar</literal>.
</para>
<itemizedlist>
<listitem><para>int ungetc(int c, file f)</para></listitem>
<listitem><para>void ungetchar(int c)</para></listitem>
</itemizedlist>
<para>
Strings can be read, a line at a time, using <literal>fgets</literal> and <literal>gets</literal>.
</para>
<itemizedlist>
<listitem><para>string fgets(file f)</para></listitem>
<listitem><para>string gets()</para></listitem>
</itemizedlist>
<para>
All of these are like their C counterparts, with the exception noted in the following section.
</para>
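<para>
As a minimal sketch, the loop below copies standard input to standard output one character at a time, using only the functions listed above together with <literal>File::end</literal>:
<informalexample><screen>
while ( ! File::end ( stdin ) )
	putchar ( getchar () );		/* read one character, write it back out */
</screen></informalexample>
</para>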
</sect2>

<sect2><title>Unicode and characters vs. bytes</title>
<para>
Unicode is a standard for representing characters, like ASCII.
However, Unicode is designed to support a much larger range of characters; in fact, every character in every alphabet worldwide.
It is optimized so that standard ASCII characters retain their ASCII codes, and characters are not larger than they have to be.
Because of these advantages, because Unicode may become more standard than ASCII, and because there is no reason not to, Nickle reads and writes Unicode.
This is entirely transparent to the user/programmer.
</para>
<para>
However, there is one situation in which the programmer will notice (disregarding the case where the programmer finds himself typing on a Sanskrit keyboard): extended characters, those that are not represented identically in ASCII and Unicode, are <emphasis>not</emphasis> one byte long; the most obscure characters can take as many as four bytes.
Therefore, unlike in C, <emphasis>characters cannot be counted on to be the same as bytes</emphasis>.
For this reason, Nickle provides the following functions:
</para>
<itemizedlist>
<listitem><para>int putb(int c,file f)</para></listitem>
<listitem><para>int getb(file f)</para></listitem>
<listitem><para>int ungetb(int c, file f)</para></listitem>
</itemizedlist>
<para>
These operate the same as <literal>putc</literal>, <literal>getc</literal>, and <literal>ungetc</literal>, but will always read or write one byte at a time, regardless of character representation.
</para>
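<para>
The difference shows up when counting the contents of a file.
The two functions below are a sketch only; <literal>count_chars</literal> and <literal>count_bytes</literal> are hypothetical names, not part of Nickle.
On a file containing only ASCII text they should agree; on a file containing extended characters, the byte count should be larger:
<informalexample><screen>
int count_chars ( string path )
{
	file f = File::open ( path, "r" );
	int n = 0;
	while ( ! File::end ( f ) ) {
		File::getc ( f );	/* consumes one whole character */
		n++;
	}
	File::close ( f );
	return n;
}

int count_bytes ( string path )
{
	file f = File::open ( path, "r" );
	int n = 0;
	while ( ! File::end ( f ) ) {
		File::getb ( f );	/* consumes exactly one byte */
		n++;
	}
	File::close ( f );
	return n;
}
</screen></informalexample>
</para>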
</sect2>

<sect2><title>Formatted I/O</title>
<para>
Nickle provides functions such as <literal>printf</literal>, <literal>sprintf</literal>, and <literal>scanf</literal> to perform formatted input and output.
These functions perform like their C counterparts, with the following exceptions:
</para>
<itemizedlist>
<listitem><para>The precision of a field in the format string may be specified to be '-', which means infinite precision.</para></listitem>
<listitem><para>The %g format specifier requires a number, and prints it in the best way possible. For example:
<informalexample><screen>
> printf("%g %g %g\n", 1, 1/3, sqrt(2));
1 0.{3} 1.414213562373095
</screen></informalexample>
</para></listitem>
<listitem><para>
The %v format specifier will attempt to find the best way to print whatever value it is given.
This is a great way to print polys whose types will not be known ahead of time.
<informalexample><screen>
> printf("%v %v %v\n", 1/3, "hello", fork 4!);
(1/3) "hello" %38
</screen></informalexample>
Notice that it can even figure out difficult things like the thread returned by 'fork'.
</para></listitem>
</itemizedlist>
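<para>
Apart from these exceptions, format strings work as a C programmer would expect.
For instance, <literal>File::sprintf</literal> builds a string instead of printing it (an illustrative session, not a captured transcript):
<informalexample><screen>
> string s = File::sprintf ( "%d apples and %s", 3, "pears" );
> s
"3 apples and pears"
</screen></informalexample>
</para>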
</sect2>

<sect2><title>At the top level</title>
<para>
Many functions in the File namespace have counterparts built in at the top level; these do not need to be imported from File because they are automatically present.
</para>
<itemizedlist>
<listitem><para>int printf(string fmt, poly args...) is the same as <literal>File::printf</literal>.</para></listitem>
<listitem><para>string sprintf(string fmt, poly args...) is the same as <literal>File::sprintf</literal>.</para></listitem>
<listitem><para>void putchar(int c) is the same as <literal>File::putchar</literal>.</para></listitem>
</itemizedlist>
<para>
File also contains a namespace called FileGlobals, which is automatically imported.
It contains the following definitions:
<informalexample><screen>
public int scanf (string format, *poly args...)
{
	return File::vfscanf (stdin, format, args);
}

public int vscanf (string format, (*poly)[*] args)
{
	return File::vfscanf (stdin, format, args);
}

public string gets ()
{
	return File::fgets (stdin);
}

public int getchar ()
{
	return File::getc (stdin);
}

public void ungetchar (int ch)
{
	File::ungetc (ch, stdin);
}
</screen></informalexample>
Thus, <literal>scanf</literal>, <literal>vscanf</literal>, <literal>gets</literal>, <literal>getchar</literal>, and <literal>ungetchar</literal> call the appropriate functions in File and return their results.
The other functions in File must be imported as normal.
</para>
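<para>
For example, <literal>open</literal> and <literal>close</literal> can be reached either by their qualified names or, after importing the namespace, without the prefix.
The session below is an illustrative sketch, and the path is hypothetical:
<informalexample><screen>
> file f = File::open ( "/tmp/nickle-demo.txt", "w" );	/* qualified name */
> File::close ( f );
> import File;
> f = open ( "/tmp/nickle-demo.txt", "w" );	/* visible without the prefix after the import */
> close ( f );
</screen></informalexample>
</para>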
</sect2>

</sect1>

--- NEW FILE: math.sgml ---
<sect1><title>Math</title>

<sect2><title>Numbers</title>
<para>
The three numeric types in Nickle--int, rational, and real--have a hierarchical relationship.
Specifically, int is a subset of rational, which is a subset of real.
Ints and rationals are stored internally in infinite precision, and printed as precisely as possible (rationals with repeating portions are represented with curly braces to allow more precision in printing; see the section on Expressions for a discussion of rational constants).
Reals are stored in finite, floating-point representations.
The mantissa is 256 bits long by default, but this length can be changed.
</para>
<para>
Whenever performing calculations, Nickle will keep numbers in their most specific format.
For example, the result of '4/2' is an int, because although the result (2) is a rational, it is also an int, and int is more specific.
Similarly, reals are not always stored in an imprecise floating-point representation; if they are known exactly, they will be represented as rationals or ints.
Nickle will only produce imprecise reals when it has to, as in square roots and logarithms.
</para>
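<para>
The following session sketches these rules; it is a reconstruction rather than a captured transcript:
<informalexample><screen>
> 4/2		/* exactly 2, so the result is an int */
2
> 1/3		/* a rational; the repeating part is printed in braces */
0.{3}
> 1/3 + 1/6	/* rational arithmetic is exact */
0.5
> sqrt ( 2 )	/* cannot be represented exactly, so an imprecise real */
1.414213562373095
</screen></informalexample>
</para>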
</sect2>

<sect2><title>Operators</title>
<para>
In order to do the Right Thing for a desk calculator, Nickle provides several operators that are not present in C; these are extremely useful.
To force division to produce an integer, even if the result would be a rational, use the '//' integer divide operator, which always rounds its results to ints.
Nickle also has an exponentiation operator '**', which behaves correctly for all exponents, including negative and fractional.
Therefore, sqrt(x) is the same as x**.5, and 1/x is the same as x**-1.
Finally, it provides a factorial operator '!'.
</para>
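<para>
A short illustrative session (a sketch, not a captured transcript) shows each of these operators:
<informalexample><screen>
> 7 / 2		/* ordinary division yields a rational */
3.5
> 7 // 2	/* integer divide yields an int */
3
> 2 ** 10
1024
> 2 ** -1	/* the same as 1/2 */
0.5
> 5!
120
</screen></informalexample>
</para>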
</sect2>

<sect2><title>The Math namespace</title>
<para>
Nickle provides the builtin namespace Math, which contains functions such as the trigonometric functions and logarithms, as well as useful constants such as <literal>pi</literal> and <literal>e</literal>.
</para>

<sect3><title>Logarithms</title>
<itemizedlist>
<listitem><para>real log ( real a )</para></listitem>
<listitem><para>real log10 ( real a )</para></listitem>
<listitem><para>real log2 ( real a )</para></listitem>
</itemizedlist>
<para>
The logarithm of <literal>a</literal> in base e, ten, and two, respectively.
<informalexample><screen>
$ nickle
> log ( Math::e )
1.000000000000000
> log10 ( 16 ) / log10 ( 4 )	/* change of base formula, log_4 16 */
1.999999999999999
> log2 ( 16 )
3.999999999999999
> 
</screen></informalexample>
</para>
</sect3>

<sect3><title>Trigonometric functions</title>
<itemizedlist>
<listitem><para>real sin ( real a )</para></listitem>
<listitem><para>real cos ( real a )</para></listitem>
<listitem><para>real tan ( real a )</para></listitem>
<listitem><para>real asin ( real a )</para></listitem>
<listitem><para>real acos ( real a )</para></listitem>
<listitem><para>real atan ( real a )</para></listitem>
</itemizedlist>
<para>
The sine, cosine, and tangent of <literal>a</literal>, and the inverse functions.
<informalexample><screen>
$ nickle
> sin ( pi ) ** 2 + cos ( pi ) **2
1
> atan ( 1 ) * 4
3.141592653589793
> 
</screen></informalexample>
</para>
</sect3>

<sect3><title>Constants</title>
<itemizedlist>
<listitem><para>protected real e</para></listitem>
<listitem><para>real pi</para></listitem>
</itemizedlist>
<para>
<literal>pi</literal> and <literal>e</literal> define the usual constants (3.14..., 2.72...).
<literal>e</literal> is protected and must be referred to as <literal>Math::e</literal>, which leaves the name <literal>e</literal> free for ordinary use.
</para>
</sect3>

</sect2>

</sect1>

--- NEW FILE: strings.sgml ---
<sect1><title>Strings</title>
<para>
Unlike in C, strings in Nickle are not arrays of or pointers to individual characters.
Consistent with its pattern of providing primitive datatypes for things that make sense (e.g. <literal>file</literal> instead of integer file handles), Nickle provides the <literal>string</literal> type.
This has several interesting differences from C-style strings:
</para>
<itemizedlist>
<listitem><para>In Nickle, strings are immutable--individual characters may not be changed.</para></listitem>
<listitem><para>
Strings are, as with everything else, assigned and passed by-value.
See the section on Copy semantics for details.
</para></listitem>
</itemizedlist>

<sect2><title>Operators</title>
<para>
Two useful operators have been overloaded to allow sane manipulation of strings: '+' and array subscript ('[]').
</para>

<sect3><title>Subscripting</title>
<para>
Although strings are not arrays of characters, it is often useful to access a string a character at a time; the array subscript operator has been overloaded to allow this.
For example:
<informalexample><screen>
> string s = "hello, world";
> s[0]
104
> s[1]
101
> s[2]
108
> s[3]
108
> s[4]
111
> 
</screen></informalexample>
</para>
<para>
Those are the integer representations of each character; they are most likely ASCII codes, but not necessarily--see the discussion of Unicode in the section on Input and Output.
The String namespace provides <literal>new</literal> to recreate a string from these integer character representations, regardless of ASCII or Unicode:
</para>
<itemizedlist>
<listitem><para>string new(int c)</para></listitem>
<listitem><para>string new(int[*] cv)</para></listitem>
</itemizedlist>
<para>
For instance,
<informalexample><screen>
> String::new(s[0])
"h"
</screen></informalexample>
</para>
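<para>
The array form works the same way, building a string from several character codes at once.
In this sketch the array <literal>cv</literal> is a hypothetical variable holding the codes for 'h' and 'i':
<informalexample><screen>
> int[*] cv = { 104, 105 };	/* the codes for 'h' and 'i' */
> String::new ( cv )
"hi"
</screen></informalexample>
</para>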
</sect3>

<sect3><title>Concatenation</title>
<para>
On strings, '+' is the concatenation operator.
For example,
<informalexample><screen>
> string s = "hello", t = "world"; 
> s = s + ", ";
> t += "!";
> s+t
"hello, world!"
</screen></informalexample>
</para>
</sect3>

</sect2>

<sect2><title>String namespace</title>
<para>
In addition, the String namespace provides several useful functions that facilitate using strings, including the following.
</para>

<sect3><title>Length</title>
<itemizedlist>
<listitem><para>int length ( string s )</para></listitem>
</itemizedlist>
<para>
Returns the number of characters in <literal>s</literal>.
For example,
<informalexample><screen>
$ nickle
> String::length ( "hello, world" ) 
12
> 
</screen></informalexample>
</para>
</sect3>

<sect3><title>Index</title>
<itemizedlist>
<listitem><para>int index ( string t, string p )</para></listitem>
<listitem><para>int rindex ( string t, string p )</para></listitem>
</itemizedlist>
<para>
<literal>index</literal> returns the index of the first occurrence of the substring <literal>p</literal> in <literal>t</literal>, or -1 if <literal>p</literal> is not in <literal>t</literal>; <literal>rindex</literal> returns the index of the last occurrence instead.
For example,
<informalexample><screen>
$ nickle
> String::index ( "hello, world", "or" ) 
8
> String::index ( "hello, world", "goodbye" )
-1
> String::rindex ( "hello, world", "o" )
8
</screen></informalexample>
</para>
</sect3>

<sect3><title>Substr</title>
<itemizedlist>
<listitem><para>string substr ( string s, int i, int l )</para></listitem>
</itemizedlist>
<para>
Returns the substring of <literal>s</literal> which begins at index <literal>i</literal> and is <literal>l</literal> characters long.
If <literal>l</literal> is negative, returns the substring of that length which precedes <literal>i</literal> instead.
For example,
<informalexample><screen>
$ nickle
> String::substr ( "hello, world", 8, 2 ) 
"or"
> String::substr ( "hello, world", 8, -4 )
"o, w"
>
</screen></informalexample>
</para>
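<para>
Because strings are immutable, "changing" a character means building a new string from pieces of the old one.
The session below is an illustrative sketch that combines <literal>substr</literal> with concatenation; the variable names are hypothetical:
<informalexample><screen>
> string s = "hello, world";
> string t = String::substr ( s, 0, 7 ) + "W" + String::substr ( s, 8, 4 );
> t
"hello, World"
> s		/* the original string is unchanged */
"hello, world"
</screen></informalexample>
</para>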
</sect3>

</sect2>

</sect1>