[Nickle] Union values

Keith Packard nickle@nickle.org
Fri, 26 Jul 2002 13:09:08 -0700


Nickle unions provide a way to get limited polymorphism with compile-time 
typechecking for most operations and explicit run-time typechecking
for others.  There are two ways to declare unions:

	union {
		<type>	<member> { , <member> ... };
		...
	}

	enum {
		<member> { , <member> ...}
	}

Now you're asking yourself what enums have to do with unions.  Borrowing 
from ML, we define the 'void' (unit in ML) type as that primitive type 
which holds only one value, '<>'.  An 'enum' type is just a union where all
of the members are of 'void' type.  These two types are equivalent:

	typedef union {
		void	a, b, c;
	} u;

	typedef enum {
		a, b, c
	} e;

(well, they're equivalent *now*, but this morning, you would have used 
'enum' in the union declaration rather than 'void').
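
As a small sketch of how such an enum might be used: the 'color' type and
its members are made up for illustration, and the value syntax is the
<type>.<member> form for void members described below.

```nickle
typedef enum {
    red, green, blue
} color;

color c = color.green;

/* void members carry no value, so comparing for equality is the only test needed */
if (c == color.green)
    printf ("green\n");
```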

Now we come to union values.  There are currently four ways of 
constructing a union value:

	<expr>.<member> = <value>
	<expr> = <type>.<member>		for void members
	<expr> = <type>.<member> ( <value> )	for non-void members
	<expr> = ( <type>.<member> ) <value>	for non-void members

Here are some examples:

	typedef union { int i; real r; void v; } u; u x;

	x.v = <>
	x = u.v
	x = u.i (10)
	x = (u.i) 10

It seems obvious to me that the ( <type>.<member> ) <value> form is
useless and should be discarded.

Another change I've just made is an attempt to make the union switch
statement easier to use.  Recalling the original syntax:

	union switch (<expr>) {
	case <member>:
		statement ...
	case <member>:
		statement ...
	...
	default:
		statement ...
	}

This is inconvenient because within the enclosed statements, there's no 
way to refer to the value associated with <member> without explicitly
declaring a local variable:

	...
	case <member>:
		<type> <name> = <expr> . <member>;
		statements ...
	...

I've extended the case syntax to include a name which is bound to the
referenced union member and whose type is the type of the named
member of <expr>.  Where <expr> is a poly, the type of <name> will be poly:

	case <member> <name>:
		statements (involving <name>) ...

Because <name> is implicitly typed, I've made it illegal to fall through 
into one of these cases:

	union switch (x) {
	case i ivalue:
		printf ("i: %d\n", ivalue);
	case r rvalue:
		printf ("r: %d\n", rvalue);
	}

This code generates a semantic error at compile time.
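
For comparison, a version that should be accepted ends each case
explicitly, so that nothing falls through into a case that binds a name.
This is only a sketch: it assumes the 'u' type and the variable 'x' from
the examples above, and that 'break' behaves in a union switch as it does
in an ordinary switch.

```nickle
union switch (x) {
case i ivalue:
    printf ("i: %d\n", ivalue);
    break;
case r rvalue:
    printf ("r: %d\n", rvalue);
    break;
case v:
    /* void member: there is no value to bind */
    printf ("v\n");
    break;
}
```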

I think this change makes unions less onerous to deal with.  If we agree, 
I'd also like to propose that NULL pointer values be eliminated from 
Nickle entirely.  They are an endless source of typechecking grief and 
are better implemented using unions with void members.

Here's what lists look like with unions:

    typedef conscell;

    typedef union {
        *conscell       ref;
        void            end;
    } list;

    typedef struct {
        poly    car;
        list    cdr;
    } conscell;

    poly car (list l)   { return l.ref->car; }
    poly cdr (list l)   { return l.ref->cdr; }
    bool nilp (list l)  { return l == list.end; }

    list cons (poly car, list cdr) {
        return list.ref (reference ((conscell) { car = car, cdr = cdr }));
    }

    list conc (list lists...) {
        list conc_help(int i) {
            switch (i) {
            case dim(lists):        return list.end;
            case dim(lists) - 1:    return lists[i];
            default:
                list conc_two (list a, list b) {
                    union switch (a) {
                    case ref r: return cons (r->car, conc_two (r->cdr, b));
                    case end:   return b;
                    }
                }
                return conc_two (lists[i], conc_help (i + 1));
            }
        }
        return conc_help (0);
    }

    list map (poly(poly) f, list l) {
        union switch (l) {
        case ref r:	return cons (f(r->car), map (f, r->cdr));
        case end:	return l;
        }
    }

    void printlist (list l) { 
        map (void func (poly car) { printf ("%d ", car); }, l);
        printf ("\n");
    }

    list l = cons (1, cons (2, cons (3, list.end)));

    list l1 = map (int func (int i) { return i + 1; }, l);

    printlist (l);
    printlist (l1);
    printlist (conc (l, l1));

Note that the 'ref' member of a list is a *reference* to a cons cell -- 
without that, Nickle's deep-copy semantics would cause lists to be copied 
across each assignment or function call.  All of the union switch 
statements above could be replaced with conditional tests:

    list map (poly(poly) f, list l) {
        if (nilp (l))
            return list.end;
        return cons (f(car(l)), map (f, cdr(l)));
    }

To some extent, that's just a matter of style.
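
To make the comparison concrete, here is a sketch of a hypothetical
'length' function (not part of the code above) written both ways.  It
reuses the list type and the nilp and cdr helpers defined earlier.

```nickle
/* union switch style: 'r' is bound to the *conscell held in the ref member */
int length (list l) {
    union switch (l) {
    case ref r: return 1 + length (r->cdr);
    case end:   return 0;
    }
}

/* conditional style: test for the end marker, then use the accessor */
int length_cond (list l) {
    if (nilp (l))
        return 0;
    return 1 + length_cond (cdr (l));
}
```

The union switch version gets at the cons cell through the bound name,
while the conditional version goes through cdr, which returns a poly and so
defers its type check to run time.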

-keith