From nickle@nickle.org Sun Jul 14 11:47:23 2002
From: nickle@nickle.org (Bart Massey)
Date: Sun, 14 Jul 2002 03:47:23 -0700
Subject: [Nickle]Nickle open() returns 0 on error
Message-ID:

------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <23548.1026643612.1@bart.cs.pdx.edu>

The Nickle open() function currently returns integer 0 on
error.  This is obviously broken, as e.g.

    file f = open("/notthere", "r");

will throw a type exception during the assignment instead of
doing something sensible.  Alternatives I see offhand are to
(1) make open() itself throw an exception, or to (2) make
open return a structured type of either fid(file) or
badfile.  It seems clear to me that (1) is the right answer
here.

Assuming that I reimplement open() to do (1), is the
attached code roughly how I would use twixt with try to
make this work smoothly?  (As if I shouldn't know. :-)

BTW, in the dept. of bad ideas

    exception x();
    try {raise x();} catch y() { printf("darn\n"); };

actually doesn't do anything: no unhandled exception x(), no
"darn", nothing.  IIRC this had something to do with y()
being treated as a function definition, but in any case, I
think things are either buggy or mis-specified here.  The
behavior I think I want:

  + catch must be supplied with an in-scope exception
    name of the correct arity and compatible arg-types.

I think we agreed there was no wildcard catch().  The real
problem here, of course, is if you slightly mis-spell the
exception name.

It's late.  I'm sleepy.  Goodnight.
Bart

------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <23548.1026643612.2@bart.cs.pdx.edu>

#!/usr/bin/env nickle

string name = "/notthere";

import File;

exception io_exception(string msg);

file myopen(string fn, string m) {
    printf("in myopen\n");
    poly result = open(fn, m);
    if (result == 0) {
        printf("oops\n");
        raise io_exception("could not open/create file");
    }
    return result;
}

printf("starting\n");
try twixt(file f = myopen(name, "r"); close(f)) {
    printf("body\n");
    string s;
    fscanf(f, "%s", &s);
    printf("got %s\n", s);
} catch io_exception(string msg) {
    fprintf(stderr, "%s: %s\n", name, msg);
    exit(1);
};

------- =_aaaaaaaaaa0--

From nickle@nickle.org Sun Jul 14 19:23:19 2002
From: nickle@nickle.org (Christine Hall)
Date: Mon, 15 Jul 2002 02:23:19 +0800 (CST)
Subject: [Nickle]http://nickle.keithp.com
Message-ID: <334CR1000008575@emaserver.trafficmagnet.net>

--224732621.1026670999312.JavaMail.SYSTEM.emaserver
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Hi

I visited http://nickle.keithp.com, and noticed that you're not listed
on some search engines! I think we can offer you a service which can
help you increase traffic and the number of visitors to your website.

I would like to introduce you to TrafficMagnet.net. We offer a unique
technology that will submit your website to over 300,000 search engines
and directories every month.

You'll be surprised by the low cost, and by how effective this website
promotion method can be.

To find out more about TrafficMagnet and the cost for submitting your
website to over 300,000 search engines and directories, visit
www.TrafficMagnet.net.

I would love to hear from you.

Best Regards,

Christine Hall
Sales and Marketing
E-mail: christine@trafficmagnet.net
http://www.TrafficMagnet.net

This email was sent to nickle@keithp.com.
I understand that you may NOT wish to receive information from me by email.
To be removed from this and other offers, simply go to the link below:
http://emaserver.trafficmagnet.net/trafficmagnet/www/optoutredirect?UC=Lead&UI=8660646

--224732621.1026670999312.JavaMail.SYSTEM.emaserver
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 7bit
--224732621.1026670999312.JavaMail.SYSTEM.emaserver--

From nickle@nickle.org Mon Jul 15 09:38:50 2002
From: nickle@nickle.org (Bart Massey)
Date: Mon, 15 Jul 2002 01:38:50 -0700
Subject: [Nickle]twixt bug?
Message-ID:

------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <26634.1026722325.1@bart.cs.pdx.edu>

This may be the same bug that I reported before and Keith
has apparently fixed but not yet checked into CVS: I don't
know.  I've got it cut down to this, but haven't chased it
further, and I'm sleepy.

Invoke the attached script as

    ./bug.5c

On my machine, it reports

    nickle: sched.c:504: ContinuationMark: Assertion `!continuation->pc || ((((InstPtr) (continuation->obj + 1)) + ( 0)) <= continuation->pc && continuation->pc <= (((InstPtr) (continuation->obj + 1)) + ( ((continuation->obj)->used - 1))))' failed.

It is very sensitive to the stack state: I couldn't cut it
down further.  I got a stack backtrace with gdb: the garbage
collector was trying to collect an apparently broken
TwixtMark.  It seems like the twixt continuation stuff is
still a bit messed up, but I'll let Keith poke at it before
I try again...

Bart

------- =_aaaaaaaaaa0
Content-Type: text/plain; name="bug.5c"; charset="us-ascii"
Content-ID: <26634.1026722325.2@bart.cs.pdx.edu>

#!/usr/bin/env nickle

import File;

int main() {
    string destcodes = sprintf("/tmp/%c.codes", 'a');
    twixt(file sf = open("/dev/zero", "r"); close(sf)) {
        string word;
        while (fscanf(sf, "%s\n", &word) == 1) {
        }
    }
    return 0;
}

main();

------- =_aaaaaaaaaa0--

From nickle@nickle.org Sat Jul 20 20:09:30 2002
From: nickle@nickle.org (Keith Packard)
Date: Sat, 20 Jul 2002 12:09:30 -0700
Subject: [Nickle]break/continue/return from catch block
Message-ID:

The question was the proper interpretation of:

    exception e (int i);

    int f ()
    {
        try
        {
            raise e (7);
        }
        catch e (int i)
        {
            return 12; /* where does this go? */
        }
        return 3;
    }

The right answer is to return from 'f', but the catch block is implemented
internally as a lambda.  I've fixed this as well as fixing 'break' and
'continue' in the same context.  Break/continue jump to the appropriate
enclosing loop construct.

Continuations are good.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

From nickle@nickle.org Sun Jul 21 07:25:54 2002
From: nickle@nickle.org (Bart Massey)
Date: Sat, 20 Jul 2002 23:25:54 -0700
Subject: [Nickle]break/continue/return from catch block
In-Reply-To: Your message of "Sat, 20 Jul 2002 12:09:30 PDT."
Message-ID:

Cool: thanks for the fix.

If I understand you correctly (not sure) you made break and
continue break out of or repeat the catch block itself.  In
any case, IMHO it should break out of or continue the
surrounding loop, i.e.

    while (1) {
        try f(); catch e() { break; }
    }

should exit the while loop...

Bart

In message you wrote:
>
> The question was the proper interpretation of:
>
> exception e (int i);
>
> int f ()
> {
>     try
>     {
>         raise e (7);
>     }
>     catch e (int i)
>     {
>         return 12; /* where does this go? */
>     }
>     return 3;
> }
>
>
> The right answer is to return from 'f', but the catch block is implemented
> internally as a lambda.  I've fixed this as well as fixing 'break' and
> 'continue' in the same context.  Break/continue jump to the appropriate
> enclosing loop construct.
>
> Continuations are good.
>
> Keith Packard        XFree86 Core Team        HP Cambridge Research Lab
>
>
>
> _______________________________________________
> Nickle mailing list
> Nickle@nickle.org
> http://nickle.org/mailman/listinfo/nickle

From nickle@nickle.org Mon Jul 22 00:30:10 2002
From: nickle@nickle.org (Keith Packard)
Date: Sun, 21 Jul 2002 16:30:10 -0700
Subject: [Nickle]break/continue/return from catch block
In-Reply-To: Your message of "Sat, 20 Jul 2002 23:25:54 PDT."
Message-ID:

Around 23 o'clock on Jul 20, Bart Massey wrote:

> If I understand you correctly (not sure) you made break and
> continue break out of or repeat the catch block itself.  In
> any case, IMHO it should break out of or continue the
> surrounding loop, i.e.
>
>     while (1) {
>         try f(); catch e() { break; }
>     }
>
> should exit the while loop...

You agree with Rob Pike and the current implementation as well.  Additional
overloading of the 'break' statement is a bad idea; it should always control
the nearest enclosing loop construct.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

From nickle@nickle.org Tue Jul 23 07:32:29 2002
From: nickle@nickle.org (Keith Packard)
Date: Mon, 22 Jul 2002 23:32:29 -0700
Subject: [Nickle]UTF8 and bool type
Message-ID:

I'd like to propose a couple of changes to nickle.  The first is to switch
the string representation to Unicode.  Strings will appear as a sequence of
32-bit Unicode values.  Strings read and written in "raw" form will be
encoded using UTF-8.  The internal representation is also UTF-8, but this
isn't visible to applications.

Three new File namespace functions are needed:

    File::getcharacter (file f)           - parse the next UTF-8 char
    File::ungetcharacter (int c, file f)  - push the specified char back as UTF-8
    File::putcharacter (int c, file f)    - output c in UTF-8 format

As always, suggestions for the names are welcome.

The second change is a bit more fundamental.  Nickle has always allowed any
type to appear in expressions where a boolean value was needed; the test
was always whether the computed value was not equal to 0.  I propose to
add a real boolean type (name bool).  This type will have two values,
'true' and 'false' and will be the required type for all primitive
conditional tests (if, while, do, for, twixt, ?:).

The 'bool' type will be extremely useful in the future as algebraic types
are added, but even now, it should catch a large class of errors which
would otherwise go undetected.
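[A minimal C sketch of what "parse the next UTF-8 char" involves, for
readers following the getcharacter proposal above.  This is not taken from
the Nickle sources: the name utf8_decode and its buffer-based interface are
hypothetical, and overlong encodings and surrogates are deliberately not
rejected.]

```c
#include <stddef.h>

/*
 * Decode one UTF-8 sequence from buf[0..len) into *cp.
 * Returns the number of bytes consumed, or 0 if the input is empty,
 * truncated, or malformed.  Hypothetical helper: it does not reject
 * overlong forms or surrogates, so it is a sketch, not a validator.
 */
size_t utf8_decode(const unsigned char *buf, size_t len, unsigned long *cp)
{
    size_t need, i;
    unsigned long c;

    if (len == 0)
        return 0;

    if (buf[0] < 0x80) {                    /* 0xxxxxxx: plain ASCII */
        c = buf[0];
        need = 1;
    } else if ((buf[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte form */
        c = buf[0] & 0x1F;
        need = 2;
    } else if ((buf[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte form */
        c = buf[0] & 0x0F;
        need = 3;
    } else if ((buf[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte form */
        c = buf[0] & 0x07;
        need = 4;
    } else {
        return 0;       /* stray continuation byte or invalid lead byte */
    }

    if (len < need)
        return 0;       /* sequence is cut short */

    for (i = 1; i < need; i++) {
        if ((buf[i] & 0xC0) != 0x80)
            return 0;   /* not a 10xxxxxx continuation byte */
        c = (c << 6) | (buf[i] & 0x3F);
    }

    *cp = c;
    return need;
}
```

[The return value doubles as the byte count, so a caller can advance its
read position by it; a "raw" byte-oriented read would always advance by
one instead.]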
-keith

From nickle@nickle.org Tue Jul 23 09:27:21 2002
From: nickle@nickle.org (Bart Massey)
Date: Tue, 23 Jul 2002 01:27:21 -0700
Subject: [Nickle]UTF8 and bool type
In-Reply-To: Your message of "Mon, 22 Jul 2002 23:32:29 PDT."
Message-ID:

Is there any reason not to simply change getc(), ungetc(),
and putc() to have the specified behavior, and have special
names for the "raw" functions (e.g. getb(), ungetb(),
putb())?  It seems to me like the UTF-8 behavior is both the
common and the "character"-oriented case.

Failing that, I vote for getuc(), ungetuc(), putuc().

Bart

In message you wrote:
>
> I'd like to propose a couple of changes to nickle.  The first is to switch
> the string representation to Unicode.  Strings will appear as a sequence of
> 32-bit Unicode values.  Strings read and written in "raw" form will be
> encoded using UTF-8.  The internal representation is also UTF-8, but this
> isn't visible to applications.
>
> Three new File namespace functions are needed:
>
>     File::getcharacter (file f)           - parse the next UTF-8 char
>     File::ungetcharacter (int c, file f)  - push the specified char back as UTF-8
>     File::putcharacter (int c, file f)    - output c in UTF-8 format
>
> As always, suggestions for the names are welcome.
>
> The second change is a bit more fundamental.  Nickle has always allowed any
> type to appear in expressions where a boolean value was needed; the test
> was always whether the computed value was not equal to 0.  I propose to
> add a real boolean type (name bool).  This type will have two values,
> 'true' and 'false' and will be the required type for all primitive
> conditional tests (if, while, do, for, twixt, ?:).
>
> The 'bool' type will be extremely useful in the future as algebraic types
> are added, but even now, it should catch a large class of errors which
> would otherwise go undetected.
>
> -keith
>
>
>
> _______________________________________________
> Nickle mailing list
> Nickle@nickle.org
> http://nickle.org/mailman/listinfo/nickle

From nickle@nickle.org Tue Jul 23 16:42:09 2002
From: nickle@nickle.org (Carl Worth)
Date: Tue, 23 Jul 2002 15:42:09 +0000
Subject: [Nickle]Sequencing of expressions with side effects
Message-ID: <15677.31057.53465.48007@scream.east.isi.edu>

I noticed that nickle includes the unary increment/decrement operators
from C and I started wondering about the sequencing of their side
effects.  In C, the behavior of these operators makes possible
statements such as:

    a = a++;

the behavior of which is undefined.

Do nickle's semantics prevent such undefined behavior?  I couldn't find
any definition of ++/-- nor any discussion of the sequencing of the
side effects of expressions.

I've done a little experimenting, and empirically, it seems that a
working definition for ++a could be (a=a+1).  The postfix form is a bit
trickier, but defining a++ as ((poly t), a=a+1, t) seems to do the
trick for all cases that I have tested.  Are those accurate
definitions?

As I was abusing nickle along these lines, I discovered an oddity with
the += operator.  The man page mentions that a += b should be the same
as a = a + b, but I found cases where that does not hold, (again
involving expressions with side effects):

    > a=1; a += ++a
    4
    > a=1; a = a + (++a)
    3

This doesn't rely on the unary increment operator appearing in the
right-hand expression.  We can replace it with the definition I used
above and still get the same result:

    > a=1; a += (a = a+1)
    4
    > a=1; a = a + (a = a+1)
    3

Is this behavior "correct"?  What are the defined semantics for
assignment side-effects within expressions in nickle?

-Carl

--
Carl Worth
USC Information Sciences Institute                 cworth@east.isi.edu
3811 N. Fairfax Dr. #200, Arlington VA 22203              703-812-3725

From nickle@nickle.org Tue Jul 23 17:10:40 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 09:10:40 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 15:42:09 -0000." <15677.31057.53465.48007@scream.east.isi.edu>
Message-ID:

Around 15 o'clock on Jul 23, Carl Worth wrote:

>     a = a++;
>
> the behavior of which is undefined.
>
> Do nickle's semantics prevent such undefined behavior? I couldn't find
> any definition of ++/-- nor any discussion of the sequencing of the
> side effects of expressions.

The intent is for the semantics to have well defined behaviour in all
cases.  The evaluation order within expressions is controlled by
precedence.  Within operators, or within a list of operators of equal
precedence, the evaluation is strictly left to right.

However, the compilation of assignment operators currently violates this
rule; the right hand side is evaluated before the address of the left hand
side:

    > int[2] a = { 3, 4 };
    > int j = 0;
    > a[j] = j++;
    > a
    [2] {3, 0}
    >

Fixing this will be a bit tricky, as the compiler must handle the multiple
assignment case:

    > b[j] = a[j++] = j++;

It looks like nickle will need to stack the lvalues until reaching the
rvalue and then unwind the stack with multiple stores.  In the case
of += style operators, the evaluation of the lvalue will be followed by
fetching the referenced value before the evaluation of the right hand side.

I think this is easy.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

From nickle@nickle.org Tue Jul 23 17:12:55 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 09:12:55 -0700
Subject: [Nickle]UTF8 and bool type
In-Reply-To: Your message of "Tue, 23 Jul 2002 01:27:21 PDT."
Message-ID:

Around 1 o'clock on Jul 23, Bart Massey wrote:

> Is there any reason not to simply change getc(), ungetc(), and putc() to
> have the specified behavior, and have special names for the "raw" functions
> (e.g. getb(), ungetb(), putb())?

Except that this breaks any existing applications communicating in binary,
it seems like a reasonable plan.  Without additional opinions, I'm tempted
to change the semantics in this way.

-keith

From nickle@nickle.org Tue Jul 23 17:32:59 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 09:32:59 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 09:10:40 PDT."
Message-ID:

Around 9 o'clock on Jul 23, Keith Packard wrote:

> It looks like nickle will need to stack the lvalues until reaching the
> rvalue and then unwind the stack with multiple stores.  In the case
> of += style operators, the evaluation of the lvalue will be followed by
> fetching the referenced value before the evaluation of the right hand side.
>
> I think this is easy.

Indeed it was quite easy.  The patch is entangled with the new 'bool' type;
is there any complaint about breaking applications with this new type?
I've found it has caught a few errors in some of my code:

    if (a & b == 0)

A classic precedence error -- the equality operator is of higher precedence
than the bitwise and operator, making this equivalent to:

    if (a & (b == 0))

The addition of a true 'bool' type not commensurate with 'int' catches this
nicely.

The only remaining question is what to name our new utf8 functions.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

From nickle@nickle.org Tue Jul 23 18:05:54 2002
From: nickle@nickle.org (Bart Massey)
Date: Tue, 23 Jul 2002 10:05:54 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 09:10:40 PDT."
Message-ID:

Naah.
If you're gonna honor the precedence of operators in
expression evaluation, you should also honor the
associativity: assignment operators associate right-to-left,
and should thus evaluate right-to-left.

The claim in the manual that

    (a += b) == (a = a + b)

should be corrected: since we're evaluating right to left,
we want

    (a += b) == (a = b + a)

IMHO this is fine.  Alternatively you could make the
compiler follow the original definition:

    a += b  -->  a = (int x1 = a) + (int x2 = b);

which would still be consistent with the principles
espoused above.  (We should in any case document += as what
it is---a right-associative operator (evaluated
right-to-left)---rather than calling it syntactic sugar.)

Sorry to break your extremely clever fix, Keith :-).  And
thanks, Carl, for the bug report!

Bart

In message you wrote:
>
> Around 15 o'clock on Jul 23, Carl Worth wrote:
>
> >     a = a++;
> >
> > the behavior of which is undefined.
> >
> > Do nickle's semantics prevent such undefined behavior? I couldn't find
> > any definition of ++/-- nor any discussion of the sequencing of the
> > side effects of expressions.
>
> The intent is for the semantics to have well defined behaviour in all
> cases.  The evaluation order within expressions is controlled by
> precedence.  Within operators, or within a list of operators of equal
> precedence, the evaluation is strictly left to right.
>
> However, the compilation of assignment operators currently violates this
> rule; the right hand side is evaluated before the address of the left hand
> side:
>
>     > int[2] a = { 3, 4 };
>     > int j = 0;
>     > a[j] = j++;
>     > a
>     [2] {3, 0}
>     >
>
> Fixing this will be a bit tricky, as the compiler must handle the multiple
> assignment case:
>
>     > b[j] = a[j++] = j++;
>
> It looks like nickle will need to stack the lvalues until reaching the
> rvalue and then unwind the stack with multiple stores.  In the case
> of += style operators, the evaluation of the lvalue will be followed by
> fetching the referenced value before the evaluation of the right hand side.
>
> I think this is easy.
>
> Keith Packard        XFree86 Core Team        HP Cambridge Research Lab
>
>
>
> _______________________________________________
> Nickle mailing list
> Nickle@nickle.org
> http://nickle.org/mailman/listinfo/nickle

From nickle@nickle.org Tue Jul 23 18:53:02 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 10:53:02 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 10:05:54 PDT."
Message-ID:

Around 10 o'clock on Jul 23, Bart Massey wrote:

> Naah.  If you're gonna honor the precedence of operators in
> expression evaluation, you should also honor the
> associativity: assignment operators associate right-to-left,
> and should thus evaluate right-to-left.

I disagree.  A strict left to right rule fixes problems with lexical scope
of identifiers:

    a = (int b = 2) ** b;

Associativity only affects implicit parenthesization, I don't think it
should affect the evaluation order.

>     a += b  -->  a = (int x1 = a) + (int x2 = b);
> which would still be consistent with the principles
> espoused above.
This isn't quite right -- 'a' must be evaluated only once in the 'a += b'
case:

    a += b  ->  (poly* ap = &a), (*ap = *ap + b)

-keith

From nickle@nickle.org Tue Jul 23 19:18:39 2002
From: nickle@nickle.org (Carl Worth)
Date: Tue, 23 Jul 2002 18:18:39 +0000
Subject: [Nickle]crashing nickle in <100 bytes
Message-ID: <15677.40447.460676.701637@scream.east.isi.edu>

While Keith and Bart try to solve my last riddle, here's a new one:

    typedef list;
    typedef struct {
        list next;
    } list;
    list burn = {next = 0};
    crash = burn;

The assignment to burn yields the following legitimate and polite
error message, (notice the missing '*' in the struct definition):

    Incompatible types 'list', 'union { int; *poly; }' in struct initializer

The assignment to crash yields the following not-so-polite message:

    Segmentation fault

-Carl

PS. Would it make sense to have a literal other than 0 to represent a
null pointer?  I've got a code segment here that I think would be
useful with a function accepting a poly that could legitimately be an
integer, (such as 0), or a null pointer.  Anything I can do for that
already?

--
Carl Worth
USC Information Sciences Institute                 cworth@east.isi.edu
3811 N. Fairfax Dr. #200, Arlington VA 22203              703-812-3725

From nickle@nickle.org Tue Jul 23 20:55:20 2002
From: nickle@nickle.org (Bart Massey)
Date: Tue, 23 Jul 2002 12:55:20 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 10:53:02 PDT."
Message-ID:

In message you wrote:
>
> Around 10 o'clock on Jul 23, Bart Massey wrote:
>
> > Naah.  If you're gonna honor the precedence of operators in
> > expression evaluation, you should also honor the
> > associativity: assignment operators associate right-to-left,
> > and should thus evaluate right-to-left.
>
> I disagree.  A strict left to right rule fixes problems with lexical scope
> of identifiers:
>
>     a = (int b = 2) ** b;

Huh?
The example you give has nothing to do with the order
of evaluation of assignment, as near as I can tell: the
exponentiation is higher-precedence than the assignment, so
the RHS ((int b = 2) ** b) evaluates however it evaluates,
and the result is bound to a.  This will happen whether you
evaluate assignment left-to-right or right-to-left.  The
inner assignment is similar.

Perhaps you're making the point that parentheses should
still dominate the order of evaluation: a parenthesized
expression should be fully evaluated before use.  Or perhaps
you meant to write

    b = (int b = 2) ** 2

which I agree is quite confusing, but IMHO unavoidable: see
below.

> Associativity only affects implicit parenthesization, I don't think it
> should affect the evaluation order.

I guess my rules are ultimately these:

  1. The meaning of an expression should never be changed by
     explicitly putting in the parentheses implied by
     precedence and associativity.

  2. Expression evaluation should honor explicit parentheses:
     parenthesized subexpressions should always be evaluated
     before their containing expressions.

I think these two rules together are sufficient to
completely constrain Nickle's order of evaluation in
expressions, and they are easy to implement and understand.
The only problem I see is that they lead to a little bit of
wartiness as discussed above: I find this NBD compared to
the alternatives.

> >     a += b  -->  a = (int x1 = a) + (int x2 = b);
> > which would still be consistent with the principles
> > espoused above.
>
> This isn't quite right -- 'a' must be evaluated only once in the 'a += b'
> case:
>
>     a += b  ->  (poly* ap = &a), (*ap = *ap + b)

Since a is required to be an lvalue, I'm not seeing how this
matters.  Maybe when a is of pointer type?
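[For readers weighing the "evaluated only once" point above: a minimal C
sketch, assuming a hypothetical lvalue expression with a side effect.
next_cell(), add_naive() and add_once() are invented for illustration and
are not Nickle internals.  The call counter shows how many times the
lvalue expression runs under each expansion.]

```c
#include <stddef.h>

static int cells[4];
static size_t cursor;   /* index handed out by the next call */
static int calls;       /* how many times the lvalue was computed */

/* An lvalue expression with a side effect, standing in for '*a++'. */
static int *next_cell(void)
{
    calls++;
    return &cells[cursor++ % 4];
}

/*
 * Naive expansion 'x = x + b': the lvalue expression appears twice, so
 * its side effect happens twice and two different cells are involved.
 * (C leaves the order of the two calls unspecified; only the count is
 * relied on here.)
 */
int add_naive(int b)
{
    calls = 0;
    *next_cell() = *next_cell() + b;
    return calls;
}

/* Expansion through a saved reference: the lvalue is computed once. */
int add_once(int b)
{
    int *ap;

    calls = 0;
    ap = next_cell();
    *ap = *ap + b;
    return calls;
}
```

[Calling add_naive(5) reports two evaluations of the lvalue and
add_once(5) reports one, which is the difference between 'a = a + b' and
the '(poly* ap = &a), (*ap = *ap + b)' form when computing 'a' has side
effects.]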
Bart

From nickle@nickle.org Tue Jul 23 21:28:51 2002
From: nickle@nickle.org (Bart Massey)
Date: Tue, 23 Jul 2002 13:28:51 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 12:55:20 PDT."
Message-ID:

In message I wrote:
>   1. The meaning of an expression should never be changed by
>      explicitly putting in the parentheses implied by
>      precedence and associativity.
>
>   2. Expression evaluation should honor explicit parentheses:
>      parenthesized subexpressions should always be evaluated
>      before their containing expressions.

The second part of number 2 is, of course, bogus.  What I
really mean is something more like

  2. Expression evaluation should honor explicit parentheses:
     parenthesized subexpressions should always be fully
     evaluated (including side effects) before their value
     is used.

And now that I think about it, Keith is right as well: the
question of order-of-evaluation is somewhat orthogonal to
the precedence + associativity question.

Sigh.  I'm confused.  Give me a bit to sort this all out.

Bart

From nickle@nickle.org Tue Jul 23 21:47:19 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 13:47:19 -0700
Subject: [Nickle]Sequencing of expressions with side effects
In-Reply-To: Your message of "Tue, 23 Jul 2002 12:55:20 PDT."
Message-ID:

Around 12 o'clock on Jul 23, Bart Massey wrote:

> >     a = (int b = 2) ** b;
>
> Huh?  The example you give has nothing to do with the order
> of evaluation of assignment, as near as I can tell

I was attempting to demonstrate the issue with the exponentiation operator
which is also right associative.  In this case, a strict left-to-right rule
ensures that when 'b' is lexically in scope, the initializer has been run.
If the RHS of the '**' operator is evaluated before the LHS, the
initializer for 'b' won't have been run.

>   1. The meaning of an expression should never be changed by
>      explicitly putting in the parentheses implied by
>      precedence and associativity.

That's true in either evaluation order.

>   2. Expression evaluation should honor explicit parentheses:
>      parenthesized subexpressions should always be evaluated
>      before their containing expressions.

But that doesn't constrain the evaluation of the LHS of a binary operator
wrt the RHS.  In a strict left-to-right world, you just start evaluating
the expression elements from left to right, executing the operators when
the appropriate operands have been evaluated:

    a = (int b = 2) ** b;

    r1 = ref(a)
    r2 = ref(b)
    r3 = 2
    *r2 = r3
    r4 = b
    r5 = r3 ** r4
    *r1 = r5

Consider this in the form of a parse tree:

            =
           / \
          a   **
             /  \
            =    b
           / \
          b   2

A left-bearing depth-first walk of the tree yields the correct order of
evaluation.

>>     a += b  ->  (poly* ap = &a), (*ap = *ap + b)
>
> Since a is required to be an lvalue, I'm not seeing how this
> matters.  Maybe when a is of pointer type?

    *a++ += b  ->  (poly *ap = a++), (*ap = *ap + b)

The important part is to compute the reference value only once.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

From nickle@nickle.org Tue Jul 23 22:07:36 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 14:07:36 -0700
Subject: [Nickle]crashing nickle in <100 bytes
In-Reply-To: Your message of "Tue, 23 Jul 2002 18:18:39 -0000." <15677.40447.460676.701637@scream.east.isi.edu>
Message-ID:

Around 18 o'clock on Jul 23, Carl Worth wrote:

    typedef list;
    typedef struct {
        list next;
    } list;
    list burn = {next = 0};
    crash = burn;

For the programmer's convenience, nickle automatically creates structures
and unions when they aren't initialized explicitly.  This automatic
initialization wasn't checking for recursive structures.

> PS. Would it make sense to have a literal other than 0 to represent a
> null pointer?  I've got a code segment here that I think would be
> useful with a function accepting a poly that could legitimately be an
> integer, (such as 0), or a null pointer

Should you be using a union instead of 'poly' in this case?  Unions give
you explicit tags and some reasonable typechecking for this kind of case.
Their syntax is a bit awkward in places; feel free to suggest improvements.

I can still imagine cases where a separate explicit nil would be useful
though.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab

From nickle@nickle.org Wed Jul 24 02:52:26 2002
From: nickle@nickle.org (Carl Worth)
Date: Wed, 24 Jul 2002 01:52:26 +0000
Subject: [Nickle]crashing nickle in <100 bytes
In-Reply-To:
References: <15677.40447.460676.701637@scream.east.isi.edu>
Message-ID: <15678.2138.436564.575480@scream.east.isi.edu>

On Jul 23, Keith Packard wrote:
> Around 18 o'clock on Jul 23, Carl Worth wrote:
>     typedef list;
>     typedef struct {
>         list next;
>     } list;
>
> For the programmers convenience, nickle automatically creates structures
> and unions when they aren't initialized explicitly.  This automatic
> initialization wasn't checking for recursive structures.

Thanks.  I take it this hasn't been committed yet though, right?

With this change, will it actually be possible to implement list
structures without using pointer datatypes, (as in the definition
above)?  That might be quite nice, although then we'll definitely need a
new nil literal for terminating lists, (assigning 0 to a structured
value object currently fails).

Hmm... with a new nil type, I wonder if it wouldn't make sense to
change what happens when objects are declared without
initializers.
For example, currently primitive datatypes and structured
values behave slightly differently:

This is legal:

    > typedef struct {int val} sv;
    > sv s;
    > sv t = s
    val =

While this causes an exception:

    > int i;
    > int j = i
    Unhandled exception "uninitialized_value"
            "Uninitialized value"

Oh, and allowing "sv t = s;" seems odd in light of the fact that
"sv t; t.val = s.val" causes an exception.

I do understand the current semantics, (now that Keith explained the
convenience creation of structures), but I'm wondering if another
approach might be more consistent.

Perhaps all uninitialized data could receive a value of nil?  (The only
difference would be that this is a value that could appear as a
literal for use in comparisons, etc.)  Then, maybe the assignment
operator would not raise an "uninitialized_value" exception on nil
values?  (Other operators could still raise the exception).

Maybe it would even make sense to eliminate the convenience creation
of structures and instead just initialize them to nil?  (Especially
since the programmer can get the empty structure by simply adding an
empty initializer, "= {};")

But I don't have strong feelings or compelling reasons for much of
that if it is difficult or causes other complications.

One thing that I would like is if I could assign a structured value
without explicitly providing its type, (in the same style as the
initializer), such as:

    sv s;
    s = {val = 3};

rather than:

    sv s;
    s = (sv) {val = 3};

But I can understand why that might not be possible.

-Carl

--
Carl Worth
USC Information Sciences Institute                 cworth@east.isi.edu
3811 N. Fairfax Dr. #200, Arlington VA 22203              703-812-3725

From nickle@nickle.org Wed Jul 24 03:43:13 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 23 Jul 2002 19:43:13 -0700
Subject: [Nickle]crashing nickle in <100 bytes
In-Reply-To: Your message of "Wed, 24 Jul 2002 01:52:26 -0000." <15678.2138.436564.575480@scream.east.isi.edu>
Message-ID:

Around 1 o'clock on Jul 24, Carl Worth wrote:

> Thanks.  I take it this hasn't been committed yet though, right?

Yeah, I went ahead and committed the whole mess.  Bool types, utf-8
getc/putc/ungetc and the fix for recursive automatic struct initializers.

> With this change, will it actually be possible to implement list
> structures without using pointer datatypes, (as in the definition
> above)?

Yes, but you can already do that with unions:

    typedef list;

    typedef union {
        void end;
        list next;
    } listref;

    typedef struct {
        listref ref;
    } list;

    listref listend = (listref.end)<>;

    list a = { ref = listend };

> Hmm... with a new nil type, I wonder if it wouldn't make sense to
> change what happens when objects are declared without
> initializers.

Nil isn't a type, it's just a value which is commensurate with every
pointer type.  I'm not sure what type it is...

> For example, currently primitive datatypes and structured values behave
> slightly differently:

I added these default initializers on Bart's request -- it was a pain to
manually initialize a nested structure so that you could start assigning to
the various members.  This makes nickle a bit friendlier to C programmers.

> Perhaps all uninitialized data could receive a value of nil?

I like the current exception when using uninitialised data; that catches
more programming errors.  Besides, there isn't one value which is
commensurate with every type in the system; nil is a pointer value.

> Maybe it would even make sense to eliminate the convenience creation
> of structures and instead just initialize them to nil?  (Especially
> since the programmer can get the empty structure by simply adding an
> empty initializer, "= {};")

One of the convenience features is that nested structures are also created
automatically.
> One thing that I would like is if I could assign a structured value > without explicitly providing its type, (in the same style as the > initializer), such as: Yes, that would be nifty, but it requires transmitting type information backwards across the assignment operator. As these values can occur in any expression context, the general solution requires fairly complete type inference which is about to be thrown out the window in favor of algebraic types. Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Wed Jul 24 08:33:42 2002 From: nickle@nickle.org (Bart Massey) Date: Wed, 24 Jul 2002 00:33:42 -0700 Subject: [Nickle]Tonight's topics Message-ID: Tonight's topics: left to right, partial and total, broken types, 0 considered harmful. Keith and I just spent several hours on the phone, concluding several things: 1) He is right, and I am wrong. Even right-associative operators should evaluate left-to-right. 2) There should be no such thing as a structured value some of whose elements are . Yet, the language should not try to guess what value things should be initialized to. Ergo, though I was fond of them, the automatic implicit struct and array creation have to go. This will ensure prompt run-time detection of trouble, and remove a serious definitional problem in the type system. 3) There should be no way to create a type with no obviously well-defined values. In the absence of partial this means that typedef x; typedef struct { x y; } x; has to go, and also its singly and mutually recursive cousins. I sincerely doubt anyone will miss them. 4) Runtime detection should be augmented by a static analysis ala Java to detect the use of potentially undefined values. Since this analysis doesn't have to be sound or complete, it can be simple and comprehensible. We will probably make it sound and conservative for everything but globals anyhow. Hopefully it will be useful rather than annoying, as I've found it to be in Java. 
5) There's a whole bunch of cruft in the design and implementation of Nickle around the "zero" value. We don't want to talk about what the language and implementation currently do. Here's how things will work when Keith's done: As in the C specification, the lexeme "0" will be magic. It will be different than, e.g., 0x0, 00, 0.0, '\0', and (1-1). It may denote any one of several constants, depending on its type context, namely the zero integer, the null value of some specific pointer type, or the zero of type poly. This last constant is a zero created by e.g. poly x = 0; and will be type-convertible at runtime to any of the other kinds of zero. The end result of this is that 0 will no longer be a source of type insecurity. For example, the following will finally be an error: int x = 0; poly y = x; *int z = y; *int x = 0; poly y = x; *string z = y; Comments on any of this are welcome. Bart From nickle@nickle.org Wed Jul 24 14:19:56 2002 From: nickle@nickle.org (Carl Worth) Date: Wed, 24 Jul 2002 13:19:56 +0000 Subject: I'm confused about type promotion (was: [Nickle]Tonight's topics) In-Reply-To: References: Message-ID: <15678.43388.811428.513926@scream.east.isi.edu> On Jul 24, Bart Massey wrote: > 2) There should be no such thing as a structured value some > of whose elements are . I agree with this. This will be a nice improvement. > Yet, the language should not try to guess what value things > should be initialized to. I agree that the language should never guess. But might it not make sense for the programmer to be able to explicitly provide default initialization values in the definition of a structured value type? I started thinking along those lines as I have been considering the promotion rules in nickle, (which, frankly I haven't quite grasped yet -- it doesn't help that I can never keep straight the directions of downward/upward promotion and sub/super-types). 
What prompted my train of thought was the surprising behavior of the
"structure compatibility rule" mentioned in the nickle tutorial,
(which is coming along quite nicely by the way -- thanks!):

    a struct value is compatible with a struct type if
    the struct value contains all of the entries in the
    type. For example:

        typedef struct {
            int i;
            real r;
        } i_and_r;
        typedef struct {
            int i;
        } just_i;

        i_and_r i_and_r_value = { i = 12, r = 37 };
        just_i i_value;
        i_value = i_and_r_value;

    The assignment is legal because i_and_r contains all of
    the elements of just_i. i_value will end up with both
    'i' and 'r' values.

I'm still a bit confused about the state of the variable i_value at
the conclusion of this exercise. Empirically, it seems to be of type
just_i still since I can pass it to a function accepting an argument
of type just_i. Also, I cannot access the r field directly from it.

However, I can also do things with this variable that I cannot do
with a straight value of type just_i, for example, I can assign it to
a variable of type i_and_r, (assigning both the i and the r values).

So what kind of object do we have at this point? It seems to be a
strange mix of both types, (which, admittedly, I can see some uses
for).

I tried to compare this with what happens between the int and rational
types which seem to have a similar relationship as the just_i and
i_and_r types above. But, it's an error to try a similar assignment
with these types:

> rational r = 12 / 37;
> int i;
> i = r
Unhandled exception "invalid_argument"
        "Incompatible types in assignment"

However, assignment in the other direction works of course, (while
failing for the structure values):

> int i = 12;
> rational r = i;

> just_i i = { i = 12};
> i_and_r r = i;
Unhandled exception "invalid_argument"
        "Incompatible types in assignment"

I may be thinking about this the wrong way, but it seems that
assigning an int value to a rational works because nickle "knows" the
correct value for the missing field.
So, perhaps, (with no guessing), nickle could allow the programmer to specify default initialization values so that similar behavior could also be possible for structured values? That might even make it possible for Bart to keep his convenient automatic structured value creations, no? > Ergo, though I was fond of them, the automatic implicit struct > and array creation have to go. This will ensure prompt run-time > detection of trouble, and remove a serious definitional problem > in the type system. -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Wed Jul 24 14:27:19 2002 From: nickle@nickle.org (Carl Worth) Date: Wed, 24 Jul 2002 13:27:19 +0000 Subject: [Nickle]Recursive datatypes (was: Tonight's topics) In-Reply-To: References: Message-ID: <15678.43831.744815.824370@scream.east.isi.edu> On Jul 24, Bart Massey wrote: > 3) There should be no way to create a type with no obviously > well-defined values. Agreed. > In the absence of partial > this means that > typedef x; > typedef struct { > x y; > } x; > has to go, and also its singly and mutually recursive > cousins. I sincerely doubt anyone will miss them. While I could probably live without this, what's the fundamental difference between that struct definition and the following: typedef struct { poly y; } x; where I can still assign y a value of type x? It seems the first definition would actually be better for this purpose as it provides type checking. Maybe the difference is that there is a defined "zero of type poly" but no "zero of type structured value"? If so, might it make sense to create a new zero? -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. 
#200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Wed Jul 24 19:11:34 2002 From: nickle@nickle.org (Keith Packard) Date: Wed, 24 Jul 2002 11:11:34 -0700 Subject: I'm confused about type promotion (was: [Nickle]Tonight's topics) In-Reply-To: Your message of "Wed, 24 Jul 2002 13:19:56 -0000." <15678.43388.811428.513926@scream.east.isi.edu> Message-ID: Around 13 o'clock on Jul 24, Carl Worth wrote: > What prompted my train of thought was the surprising behavior of the > "structure compatibility rule" mentioned in the nickle tutorial, > (which is coming along quite nicely by the way -- thanks!): > > a struct value is compatible with a struct type if > the struct value contains all of the entries in the > type. This is the extension of the rule allowing assignment of an integer type to a real type: int i; real r; i = 1; r = i; The essential behaviour is that the variable 'r' can perform any operation on the new value that it could perform on any other real value. Just so with structures: typedef struct { int i; real r } i_and_r; typedef struct { int i; } just_i; i_and_r i_and_r_value = { i = 12, r = 37 }; just_i i_value; i_value = i_and_r_value; Anywhere the program uses 'just_i' types, i_and_r_value will perform correctly. This is statically checkable. The funny part comes when you want to go the other direction: i_and_r_value = i_value; This is permitted by the type system as an explicit narrowing of the type, just as in: real r = 7; int i; i = r; Both of these cases involve a run-time check to make sure the narrowing involves a value of a compatible type. In the 'int = real' case, the program checks to make sure the value is an int, while in the 'i_and_r_value = i_value' case, the program checks to make sure 'i_value' contains a struct with both 'i' and 'r' fields of the correct types. > I tried to compare this with what happens between the int and rational > types which seem to have a similar relationship as the just_i and > i_and_r types above. 
But, it's an error to try a similar assignment
> with these types:
>
> > rational r = 12 / 37;
> > int i;
> > i = r
> Unhandled exception "invalid_argument"
>         "Incompatible types in assignment"

Note that this generates a run-time exception, not a compile-time type
error. That's because the narrowing is allowed at compile time but
explicitly checked at runtime. Try instead:

> rational r = 12 / 3;
> int i;
> i = r;

Note here that the run-time check isn't simply ensuring that the value
stored in 'r' was originally an 'int', it's actually checking that the
computation resulting in the value produced a value that can be exactly
represented as an int:

> rational r = 12 / 37;
> int i;
> i = r * 37;

I suspect one source of your confusion is the super/sub type relations
among the numeric and structured types.

A super type is a type which can represent all of the values of its
subtypes, plus some additional values. That's most obvious in the numeric
types where 'real' is a super-type of 'rational' which is a super-type of
'int'. In general, it's always statically type-safe to use a sub-type any
place its super-type is used.

Extending this to structures means that a structure with *more* elements
is a sub-type of a structure with *fewer* elements; that's the way the
static type safety is preserved -- any place the structure type with fewer
elements is used can obviously be satisfied by any value containing more
elements.

Think of an object system with sub-classes -- the sub-class always extends
the super-class with *more* members and methods.

> So, perhaps, (with no guessing), nickle could allow the programmer to
> specify default initialization values so that similar behavior could also
> be possible for structured values?
Yes, we could extend the structure type to include default values for all of the structure elements: typedef struct { int i = 12; real r = 7.1; } i_and_r; But, Bart's plan is to leave the structure variable uninitialized and statically analyse the program to ensure the variable is initialized before use. This will catch errors instead of masking them with possibly incorrect values. If you want to have a default value for your structure, you can just store it in a global variable when you declare the type. i_and_r i_and_r_value; if (one_way) i_and_r_value = (i_and_r) { i = 1, r = 2 }; else i_and_r_value = (i_and_r) { i = 2, r = 1 }; return i_and_r_value; This will pass the simple static analysis, while a future change: i_and_r i_and_r_value; if (one_way) i_and_r_value = (i_and_r) { i = 1, r = 2 }; else if (the_other_way) i_and_r_value = (i_and_r) { i = 2, r = 1 }; return i_and_r_value will elicit a compiler error. Java does this and describes the simple static analysis so that programmers aren't (too) surprised. Our analyser will be complicated by closures and local functions: int() foo () { int j; int bar () { return j; } j = 7; return bar; } > That might even make it possible for Bart to keep his convenient > automatic structured value creations, no? The automatic structured value creations are more likely to mask bugs than help the programmer. -keith From nickle@nickle.org Wed Jul 24 19:26:00 2002 From: nickle@nickle.org (Keith Packard) Date: Wed, 24 Jul 2002 11:26:00 -0700 Subject: [Nickle]Recursive datatypes (was: Tonight's topics) In-Reply-To: Your message of "Wed, 24 Jul 2002 13:27:19 -0000." 
<15678.43831.744815.824370@scream.east.isi.edu> Message-ID: Around 13 o'clock on Jul 24, Carl Worth wrote: > typedef x; > typedef struct { > x y; > } x; > While I could probably live without this, what's the fundamental > difference between that struct definition and the following: It's not a matter of "probably" living without recursive structures, the fact is you can't actually create a value of that type -- the representation of this type is of infinite size. The current kludge is to ensure that the automatic structure value creation code doesn't recurse in this case, but that leaves us with a structure containing undefined values, that's what we're trying to eliminate. The way around this type problem is to use a type which can be represented with more than one kind of value. > typedef struct { > poly y; > } x; Now I can represent values of this type easily: x foo = { y = 7 }; x bar = { y = (x) { y = (x) { y = <> } } }; To avoid pure polymorphism, you can use a union type instead: typedef list; typedef union { list next; void end; } listref; typedef struct { listref ref; } list; list l = { ref = (listref.next) (list) { ref = (listref.end) <> } }; union switch (l.ref) { case next: printf ("next\n"); break; case end: printf ("end\n"); break; } if (l.ref == listend) printf ("yes\n"); l.ref.next.ref.next = (list) { ref = listend }; ML has a cleaner syntax for unions which makes this look less awkward. The advantage of unions is you can retain most of the compile-time typechecking, and the run-time typechecking is all explicitly labeled in the code. -keith From nickle@nickle.org Wed Jul 24 19:49:59 2002 From: nickle@nickle.org (Bart Massey) Date: Wed, 24 Jul 2002 11:49:59 -0700 Subject: [Nickle]Recursive datatypes (was: Tonight's topics) In-Reply-To: Your message of "Wed, 24 Jul 2002 11:26:00 PDT." Message-ID: An important distinction is between the runtime type of a value, and the static analysis that ensures that runtime type errors will not occur. 
When you use poly, you turn off the latter. So with typedef struct { poly f; } t; you can store in the field f anything you want from the compiler's point of view, including t x = {f = <>}; t y = {f = x}; y is now a value that looks like global t y = {f = {f = <>}}; What is the type of this value of y? It is typedef struct { struct {void f;} f; } t_y; Indeed, y holds the only legal value of this type. Thus t_y z = y; works as expected: Nickle can't rule out this assignment at compile time, and it checks out OK at runtime. Note that t_y is not a recursive type: just a nested structure type. You cannot create infinitely nested (i.e. recursive) values in (new) Nickle, because there is no way to "tie the knot", so you'd have to type an infinitely long initializer in. HTH, Bart In message you wrote: > > Around 13 o'clock on Jul 24, Carl Worth wrote: > > > typedef x; > > typedef struct { > > x y; > > } x; > > > While I could probably live without this, what's the fundamental > > difference between that struct definition and the following: > > It's not a matter of "probably" living without recursive structures, the > fact is you can't actually create a value of that type -- the > representation of this type is of infinite size. The current kludge is to > ensure that the automatic structure value creation code doesn't recurse in > this case, but that leaves us with a structure containing undefined > values, that's what we're trying to eliminate. > > The way around this type problem is to use a type which can be represented > with more than one kind of value. 
> > > typedef struct { > > poly y; > > } x; > > Now I can represent values of this type easily: > > x foo = { y = 7 }; > x bar = { y = (x) { y = (x) { y = <> } } }; > > To avoid pure polymorphism, you can use a union type instead: > > typedef list; > > typedef union { > list next; > void end; > } listref; > > typedef struct { > listref ref; > } list; > > list l = { ref = (listref.next) (list) { ref = (listref.end) <> } }; > > union switch (l.ref) { > case next: printf ("next\n"); break; > case end: printf ("end\n"); break; > } > > if (l.ref == listend) printf ("yes\n"); > > l.ref.next.ref.next = (list) { ref = listend }; > > ML has a cleaner syntax for unions which makes this look less awkward. The > advantage of unions is you can retain most of the compile-time > typechecking, and the run-time typechecking is all explicitly labeled in > the code. From nickle@nickle.org Wed Jul 24 20:38:38 2002 From: nickle@nickle.org (Keith Packard) Date: Wed, 24 Jul 2002 12:38:38 -0700 Subject: [Nickle]Tonight's topics In-Reply-To: Your message of "Wed, 24 Jul 2002 00:33:42 PDT." Message-ID: Around 0 o'clock on Jul 24, Bart Massey wrote: > 3) There should be no way to create a type with no obviously > well-defined values. In the absence of partial > this means that > typedef x; > typedef struct { > x y; > } x; > has to go, and also its singly and mutually recursive > cousins. I sincerely doubt anyone will miss them. Ok, so the question is how to specify this limitation. The limitation devolves down to restrictions on the use of forward-declared types (the bare 'typedef x'). Here's what I think the restrictions should be: 1) You can create a reference type to a forward declared type: typedef x; typedef *x y; That's because '0' is always a legal pointer value which can terminate any possible value recursion. 
2) You can create a union containing a forward declared type, but the union must contain one non forward-declared type: typedef list; typedef union { void end; list next; } listref; This restriction ensures that there is always a non-recursive value possible. I think that these two restrictions ensure that all types have finite value representations. -keith From nickle@nickle.org Wed Jul 24 22:05:36 2002 From: nickle@nickle.org (Jamey Sharp) Date: 24 Jul 2002 14:05:36 -0700 Subject: [Nickle]Re: structure compatibility (was: I'm confused about type promotion) In-Reply-To: <15678.43388.811428.513926@scream.east.isi.edu> References: <15678.43388.811428.513926@scream.east.isi.edu> Message-ID: <1027544736.552.142.camel@zoo> On Wed, 2002-07-24 at 06:19, Carl Worth wrote: > What prompted my train of thought was the surprising behavior of the > "structure compatibility rule" mentioned in the nickle tutorial, > (which is coming along quite nicely by the way -- thanks!): > > a struct value is compatible with a struct type if > the struct value contains all of the entries in the > type. This sounds to me like a horrible misfeature. In XCB, we had to do non-idiomatic single-element structures to get C's type system to properly check XIDs. It sounds like even that is complicated by nickle's type system, requiring the programmer to ensure that no other structure has a field of the same name. Doesn't this mean that "XCB-Nickle" would need types with horrible long element names like these? typedef struct { int XCBFONTSeqnum; } FONT; typedef struct { int XCBWINDOWSeqnum; } WINDOW; And doesn't it mean that even with this effort, some library developer down the line could introduce serious type-safety errors into XCB? Promotion between structure types considered harmful, in short, unless the language allows the programmer to say explicitly that two structure types are related. 
-- Jamey Sharp - http://jamey.is.dreaming.org/ From nickle@nickle.org Wed Jul 24 22:33:40 2002 From: nickle@nickle.org (Keith Packard) Date: Wed, 24 Jul 2002 14:33:40 -0700 Subject: [Nickle]Re: structure compatibility (was: I'm confused about type promotion) In-Reply-To: Your message of "24 Jul 2002 14:05:36 PDT." <1027544736.552.142.camel@zoo> Message-ID: Around 14 o'clock on Jul 24, Jamey Sharp wrote: > > a struct value is compatible with a struct type if > > the struct value contains all of the entries in the > > type. > > This sounds to me like a horrible misfeature. This is how you deal with subtyping in a structural equivalence world. In a name equivalence world, you'd need special syntax to subtype structures: typedef struct { int i; } i_only; typedef struct extend i_only { real r; } i_and_r; Nickle uses structural equivalence to make interactive interaction easier; name equivalence means that redefining types breaks interfaces: typedef struct { int i; } q; q f () { return (q) { i = 7; } } typedef struct { int i; } q; q x; x = f(); With name equivalence, the redefinition of 'q' can either: a) create a new typedef, scoping out the old b) insist that the two types match a) means the example above would generate a type mismatch error, b) would mean you could never redefine types. Structural equivalence finesses these issues by comparing the type structure instead of the type pointers. I would personally prefer name equivalence, but I don't want to wreck Nickle for interactive use. 
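The redefinition problem described above can be sketched outside Nickle. What follows is only a rough Python analogue, not Nickle code and not how the Nickle implementation works: make_q, same_name_type and same_structure are hypothetical helpers, and each call to make_q stands in for typing the typedef at the prompt again.

```python
from dataclasses import dataclass, fields


def make_q():
    """Define a fresh struct-like type named q with a single int field i."""
    @dataclass
    class q:
        i: int
    return q


def same_name_type(value, typ):
    """Name equivalence: the value must belong to this exact type object."""
    return type(value) is typ


def same_structure(value, typ):
    """Structural equivalence: compare field names and declared types only."""
    def shape(t):
        return {f.name: f.type for f in fields(t)}
    return shape(type(value)) == shape(typ)


q_old = make_q()            # first "typedef struct { int i; } q;"


def f():
    return q_old(i=7)       # f was defined against the first q


q_new = make_q()            # the same typedef entered a second time

x = f()
print(same_name_type(x, q_new))   # False
print(same_structure(x, q_new))   # True
```

Under the name-based check the value returned by f no longer matches the retyped q, which is case a) above; the structural check accepts it. The same structural check would also accept any other one-field struct whose field is an int named i, which is the trade-off raised earlier in the thread.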
C uses a mixture of structural and name equivalence: typedef int x_type; typedef int y_type; typedef struct { int i; } xs_type; typedef struct { int i; } ys_type; foo () { x_type x_value; y_type y_value; xs_type xs_value; ys_type ys_value; y_value = x_value; /* structural equivalence */ ys_value = xs_value; /* name equivalence */ } XCB uses the name equivalence of structure "feature" to gain some measure of type safety; if C had instead used pure name equivalence, XCB wouldn't have needed to resort to the synthetic structure type and could have just created a new typedef of CARD32. Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Thu Jul 25 01:59:50 2002 From: nickle@nickle.org (Carl Worth) Date: Thu, 25 Jul 2002 00:59:50 +0000 Subject: [Nickle]I think I've finally got it (was: I'm confused about type promotion) In-Reply-To: References: <15678.43388.811428.513926@scream.east.isi.edu> Message-ID: <15679.19846.102443.953807@scream.east.isi.edu> Keith, Many thanks for all the patient explanations. On Jul 24, Keith Packard wrote: > I suspect one source of your confusion is the super/sub type relations > among the numeric and structured types. This is exactly what had me tripped up. There's a substantive difference between these two relations, (in the way the values of the sub-type are constrained as compared to the super-type). > A super type is a type which can represent all of the values of > it's subtypes, plus some additional values. This always made sense to me, but I had to read the following description of the super/sub relation among structured types about a dozen times before realizing it didn't contradict this definition, (it seemed to me that adding fields allowed more values not fewer -- but I finally figured it out). 
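One way to see why more fields means fewer values is to write the compatibility check down. The sketch below is a Python analogue under stated assumptions, not Nickle: struct types are modelled as dicts from field name to Python type, struct values as plain dicts, and compatible and narrow_to_int are hypothetical helper names. The 'real' field is modelled with a Python float, so 37 is written as 37.0.

```python
from fractions import Fraction

# Hypothetical encodings: a struct type maps field name -> Python type,
# and a struct value maps field name -> value.
JUST_I = {"i": int}
I_AND_R = {"i": int, "r": float}


def compatible(value, struct_type):
    """True if value carries every field the type asks for, suitably typed."""
    return all(
        name in value and isinstance(value[name], field_type)
        for name, field_type in struct_type.items()
    )


def narrow_to_int(x):
    """Run-time checked narrowing: succeed only if x is exactly an integer."""
    x = Fraction(x)
    if x.denominator != 1:
        raise ValueError("Incompatible types in assignment")
    return x.numerator


wide = {"i": 12, "r": 37.0}   # a value with both fields
thin = {"i": 12}              # a value with only i

print(compatible(wide, JUST_I))    # True
print(compatible(wide, I_AND_R))   # True
print(compatible(thin, JUST_I))    # True
print(compatible(thin, I_AND_R))   # False

print(narrow_to_int(Fraction(12, 37) * 37))   # 12
```

Every value that passes the I_AND_R check also passes the JUST_I check, but not the reverse, so the type with more fields admits the smaller set of values and is the sub-type. The last line mirrors the earlier 'i = r * 37' example: the narrowing looks at the computed value, not at how it was declared.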
My confusion came from imagining trying to do something like the following with structured values: typedef struct { real r; } Real; typedef struct { real r; real i; } Complex; Which yields Complex as a sub-type of Real. With these definitions it's always legal to assign a Complex value to a variable of type Real, and it's only legal to assign from a Real variable to a Complex variable if the Real variable happens to be holding values for both r and i. Those aren't the most useful semantics for this situation of course. Instead, if the numeric types included a complex type, it would actually be a super-type of real. Then it would always be legal to assign a real value to a variable of type complex, and only legal to assign from a complex variable to a real variable if the complex variable happens to be holding a real value. Those would be the ideal semantics for dealing with real and complex types/values, but structured value types don't allow for the same kind of power that the builtin numeric types have. -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Thu Jul 25 11:57:16 2002 From: nickle@nickle.org (Carl Worth) Date: Thu, 25 Jul 2002 10:57:16 +0000 Subject: [Nickle]"purely" functional nickle (was: Recursive datatypes) In-Reply-To: References: <15678.43831.744815.824370@scream.east.isi.edu> Message-ID: <15679.55692.301216.985253@scream.east.isi.edu> On Jul 24, Keith Packard wrote: > The way around this type problem is to use a type which can be represented > with more than one kind of value. > > To avoid pure polymorphism, you can use a union type instead: > > if (l.ref == listend) printf ("yes\n"); I suppose that was intended to be: if (l.ref == (listref.end) <>) printf ("yes\n"); which is indeed a rather awkward syntax. Other than the syntax, I do like the other characteristics of using a union for this purpose. I ended up just using poly. 
I made a global struct value named nil as
a placeholder for list termination. When the new poly zero is in
place, I can get rid of my nil. Using poly is actually the right thing
for what I'm doing, (which involves implementing a purely polymorphic
language in nickle).

What I've been trying is a little experiment in "pure functional"
programming in nickle. I've been going through a 1984 paper by John
Hughes, "Why Functional Programming Matters"[1], implementing each
example. With a small module defining cons, nil, and functional
versions of all the operators it was a simple matter to directly
implement the first set of examples:

/* fp.5c */
namespace FP {
    public typedef struct {
        poly car;
        poly cdr;
    } cons_cell;

    public global cons_cell nil = { car = 0, cdr = 0 };

    public cons_cell cons(car, cdr) {
        return (cons_cell){car = car, cdr = cdr};
    }

    public poly add(a, b) { return a + b; }
    public poly mult(a, b) { return a * b; }
    /* etc. for all other nickle operators */
}

/* fptest.5c */
/* load "fp.5c"; *** Commented out for your cut-and-paste pleasure */
import FP;

poly reduce(poly(poly,poly) binop, poly basis, cons_cell list) {
    if (list == nil) {
        return basis;
    } else {
        return binop(list.car, reduce(binop, basis, list.cdr));
    }
}

poly sum(cons_cell list) { return reduce(add, 0, list); }
poly product(cons_cell list) { return reduce(mult, 1, list); }
cons_cell append(cons_cell a, cons_cell b) { return reduce(cons, b, a); }

l1 = cons(1, cons(2, nil));
l2 = cons(3, cons(4, nil));
list = append(l1, l2);

printf("sum is %d\n", sum(list));
printf("product is %d\n", product(list));

That's all very straightforward.

The next example revealed something interesting. He starts with a
single function to double each element in a list, then progressively
decomposes it in order to arrive at a definition for map.
I was able to implement the version with map like so: poly(poly,poly) compose(poly(poly,poly) f, poly(poly) g) { poly composite(car, cdr) { return f(g(car), cdr); } return composite; } poly map(poly(poly) f, cons_cell list) { return reduce(compose(cons, f), nil, list); } poly double(arg) { return mult(2, arg); } cons_cell doubleall(list) { return map(double, list); } list2 = doubleall(list); This works, but the definition of compose isn't very satisfying. This compose is restricted to accepting one function with two arguments and a second with one argument. A more powerful compose would be able to accept functions with any number of arguments and just do the Right Thing. That is, if f accepts M arguments and g accepts N arguments, then compose would return a function accepting N + (M-1) arguments that would return f(g(arg1, ..., argN), argN+1, ..., argN+M-1). I don't suppose there would be any way to write such a beast in nickle? Oh, and in a similar vein, is there any way to call one function with a variable length argument list from another that also has a variable length argument list. Something like: int foo(int args ...) { /* ... */ return 0; } int bar(int args ...) { return foo(args); /* How to make something like this work? */ } Would we want some new bit of syntax that turns an array into a list of arguments for a function call? By the way, thanks for nickle. I'm having a lot of fun poking and prodding at it like this. -Carl [1] http://www.math.chalmers.se/~rjmh/Papers/whyfp.html -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Thu Jul 25 16:30:06 2002 From: nickle@nickle.org (Keith Packard) Date: Thu, 25 Jul 2002 08:30:06 -0700 Subject: [Nickle]"purely" functional nickle (was: Recursive datatypes) In-Reply-To: Your message of "Thu, 25 Jul 2002 10:57:16 -0000." 
<15679.55692.301216.985253@scream.east.isi.edu> Message-ID: Around 10 o'clock on Jul 25, Carl Worth wrote: > > To avoid pure polymorphism, you can use a union type instead: > > > > if (l.ref == listend) printf ("yes\n"); > > I suppose that was intended to be: > > if (l.ref == (listref.end) <>) printf ("yes\n"); I'd defined 'listend' as a variable holding that value so that the syntax wasn't quite so horrible. > I ended up just using poly. I made a global struct value named nil as > a placeholder for list termination. When the new poly zero is in > place, I can get rid of my nil. You can use 0 now; all that the new poly zero is going to do is permit some run-time typechecking of pointer types. Using poly is certainly easier at the moment; the union syntax is just too painful. > That is, if f accepts M arguments and g accepts N arguments, > then compose would return a function accepting N + (M-1) arguments > that would return f(g(arg1, ..., argN), argN+1, ..., argN+M-1). > > I don't suppose there would be any way to write such a beast in > nickle? Nickle doesn't have the necessary reflective operators; you'd need some way to talk about the number of arguments expected by each function. Given that, we could easily add some syntax like: int foo (int a, int b) { return a + b; } int bar (int args...) { foo (args...); } Now you could decompose the argument list into two arrays and call the functions. 
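To make the two missing pieces concrete (a way to ask a function how many arguments it expects, and a way to pass an array as an argument list), here is a sketch in Python, which happens to have both. It is an analogue only, not Nickle and not a proposal for Nickle syntax; arity and compose are hypothetical names, and the sketch assumes plain functions with positional parameters only.

```python
import inspect


def arity(fn):
    """Number of parameters fn declares (the reflective step)."""
    return len(inspect.signature(fn).parameters)


def compose(f, g):
    """Return h with h(a1..aN, b1..bM-1) == f(g(a1..aN), b1..bM-1)."""
    n = arity(g)
    m = arity(f)

    def composite(*args):
        if len(args) != n + m - 1:
            raise TypeError(f"expected {n + m - 1} arguments, got {len(args)}")
        # Split the argument array in two and pass each part as an argument list.
        return f(g(*args[:n]), *args[n:])

    return composite


def cons(car, cdr):
    return (car, cdr)


def double(x):
    return 2 * x


def add(a, b):
    return a + b


cons_double = compose(cons, double)   # one argument for double, one left for cons
print(cons_double(3, None))           # (6, None)

add_then_cons = compose(cons, add)    # two arguments for add, one left for cons
print(add_then_cons(1, 2, None))      # (3, None)
```

The first composite has the same shape as the compose(cons, f) used for map earlier in the thread; the second shows the N + (M-1) case with N = 2.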
Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Thu Jul 25 18:20:29 2002 From: nickle@nickle.org (Carl Worth) Date: Thu, 25 Jul 2002 17:20:29 +0000 Subject: [Nickle]Removing type names from literal structured values (was: Re: crashing nickle in <100 bytes) In-Reply-To: References: <15678.2138.436564.575480@scream.east.isi.edu> Message-ID: <15680.13149.780534.682637@scream.east.isi.edu> On Jul 23, Keith Packard wrote: > > One thing that I would like is if I could assign a structured value > > without explicitly providing its type, (in the same style as the > > initializer) > > Yes, that would be nifty, but it requires transmitting type information > backwards across the assignment operator. As these values can occur in > any expression context, the general solution requires fairly complete type > inference which is about to be thrown out the window in favor of algebraic > types. I don't actually want the complete type interference. Here's where I'm at as a programmer: 1) I like to provide types for variables, function arguments, and function returns. 2) I don't like having to provide the types for literal structured values when they are used in any of the above contexts. 3) I don't mind having to provide the type if a literal structured value is used in other contexts within an expression. So, my proposal would only reflect type information backwards across assignment, passing arguments to functions, and return statements. Would that be reasonable? If nickle supported this, then anonymous arrays could provide a more flexible approach to variable length argument lists with minimal extra typing on the part of the programmer: int foo(int[*] args) { ... } int bar(int[*] args, int[*] other_args) { foo(args); ... } bar({1, 2, 3}, {4, 5, 6}); That would be most pleasant. What do you think? I might not go as far as getting rid of the current variable-length argument list functionality in favor of anonymous arrays. 
I don't know if I could ever get used to typing extra braces in printf. ;-) So, even with this, I'd probably still want the new "args ..." syntax for passing an array to a function accepting "args ...". -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Fri Jul 26 06:57:11 2002 From: nickle@nickle.org (Bart Massey) Date: Thu, 25 Jul 2002 22:57:11 -0700 Subject: [Nickle]Removing type names from literal structured values (was: Re: crashing nickle in <100 bytes) In-Reply-To: Your message of "Thu, 25 Jul 2002 17:20:29 -0000." <15680.13149.780534.682637@scream.east.isi.edu> Message-ID: Maybe can't do it if you want parametric polymorphism more (and you do). Sorry. More later when I have more typing time. Bart In message <15680.13149.780534.682637@scream.east.isi.edu> you wrote: > On Jul 23, Keith Packard wrote: > > > One thing that I would like is if I could assign a structured value > > > without explicitly providing its type, (in the same style as the > > > initializer) > > > > Yes, that would be nifty, but it requires transmitting type information > > backwards across the assignment operator. As these values can occur in > > any expression context, the general solution requires fairly complete type > > inference which is about to be thrown out the window in favor of algebraic > > types. > > I don't actually want the complete type interference. Here's where I'm > at as a programmer: > > 1) I like to provide types for variables, function arguments, and > function returns. > > 2) I don't like having to provide the types for literal structured > values when they are used in any of the above contexts. > > 3) I don't mind having to provide the type if a literal structured > value is used in other contexts within an expression. > > So, my proposal would only reflect type information backwards across > assignment, passing arguments to functions, and return > statements. 
Would that be reasonable? > > If nickle supported this, then anonymous arrays could provide a more > flexible approach to variable length argument lists with minimal extra > typing on the part of the programmer: > > int foo(int[*] args) { > ... > } > > int bar(int[*] args, int[*] other_args) { > foo(args); > ... > } > > bar({1, 2, 3}, {4, 5, 6}); > > That would be most pleasant. What do you think? > > I might not go as far as getting rid of the current variable-length > argument list functionality in favor of anonymous arrays. I don't > know if I could ever get used to typing extra braces in printf. ;-) > > So, even with this, I'd probably still want the new "args ..." syntax > for passing an array to a function accepting "args ...". > > -Carl > > -- > Carl Worth > USC Information Sciences Institute cworth@east.isi.edu > 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 > > _______________________________________________ > Nickle mailing list > Nickle@nickle.org > http://nickle.org/mailman/listinfo/nickle From nickle@nickle.org Fri Jul 26 12:32:46 2002 From: nickle@nickle.org (Carl Worth) Date: Fri, 26 Jul 2002 11:32:46 +0000 Subject: [Nickle]Declaring a variable name, replacing previously defined type? Message-ID: <15681.13150.897787.503643@scream.east.isi.edu> Would this be reasonable to support? > typedef foo; > int foo; parse error before "foo" It would be handy in interactive settings for people like me that have trouble coming up with more than two unique identifiers. I'm not awake enough yet to think about whether this would introduce ambiguities in the grammar or any other potential problem. -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Fri Jul 26 17:38:21 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 09:38:21 -0700 Subject: [Nickle]Declaring a variable name, replacing previously defined type? 
In-Reply-To: Your message of "Fri, 26 Jul 2002 11:32:46 -0000." <15681.13150.897787.503643@scream.east.isi.edu> Message-ID: Around 11 o'clock on Jul 26, Carl Worth wrote: > Would this be reasonable to support? > > > typedef foo; > > int foo; > parse error before "foo" Not with our current syntax. Handling typedef-style names requires help from the lexer -- each lexed symbol is looked up in the symbol table and different tokens returned to the parser depending on the kind of symbol. It's the same hack that C compilers use because of the limitations of an LALR parser. You can use 'undefine', but that emits an error when the symbol isn't currently defined. Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Fri Jul 26 21:09:08 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 13:09:08 -0700 Subject: [Nickle]Union values Message-ID: Nickle unions provide a way to get limited polymorphism with compile-time typechecking for most operations and explicit run-time typechecking for others. There are two ways to declare unions: union { { , { , ...} } Now you're asking yourself what enums have to do with unions. Borrowing from ML, we define the 'void' (unit in ML) type as that primitive type which holds only one value, '<>'. An 'enum' type is just a union where all of the members are of 'void' type. These two types are equivalent: typedef union { void a, b, c; } u; typedef enum { a, b, c } e; (well, they're equivalent *now*, but this morning, you would have used 'enum' in the union declaration rather than 'void'). Now we come to union values. There are currently four ways of constructing a union value: . = = . for void members = . ( ) for non-void members = ( . ) for non-void members Here are some examples: typedef union { int i; real r; void v; } u; u x; x.v = <> x = u.v x = u.i (10) x = (u.i) 10 It seems obvious to me that the ( . ) form is useless and should be discarded.
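To make the union/enum relationship concrete, here is a loose analogue in Python rather than Nickle. The class names `I`, `R`, `V` and the enum `E` are invented for this sketch. Each class plays the role of one union member, and a class with no fields stands in for a 'void' member.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union

@dataclass(frozen=True)
class I:
    """Plays the role of member i: the tag plus an int payload."""
    value: int

@dataclass(frozen=True)
class R:
    """Plays the role of member r: the tag plus a real payload."""
    value: float

@dataclass(frozen=True)
class V:
    """Plays the role of the void member v: a tag with no payload."""

# The union type itself: a value is exactly one of the members.
U = Union[I, R, V]

x = I(10)   # roughly: x = (u.i) 10
y = V()     # roughly: y = u.v

class E(Enum):
    """An enum is the special case where every member is void."""
    a = 1
    b = 2
    c = 3
```

Two values compare equal only when both the member and the payload match, so `I(10)` and `R(10.0)` stay distinct even though the numbers are equal.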
Another change I've just made is an attempt to make the union switch statement easier to use. Recalling the original syntax: union switch () { case : statement ... case : statement ... ... default: statement ... } This is inconvenient because within the enclosed statements, there's no way to refer to the value associated with without explicitly declaring a local variable: ... case : = . ; statements ... ... I've extended the case syntax to include a name which is bound to the referenced union member and whose type is the type of the named member of . Where is a poly, the type will be : case : statements (involving ) ... Because is implicitly typed, I've made it illegal to fall through into one of these cases: union switch (x) { case i ivalue: printf ("i: %d\n", ivalue); case r rvalue: printf ("r: %d\n", rvalue); This code generates a semantic error at compile time. I think this change makes unions less onerous to deal with. If we agree, I'd like to also propose that the NULL pointer values be eliminated from nickle entirely. They are an endless source of typechecking grief and are better implemented using unions with void members: Here's what lists look like with unions: typedef conscell; typedef union { *conscell ref; void end; } list; typedef struct { poly car; list cdr; } conscell; poly car (list l) { return l.ref->car; } poly cdr (list l) { return l.ref->cdr; } bool nilp (list l) { return l == list.end; } list cons (poly car, list cdr) { return list.ref (reference ((conscell) { car = car, cdr = cdr })); } list conc (list lists...) 
{ list conc_help(int i) { switch (i) { case dim(lists): return list.end; case dim(lists) - 1: return lists[i]; default: list conc_two (list a, list b) { union switch (a) { case ref r: return cons (r->car, conc_two (r->cdr, b)); case end: return b; } } return conc_two (lists[i], conc_help (i + 1)); } } return conc_help (0); } list map (poly(poly) f, list l) { union switch (l) { case ref r: return cons (f(r->car), map (f, r->cdr)); case end: return l; } } void printlist (list l) { map (void func (poly car) { printf ("%d ", car); }, l); printf ("\n"); } list l = cons (1, cons (2, cons (3, list.end))); list l1 = map (int func (int i) { return i + 1; }, l); printlist (l); printlist (l1); printlist (conc (l, l1)); Note that the primitive type is a *reference* to a list -- without that, Nickle's deep-copy semantics cause lists to be copied across each assignment or function call. All of the union switch statements above could be replaced with conditional tests: list map (poly(poly) f, list l) { if (nilp (l)) return list.end; return cons (f(car(l)), map (f, cdr(l))); } To some extent, that's just a matter of style. -keith From nickle@nickle.org Fri Jul 26 22:39:01 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 14:39:01 -0700 Subject: [Nickle]Union values In-Reply-To: Your message of "Fri, 26 Jul 2002 21:21:27 -0000." <15681.48471.231582.435631@scream.east.isi.edu> Message-ID: Around 21 o'clock on Jul 26, Carl Worth wrote: > That's not as obvious to me. Maybe I'm just slow to grasp new syntax, > but I can't help but reading this as a function call every time I look > at: > > > x = u.i (10) Hmm. Yes, it does seem a bit ambiguous. However, in one sense, you are invoking a constructor which takes an argument (10) and returns a value (the union with i = 10). What could be legal is: (u) { i = 10 } That is more symmetrical with other composite initializers. Unions aren't currently accepted there as they aren't "composite" datatypes like arrays or structures. 
> It seems potentially confusing to have a "switch" that falls through > and a "union switch" that does not. Union switch *does* fall through, but it's an error to fall into a case block which has a local value for the case variable: union switch (x) { case i ivalue: printf ("%d\n", ivalue); break; case r rvalue: printf ("%d\n", rvalue); } This is legal while the same code without the 'break' will generate a compile error. I've added code to do reachability analysis to perform this particular check. > It would be another departure from C, but one option would be to make > the "switch" not fall through either. In that case, new syntax for an > explicit fall-through could also be provided. (Some might argue that > this would be fixing something that is broken in C anyway). Yes, C is broken. But, I'd like to leave this particular piece of history alone; you either need a "fallthrough" statement or "break", and "fallthrough" isn't enough better to be different. > Yes, unions are looking very appealing now. I don't see any problem > eliminating NULL pointers in favor of unions with void > members. Functions that return straight pointers will be so much more > reliable this way. Ok, I'll go ahead and make that change as well. > Thanks for the example list implementation. That was quite fun. Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Fri Jul 26 23:32:26 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 15:32:26 -0700 Subject: [Nickle]Union values In-Reply-To: Your message of "Fri, 26 Jul 2002 22:18:25 -0000." <15681.51889.89740.737030@scream.east.isi.edu> Message-ID: Around 22 o'clock on Jul 26, Carl Worth wrote: > > (u) { i = 10 } > > That's not bad. I'd take that even though it does require a couple of > extra characters. But you'd rather use the union constructor style instead, right? 
u.i (10) I'm currently digging out of a horrible mess inside the type system relating to using 0 as a generic pointer and an integer. I'm deleting a lot of code... Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Fri Jul 26 23:58:41 2002 From: nickle@nickle.org (Bart Massey) Date: Fri, 26 Jul 2002 15:58:41 -0700 Subject: [Nickle]Union values In-Reply-To: Your message of "Fri, 26 Jul 2002 13:09:08 PDT." Message-ID: In message you wrote: [...] > Now we come to union values. There are currently four ways of > constructing a union value: > > . = (1) > = . (2) for void members > = . ( ) (3) for non-void members > = ( . ) (4) for non-void members Note that the third is problematic for functions. Consider typedef void(void) xxx_t; typedef union { xxx_t f; } u_t; u_t u; xxx_t g = void func(void x) {}; u_t v = u_t.f(g); /* init */ void v = v.f(<>); /* call */ While these last two lines are distinct, it seems confusing to me. I don't want to think about it :-). IMHO, lose (3), not (4). BTW, why are the parens in (4) required? Just as a grammar helper? If so, I'd note that as long as the lexer is tracking typenames anyway, it could also track which ones are bound to union types and return a different token, which might make (4) work better. > If we agree, I'd like to also propose that the NULL > pointer values be eliminated from nickle entirely. A proposal, by the way, due to Andrew Tolmach . Sorry if some of this is stale. A lot of e-mail has been exchanged while I was writing it. Bart From nickle@nickle.org Sat Jul 27 00:30:35 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 16:30:35 -0700 Subject: [Nickle]Union values In-Reply-To: Your message of "Fri, 26 Jul 2002 15:58:41 PDT." Message-ID: Around 15 o'clock on Jul 26, Bart Massey wrote: > u_t v = u_t.f(g); /* init */ > void v = v.f(<>); /* call */ > > While these last two lines are distinct, it seems confusing > to me. I don't want to think about it :-). 
IMHO, lose (3), > not (4). You could think of the u_t.f (g) form as a union value constructor invocation, but it's certainly less clear that way. I can do (4) if you'd prefer; it's just a grammar tweak in either case. > BTW, why are the parens in (4) required? Yeah; without a known token, the grammar can't tell when the following expression is an initializer or something following a void union value: typedef union { int i; void v; } u; u x; poly y; x = u.v * y; Attempting to permit this syntax results in 54 shift/reduce conflicts. > If so, I'd note that as long as the lexer is > tracking typenames anyway, it could also track which ones > are bound to union types and return a different token, The lexer can't compute the value of a type expression and use that to control the parser; all it can do is look at a single token. > > pointer values be eliminated from nickle entirely. > > A proposal, by the way, due to Andrew Tolmach > . Yes, I'd forgotten to mention that part -- as usual, the simplest solution is to eliminate the problem, rather than piling yet more kludges on. -keith From nickle@nickle.org Fri Jul 26 22:21:27 2002 From: nickle@nickle.org (Carl Worth) Date: Fri, 26 Jul 2002 21:21:27 +0000 Subject: [Nickle]Union values In-Reply-To: References: Message-ID: <15681.48471.231582.435631@scream.east.isi.edu> Keith, I really like this! A few comments. On Jul 26, Keith Packard wrote: > = . ( ) for non-void members > = ( . ) for non-void members > > It's seems obvious to me that the ( . ) form is > useless and should be discarded. That's not as obvious to me. Maybe I'm just slow to grasp new syntax, but I can't help but reading this as a function call every time I look at: > x = u.i (10) On the other hand, the following syntax seems to parallel the assignment of structured values quite well. (This, in spite of the fact that I've been lobbying to get rid of the type in that case): > x = (u.i) 10 But I agree that these are redundant and we don't need/want both. 
> I've extended the case syntax to include a name which is bound to the > referenced union member and whose type is the type of the named > member of . > > case : > statements (involving ) ... This is nice. And I'm not having any trouble reading this new syntax. :) > Because is implicitly typed, I've made it illegal to fall through > into one of these cases: It seems potentially confusing to have a "switch" that falls through and a "union switch" that does not. It would be another departure from C, but one option would be to make the "switch" not fall through either. In that case, new syntax for an explicit fall-through could also be provided. (Some might argue that this would be fixing something that is broken in C anyway). The hardest thing about that is that it could break a lot of existing code without any warning. > I think this change makes unions less onerous to deal with. If we agree, > I'd like to also propose that the NULL pointer values be eliminated from > nickle entirely. They are an endless source of typechecking grief and > are better implemented using unions with void members: Yes, unions are looking very appealing now. I don't see any problem eliminating NULL pointers in favor of unions with void members. Functions that return straight pointers will be so much more reliable this way. Thanks for the example list implementation. -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Fri Jul 26 23:18:25 2002 From: nickle@nickle.org (Carl Worth) Date: Fri, 26 Jul 2002 22:18:25 +0000 Subject: [Nickle]Union values In-Reply-To: References: <15681.48471.231582.435631@scream.east.isi.edu> Message-ID: <15681.51889.89740.737030@scream.east.isi.edu> On Jul 26, Keith Packard wrote: > What could be legal is: > > (u) { i = 10 } That's not bad. I'd take that even though it does require a couple of extra characters. 
> > It seems potentially confusing to have a "switch" that falls through > > and a "union switch" that does not. > > Union switch *does* fall through, but it's an error to fall into a case > block which has a local value for the case variable: Yes, sorry -- I was typing too fast and thinking too slow with this one. With a compile time error as you described I have no objections. -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Sat Jul 27 00:56:31 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 16:56:31 -0700 Subject: [Nickle]Type narrowing Message-ID: Nickle currently has some relatively simplistic type narrowing for expressions involving 'poly' types. For example: poly i, j; i + j; The addition operator can only generate a few possible types, among them: *poly a; int x, y; rational u, v; real c, d; "hello" + "world" -> string a + 1 -> *poly x + y -> int u + v -> rational c + d -> real Nickle takes advantage of this and narrows the result of 'i + j' to this list of types; this can be helpful in cases like: typedef struct { int i; real r; } s; s sv; sv = i + j; Nickle emits the following error: Incompatible types in assignment 's' = 'string, *poly, int, rational, real' The question is whether this narrowing is reasonable, or whether I should just use the single least upper bound type in all cases. Keith Packard XFree86 Core Team HP Cambridge Research Lab From nickle@nickle.org Sat Jul 27 00:59:50 2002 From: nickle@nickle.org (Carl Worth) Date: Fri, 26 Jul 2002 23:59:50 +0000 Subject: [Nickle]Union values In-Reply-To: References: <15681.51889.89740.737030@scream.east.isi.edu> Message-ID: <15681.57974.804816.492289@scream.east.isi.edu> On Jul 26, Keith Packard wrote: > Around 22 o'clock on Jul 26, Carl Worth wrote: > > > > (u) { i = 10 } > > > > That's not bad. I'd take that even though it does require a couple of > > extra characters. 
> > But you'd rather use the union constructor style instead, right? > > u.i (10) I'm not so sure. Now it looks like a "member function" for an object -- icky. > I'm currently digging out of a horrible mess inside the type system > relating to using 0 as a generic pointer and an integer. I'm deleting a > lot of code... We must be doing something right then. Fun. :) -Carl -- Carl Worth USC Information Sciences Institute cworth@east.isi.edu 3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725 From nickle@nickle.org Sat Jul 27 01:04:31 2002 From: nickle@nickle.org (Carl Worth) Date: Sat, 27 Jul 2002 00:04:31 +0000 Subject: [Nickle]Union values In-Reply-To: References: Message-ID: <15681.58255.623749.977306@scream.east.isi.edu> On Jul 26, Bart Massey wrote: > Note that the third is problematic for functions. Consider > > u_t v = u_t.f(g); /* init */ > void v = v.f(<>); /* call */ > > While these last two lines are distinct, it seems confusing > to me. I don't want to think about it :-). IMHO, lose (3), > not (4). A good additional point Bart. I also prefer keeping (4) instead of (3). -Carl From nickle@nickle.org Sat Jul 27 01:05:34 2002 From: nickle@nickle.org (Keith Packard) Date: Fri, 26 Jul 2002 17:05:34 -0700 Subject: [Nickle]Union values In-Reply-To: Your message of "Fri, 26 Jul 2002 23:59:50 -0000." <15681.57974.804816.492289@scream.east.isi.edu> Message-ID: Around 23 o'clock on Jul 26, Carl Worth wrote: > > u.i (10) > > I'm not so sure. Now it looks like a "member function" for an object > -- icky. Ok, I'll bend. Union values are in either: (type.name) value or type.name form. > We must be doing something right then. Fun. :) I could delete a lot more code if I used the least upper bound for type computations rather than a list of more specific types when dealing with poly. I'm not sure I care about poly enough to have this complexity. 
-keith From nickle@nickle.org Mon Jul 29 09:02:56 2002 From: nickle@nickle.org (Bart Massey) Date: Mon, 29 Jul 2002 01:02:56 -0700 Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling Message-ID: I haven't heard much comment from anyone on the recent dramatic change to Nickle to use a bool type instead of int everywhere. I'm not sure I like it: it is quite un-C-like and it hasn't helped me noticeably yet. Do we have any Nickle users? Have you noticed yet? If we leave things the way they are, we probably need to give up on the else clause of twixt(), and simply assert that everything should raise exceptions on failure. Right now, I'm having to write import File; twixt((file f = open("/tmp/foo", "w")), true; close(f)) { ... } where the comma expression is pretty obviously bogus but shuts the compiler up. If we ditch the else clause and have to use exceptions, we need some more convenient way of handling those exceptions than wrapping the twixt() in a try. One possibility would be to get rid of try {}, which is an endless pain in the neck to use in any case. We've talked about this before, but haven't come up with a great alternative. The "obvious" ones from the POV of this example are: (1) allow any block to be followed by a string of catch clauses, eliminating the try keyword. Thus, one could write twixt(file f = open("/tmp/foo", "w"); close(f)) { ... } catch open_error(fn, msgcode, msg) { ... } This is ugly syntax, though, and it's not obvious that the catch clause should indeed scope in the open() in the twixt() head. Another possibility (2) is to make catch a first-class statement. We talked about making it extend to the end of the current scope, ala catch open_error(fn, msgcode, msg) { ... } ... twixt(file f = open("/tmp/foo", "w"); close(f)) { ... } but then falling out of the catch body almost always does the wrong thing. 
The alternative (3) I think I favor is to have the catch extend to the *beginning* of the current scope, ala twixt(file f = open("/tmp/foo", "w"); close(f)) { ... } ... catch open_error(fn, msgcode, msg) { ... } This is very convenient for the programmer, but of course the reader surprise factor here is nontrivial. Other suggestions? AFAIK, in an outer function scope, static and global mean the same thing. Also AFAIK, if a function is static or global, there is no reason to make its auto variables static or global. IOW, in namespace Eg { public bool f (int i) { static bool[*] g() { bool[*] u = {true, false, true}; return u; } static bool[*] v = g(); return v[i]; } } the definitions of g and v can only be evaluated once, so there's no reason to use global here. Thus, the auto class of u is harmless, since it will only be initialized once anyhow. Have I missed something? Note that this code is a poor-man's way of doing array comprehensions: better syntax would be nice. Finally, Keith, I give up: I figured out how to turn on Nickle-statement-level profiling, and it looks from the source like just printing a function ought to now show its per-statement profile data, but I'm afraid I'm not having that experience. Suggestions? Bart Massey bart@cs.pdx.edu From nickle@nickle.org Mon Jul 29 17:25:23 2002 From: nickle@nickle.org (Keith Packard) Date: Mon, 29 Jul 2002 09:25:23 -0700 Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling In-Reply-To: Your message of "Mon, 29 Jul 2002 01:02:56 PDT." Message-ID: Around 1 o'clock on Jul 29, Bart Massey wrote: > If we leave things the way they are, we probably need to > give up on the else clause of twixt(), and simply assert > that everything should raise exceptions on failure. Right > now, I'm having to write > import File; > twixt((file f = open("/tmp/foo", "w")), true; close(f)) { That makes try_acquire much less useful; I'm not sure I want it raising an exception when the mutex can't be grabbed.
But, the bool type has already helped me out: if (a & b == 0) Without the bool type, this typechecks just fine and gives the wrong answer. For your twixt example, really what you wanted was: import File; file f; twixt (bool func () { try { f = open ("/tmp/foo", "w"); } catch open_error (string message, errorType error, string name) { return false; } return true; } (); close (f)) { fprintf (f, "hello, world\n"); } This turns the exception into an appropriate boolean; of course, the lambda could check the errorType... > Finally, Keith, I give up: I figured out how to turn on > Nickle-statement-level profiling, and it looks from the > source like just printing a function ought to now show its > per-statement profile data, but I'm afraid I'm not having > that experience. Suggestions? Just a bug in the profile code when I converted to bool. Try profile(false) for a good time, or update to current CVS :-) -keith From nickle@nickle.org Mon Jul 29 20:23:22 2002 From: nickle@nickle.org (Bart Massey) Date: Mon, 29 Jul 2002 12:23:22 -0700 Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling In-Reply-To: Your message of "Mon, 29 Jul 2002 09:25:23 PDT." Message-ID: In message you wrote: > Around 1 o'clock on Jul 29, Bart Massey wrote: > > If we leave things the way they are, we probably need to > > give up on the else clause of twixt(), and simply assert > > that everything should raise exceptions on failure. Right > > now, I'm having to write > > import File; > > twixt((file f = open("/tmp/foo", "w")), true; close(f)) { > > That makes try_acquire much less useful; I'm not sure I want it raising an > exception when the mutex can't be grabbed. try_acquire() is just a bad idea anyhow :-). But to take a page from your book, why not just write this? exception no_acq(); try { twixt(bool func(){ if (!try_acquire(m)) raise no_acq();}(); release(m)) { ... } } catch no_acq() { ... } Answer: because it is unreadably ugly :-). 
Also because it isn't super clear you want to retry the try_acquire(m) on re-entry anyway. IMHO this whole thing is a mass of potential bugs: if the try_acquire() succeeds initially but not on subsequent re-entry, things get pretty weird. Perhaps better is if (try_acquire(m)) { twixt(;release(m)) { ... } } else { ... } which ensures the release, anyhow. This begs one question: for inline use, the lambda syntax is verbose and inconvenient, partly because the types are a pain. This is what I ran into with initializer functions for arrays as well. Perhaps we need a smalltalk-like "block expression" that works something like (func x *= x; return x + y;) where the expression is constructed as a lambda and auto-evaluated and its type is the result of the return expression else void. While I'm at it, we probably need a builtin apply() where poly apply(poly fn(...), poly[*] args) and the actuals in the array are supplied as the formals of fn(). Obviously, static typechecking goes out the window here, at least for now. C'est la vie. Also, while I'm at it, you were going to make else syntactically independent of if, so that we could get rid of that stupid extra semicolon at top level interactive. Any reason not to do this? > But, the bool type has already helped me out: > > if (a & b == 0) > > Without the bool type, this typechecks just fine and gives the wrong > answer. I know. I just wanted feedback from a third and fourth person for a change of this scope if possible... > For your twixt example, really what you wanted was: > > import File; > > file f; > > twixt (bool func () { > try { > f = open ("/tmp/foo", "w"); > } catch open_error (string message, errorType error, string name) { > return false; > } > return true; } (); > close (f)) > { > ... > } > > This turns the exception into an appropriate boolean; of course, the lambda > could check the errorType... No, this really *isn't* what I wanted to write: see above :-). 
I really intend the twixt() to be *good* syntax for things like opening and closing files, creating and removing temp files, locking and unlocking mutexes, etc. > > Finally, Keith, I give up: I figured out how to turn on > > Nickle-statement-level profiling, and it looks from the > > source like just printing a function ought to now show its > > per-statement profile data, but I'm afraid I'm not having > > that experience. Suggestions? > > Just a bug in the profile code when I converted to bool. Try > profile(false) for a good time, or update to current CVS :-) Got it, thanks :-). Now, how do I read the mystic pairs of numbers? What are the units? I think one is self and one is sub, right? Thanks much, Bart From nickle@nickle.org Mon Jul 29 21:00:07 2002 From: nickle@nickle.org (Keith Packard) Date: Mon, 29 Jul 2002 13:00:07 -0700 Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling In-Reply-To: Your message of "Mon, 29 Jul 2002 12:23:22 PDT." Message-ID: Around 12 o'clock on Jul 29, Bart Massey wrote: > try_acquire() is just a bad idea anyhow :-). But to take a > page from your book, why not just write this? > exception no_acq(); > try { > twixt(bool func(){ if (!try_acquire(m)) raise no_acq();}(); > release(m)) { > ... > } > } catch no_acq() { > ... > } Hmm. Perhaps the elements of the twixt could be statements instead of expressions? twixt (if (!try_acquire ()) raise no_acq (); release()) { } > Perhaps we need a smalltalk-like "block expression" that works something like > (func x *= x; return x + y;) GCC already has something like this, the syntax is: ({ x *= x; }) But, it's defined to have a void type; we could extend the syntax to permit return which could give a non-void value. That would make our try_acquire block look more like: twixt (({ if (!try_acquire()) raise no_acquire(); }); release()) { } That almost looks sensible now.
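The shape under discussion, an entry step that may raise and a leave step that is guaranteed to run once entry has succeeded, can be sketched outside Nickle. Below is an illustrative Python analogue using a context manager. `NoAcquire` and `twixt_try_acquire` are names made up for this example, and the sketch does not attempt to model twixt's behaviour when the body is re-entered through a continuation.

```python
import threading
from contextlib import contextmanager

class NoAcquire(Exception):
    """Raised when the lock cannot be taken without blocking."""

@contextmanager
def twixt_try_acquire(lock):
    # Entry step: a statement that either succeeds or raises.
    if not lock.acquire(blocking=False):
        raise NoAcquire()
    try:
        yield              # the body of the twixt runs here
    finally:
        lock.release()     # leave step: runs however the body exits

def guarded_increment(lock, counter):
    """Add 1 to counter[0] under the lock; report False if it was busy."""
    try:
        with twixt_try_acquire(lock):
            counter[0] += 1
        return True
    except NoAcquire:
        return False
```

Because the entry step raises before the `try`, a failed acquire never triggers the release, which is the property the boolean "else" form of twixt was providing.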
> While I'm at it, we probably need a builtin apply()
> where
>     poly apply(poly fn(...), poly[*] args)
> and the actuals in the array are supplied as the formals of
> fn().  Obviously, static typechecking goes out the window
> here, at least for now.  C'est la vie.

Actually, we could invert the current varargs syntax:

    int foo (int i, int j, int k) {}

    int[*] b = { 1, 2, 3 };

    foo (b...)

This would retain the same level of typechecking that the current varargs stuff does.

-keith

From nickle@nickle.org Mon Jul 29 21:12:57 2002
From: nickle@nickle.org (Carl Worth)
Date: Mon, 29 Jul 2002 20:12:57 +0000
Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling
In-Reply-To:
References:
Message-ID: <15685.41417.51665.331195@scream.east.isi.edu>

On Jul 29, Keith Packard wrote:
>
> Hmm.  Perhaps the elements of the twixt could be statements instead of
> expressions?
>
>     twixt (if (!try_acquire ())
>                raise no_acq ();
>            release())
>     {
>     }

This seems sensible to me.  So, with this, "twixt ... else" would be no more?  (Personally, it's always seemed wrong to me that twixt was inextricably entwined with an implicit if statement anyway).

> But, it's defined to have a void type; we could extend the syntax to permit
> return which could give a non-void value.  That would make our try_acquire
> block look more like:
>
>     twixt (({ if (!try_acquire()) raise no_acquire(); }); release()) {
>     }

I'm still not clear on the intended semantics here.  What is the value that results from evaluating the "block expression" above?

-Carl

--
Carl Worth
USC Information Sciences Institute cworth@east.isi.edu
3811 N. Fairfax Dr.
#200, Arlington VA 22203 703-812-3725

From nickle@nickle.org Mon Jul 29 21:20:06 2002
From: nickle@nickle.org (Carl Worth)
Date: Mon, 29 Jul 2002 20:20:06 +0000
Subject: [Nickle]Type narrowing
In-Reply-To:
References:
Message-ID: <15685.41846.222884.75984@scream.east.isi.edu>

On Jul 26, Keith Packard wrote:
> The question is whether this narrowing is reasonable, or whether I should
> just use the single least upper bound type in all cases.

Not a direct answer to your question.  But you got me thinking.

It seems to me that poly and union are functionally equivalent.  (union defines a type that can hold a value of any type belonging to a set of types given in the union definition.  poly is effectively a union with an unrestricted set of allowable types).

Given that, might it not make sense to use the union value syntax when narrowing the type of a poly value?  Beyond making the union/poly similarity more obvious, this would also have the benefit of making more run-time type checking explicit in the code.

-Carl

--
Carl Worth
USC Information Sciences Institute cworth@east.isi.edu
3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725

From nickle@nickle.org Mon Jul 29 21:59:36 2002
From: nickle@nickle.org (Keith Packard)
Date: Mon, 29 Jul 2002 13:59:36 -0700
Subject: [Nickle]Type narrowing
In-Reply-To: Your message of "Mon, 29 Jul 2002 20:20:06 -0000." <15685.41846.222884.75984@scream.east.isi.edu>
Message-ID:

Around 20 o'clock on Jul 29, Carl Worth wrote:

> It seems to me that poly and union are functionally equivalent.  (union
> defines a type that can hold a value of any type belonging to a set of
> types given in the union definition.  poly is effectively a union with
> an unrestricted set of allowable types).

Almost, but not quite.  Unions in nickle must be explicitly typechecked anytime you use the value held within them while poly values are only typechecked at runtime when interacting with stronger types.
At this point, you should treat 'poly' as a crutch for a weak language; we need parametric polymorphism to solve most of the cases currently using 'poly', except of course where applications really do want only runtime typechecking to match some existing semantics (like your FP experiments).

I was using unions to hold the list of types narrowed from poly, but that really warped the internal use of unions.  Using a proper separate representation has removed most of the gore surrounding typechecking and unions; removing 'null' types fixed most of the remaining ugliness.

-keith

From nickle@nickle.org Mon Jul 29 21:59:39 2002
From: nickle@nickle.org (Keith Packard)
Date: Mon, 29 Jul 2002 13:59:39 -0700
Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling
In-Reply-To: Your message of "Mon, 29 Jul 2002 20:12:57 -0000." <15685.41417.51665.331195@scream.east.isi.edu>
Message-ID:

Around 20 o'clock on Jul 29, Carl Worth wrote:

> Not unless we also do it for "for".  IMHO, the problem here
> is still that try_acquire() is just a bad idea: adapting the
> language to fit it is wrong.  Do you have a better example?

No.  The whole twixt else block was designed to cope with try_acquire.

> In any case, I'm convinced that the folks with bools should
> have to work around twixt, not the folks with exceptions.

Makes sense to me; try_acquire is generally poor practice for most applications anyway.

> Hate their syntax, but I guess I can live with it.  Existing
> practice and all that.

Let's stick with void as the resulting value type so that we don't have to deal with 'return' inside of this.  If we think of another use for this little syntactic gem, we can reconsider 'return'.  Without return, this should take about two minutes to implement.

> Perfect!  I assume
>     int[*] b = {2, 3};
>     foo (1, b...)
> would also be legal, which is even better.

Yes, and it's nearly implemented.
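As a point of comparison only, the foo (b...) and foo (1, b...) forms discussed here have a close analogue in Python's argument unpacking.  The sketch below is Python, not Nickle; foo is a hypothetical three-argument function standing in for the int foo (int i, int j, int k) quoted in the thread, and the checking Python does here happens at call time rather than statically.

```python
def foo(i, j, k):
    # Hypothetical three-argument function, like: int foo (int i, int j, int k)
    return (i, j, k)


b = [1, 2, 3]
print(foo(*b))            # like: foo (b...)

tail = [2, 3]
print(foo(1, *tail))      # like: foo (1, b...) with b = {2, 3}

# An array of the wrong length is rejected when the call is made.
try:
    foo(*[1, 2])
except TypeError as err:
    print("rejected:", err)
```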
-keith

Keith Packard XFree86 Core Team HP Cambridge Research Lab

From nickle@nickle.org Tue Jul 30 02:37:15 2002
From: nickle@nickle.org (Keith Packard)
Date: Mon, 29 Jul 2002 18:37:15 -0700
Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling
In-Reply-To: Your message of "Mon, 29 Jul 2002 12:23:22 PDT."
Message-ID:

Around 12 o'clock on Jul 29, Bart Massey wrote:

> Also, while I'm at it, you were going to make else
> syntactically independent of if, so that we could get rid of
> that stupid extra semicolon at top level interactive.  Any
> reason not to do this?

Yes, I discovered the reason not to do this:

    if ((int i = foo ()) == 0)
        printf ("i is zero\n");
    else
        printf ("i is %d\n", i);

When 'if' and 'else' are separate statements, I lose the if namespace at the end of that statement so that the else block can't see variables declared within the scope of the conditional.  Fixing this would be tricky...

-keith

From nickle@nickle.org Tue Jul 30 03:43:47 2002
From: nickle@nickle.org (Keith Packard)
Date: Mon, 29 Jul 2002 19:43:47 -0700
Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling
In-Reply-To: Your message of "Mon, 29 Jul 2002 12:23:22 PDT."
Message-ID:

Around 12 o'clock on Jul 29, Bart Massey wrote:

> No, this really *isn't* what I wanted to write: see above
> :-).  I really intend the twixt() to be *good* syntax for
> things like opening and closing files, creating and removing
> temp files, locking and unlocking mutexes, etc.

I've removed the 'else' clause from 'twixt' and made the 'enter' expression not conditional anymore.  The results inside the interpreter are quite salutary, cleaning up quite the mess generated by handling the conditional.  Now twixt is nice and symmetrical.

I've also added the expression-level statement, as expected it took about 2 minutes -- a new production in the grammar and a two line addition to the compiler.
These are *always* void at this point, making them able to return a value would be a bit tricky, and would somewhat overload the 'return' statement.  I suggest that if these statement expressions want to return a value, a nice lambda would do the trick.

The varactual patch is also done and nicely symmetrical with the varargs syntax:

    poly (poly...) compose (poly f, poly g)
    {
        poly[*] split((*poly[*]) a, int start, int len)
        {
            poly[*] p = [len]{};
            for (int i = 0; i < len; i++)
                p[i] = a*[i+start];
            return p;
        }

        poly composite(poly args...)
        {
            return f (g (split (&args, 0, func_args (g))...),
                      split (&args, func_args (g), func_args (f)-1)...);
        }
        return composite;
    }

Now compose works like it's supposed to.  Note that you do get some typechecking with this; the requirement is that the type of the array element in the actual matches the type of each formal it will be assigned to.

-keith

From nickle@nickle.org Tue Jul 30 11:58:32 2002
From: nickle@nickle.org (Carl Worth)
Date: Tue, 30 Jul 2002 10:58:32 +0000
Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling
In-Reply-To:
References:
Message-ID: <15686.29016.358497.505728@scream.east.isi.edu>

On Jul 29, Keith Packard wrote:
> I've removed the 'else' clause from 'twixt' ...
>
> I've also added the expression-level statement, ...
>
> The varactual patch is also done and nicely symmetrical with the
> varargs syntax...
>
> Now compose works like it's supposed to...

which means you also added the reflexive func_args builtin.

Wow, you've been busy Keith!  Thanks for all the fun features.

Looking at the split function you wrote for compose.  Do we want some builtin support for array manipulation?  It could be a function similar to your split, (although I'd vote for "slice").  Or the array index could accept an array of values.  That, along with some support for easily constructing arrays of integers could be quite handy.
So, given:

    int[*] a = {1, 2, 4, 8, 16, 32, 64, 128};
    int[*] b;

Some options for slicing out elements 2 through 6 could be, (sorted in order of increasing language support required):

    (A) b = slice(a, 2, 6);    /* New builtin slice */
    (B) b = a[seq(2, 6)];      /* New indexing by array */
    (C) b = a[2 .. 6];         /* (B) plus array construction operator */

I have no idea if adding a ".." operator for integer array construction would be a good idea or not in terms of precedence or other concerns.  Actually, it's probably not a good choice given the new use of "..." in varactual support.  I suppose `:' might be another choice for the operator's name, (no, that would definitely interfere with "? :").

I think I'd prefer option (B).  It doesn't require too much new language support, and it's quite flexible.  For example, one might want to extract every even-indexed element from 2 to 6:

    b = a[seq(2, 2, 6)];

What think ye?  Other ideas?

-Carl

--
Carl Worth
USC Information Sciences Institute cworth@east.isi.edu
3811 N. Fairfax Dr. #200, Arlington VA 22203 703-812-3725

From nickle@nickle.org Tue Jul 30 17:51:46 2002
From: nickle@nickle.org (Keith Packard)
Date: Tue, 30 Jul 2002 09:51:46 -0700
Subject: [Nickle]Boolean type, twixt, static vs global, built-in profiling
In-Reply-To: Your message of "Tue, 30 Jul 2002 10:58:32 -0000." <15686.29016.358497.505728@scream.east.isi.edu>
Message-ID:

Around 10 o'clock on Jul 30, Carl Worth wrote:

> Looking at the split function you wrote for compose.  Do we want some
> builtin support for array manipulation?  It could be a function similar
> to your split, (although I'd vote for "slice").  Or the array index
> could accept an array of values.  That, along with some support for
> easily constructing arrays of integers could be quite handy.

I think comprehensions would give us relatively clean syntax without any artificial limitations on what kinds of subsetting could be done.
A comprehension is essentially a lambda evaluated for each element of an array during initialization:

    (int[3]) { comprehension (int i) { return i * 2; } }

We can either use these as immediate values or build simple array manipulation functions from them.

-keith
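To make the comprehension idea concrete, here is a rough model in Python rather than Nickle, since the comprehension syntax in this message was still a proposal.  The helper names comprehend and split are illustrative assumptions; split mirrors the (start, len) helper from the compose example earlier in the thread, and the last line covers the "every even-indexed element from 2 to 6" case that was raised for seq().

```python
def comprehend(n, element):
    # Build an n-element array by calling element(i) once per index i,
    # modelling: (int[n]) { comprehension (int i) { return ...; } }
    return [element(i) for i in range(n)]


def split(a, start, length):
    # A simple array manipulation function built from a comprehension.
    return comprehend(length, lambda i: a[i + start])


doubled = comprehend(3, lambda i: i * 2)
a = [1, 2, 4, 8, 16, 32, 64, 128]
strided = comprehend(3, lambda i: a[2 + 2 * i])

print(doubled)            # each element is twice its index
print(split(a, 2, 5))     # the elements at indices 2 through 6
print(strided)            # the elements at indices 2, 4 and 6
```

Because the element function receives only the index, any subsetting rule that can be written as a function of the index fits the same shape, which seems to be the flexibility the message is pointing at.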