[Nickle] Hash keys not copied?

Bart Massey bart at po8.org
Mon Dec 31 10:19:37 PST 2007


In message <1199092680.5663.40.camel at koto.keithp.com> you wrote:
> I know this issue has come up, but it just bit me fairly hard and I
> thought I'd raise it again.
> 
> The keys in hash objects are not copied, which means that future changes
> to the key invalidate the hash table -- the table uses a 32-bit hash of
> the key to distribute objects across the array, but with the value
> changing, it will not be moved in the table. Thus, lookups on the old
> value will fail (as the value is now different), but lookups on the
> *new* value will also fail (it will look under the wrong 32-bit hash
> key).
> 
> I don't see any way to avoid copying the key.

If you restrict keys to being immutable values you avoid the
problem; there's a strong argument that keys should be
values, not data structures.

An alternative that gives them same semantics (with very
very high probability) as copying the keys is just to make
the hash much bigger and only compare hashes, not the actual
keys; this will save space and time on large keys, but might
make things worse on smaller ones.  You would do this only
with mutable keys; immutable keys don't need to be copied in
any case, as they can just be referenced.  You could also do
it only with keys larger than a given size, although the
user semantics get ugly here.

If you want the semantics that mutable data structures are
allowed and you always look up based on the current
contents, then you can play complicated games with the mark
bit on boxes to get what you want.  When a box is used in a
hash key, mark the box; when you store through a marked box,
go look in a list of hash tables containing that box and do
the updates.  Uggh.

I vote for immutable.  My second choice would be key-copying
and my third would be probabilistic, but it's close.  I
think current-value is confusing and ugly.

	Bart


More information about the Nickle mailing list