From: hbn on
Hi all,

I'm currently working on an implementation of van Emde Boas trees in C,
see http://en.wikipedia.org/wiki/Van_emde_boas_tree for some further
information and a few more links. I have a pretty good grasp on the
structure in general but I have a single question; specifically
regarding the construction of new (sub)trees.

A vEB-tree to hold integers of m bits contains an array of m/2
sub-trees, each capable of holding integers of m/2 bits. My question
regards the initialization of this array. I understand the concept of
lazy initialization - that is not creating the sub-trees before they
are actually required, but what I don't get is how to create a new
(sub)tree in O(1) time. The array holding the sub-trees can't be
initialized with, say NULL pointers, in constant time. And as far as I
can see the whole analysis breaks if it isn't possible to create new
(sub)trees in constant time, which is required if the running time of
"Insert new key" should be O(log log n).
When inserting new keys it's necessary to know which subtrees have
already been created, this can't be determined by looking at the array
alone if we don't initialize each element in the array with some know
value - and if we initialize the array explicitly (even via calloc or
some such) we break the desired running time.

A way to solve the above problem is to create a perfect hash-table with
the indices of the array where subtrees has been created, but none of
the articles I've read seems to say that this is a requirement. What am
I missing here?

Any opinions, pointers, advice greatly appreciated.

hbn