From: Robert Klemme on 4 Apr 2010 15:17
On 04.04.2010 20:21, Caleb Clausen wrote:
> On 4/4/10, Robert Klemme <shortcutter(a)googlemail.com> wrote:
>> On 04/03/2010 03:07 AM, Caleb Clausen wrote:
>>> Oh, I wish for a real integer set class!
>> Well, there's Fixnum / Bignum:
> Well, yeah, and that's fine for storing sets of small integers. Or I
> could put my Integers into a Set, and that's fine for small sets of
> integers. I have used both techniques in the past. What I have in mind
> is more scalable: a way to store large sets of large integers.
> Efficiently. So, what I would like to see is a data structure with
> O(1) (or at most O(log(n))) insertion, deletion, and lookup times, as
> well as (as close as possible to) O(1) memory cost per stored item.
> Contiguous ranges of integers should be storable with not much more
> memory cost than that of a single integer.
> Bitmaps come close, but also cost O(1) bits per integer (< the largest
> integer stored) which _isn't_ in the set. A hash-based set costs
> something like O(2*wordsize) bits per stored integer.
I believe, it's more since you explicitly stated you want large numbers
> Perhaps this can
> be reduced to O(2+wordsize) if using google's sparse array/sparse
> hash. (Which Roger has been kind enough to add a ruby interfaces
> for.... but not an integer set.)
> Of course there's a contradiction here; if:
> 1) the integer set is fairly dense, and
> 2) members are randomly distributed throughout it,
> then Kolmogorov complexity theory says you can't get any better
> storage efficiency than a bitmap. So, this miraculous data structure
> I'm imagining has to be allowed to discard one or the other of those
> assumptions. Discarding 1) leads to something like google's sparse
> array (sparse bitmap really; I'm not sure if google's sparse array
> code could be used as is). Discarding 2) leads you in the direction of
> a tree or trie. That's what I'm more interested in. Maybe something
> based on Judy would suit my purposes....
> I've wanted this several times, but never really needed it badly
> enough to go and implement what I want to see. (As it almost surely
> involves writing a fair bit of c code.)
Take that as a hint. :-)
remember.guy do |as, often| as.you_can - without end