3 questions about arrays [Lisp]

Prev: Checking for declarations in macros
Next: scripting in CL -- please comment

From: Pascal J. Bourguignon on 4 Dec 2009 15:50

Tim Bradshaw <tfb(a)cley.com> writes:

> On 2009-12-04 18:01:06 +0000, tar(a)sevak.isi.edu (Thomas A. Russ) said:
>
>> Yes. And you want to make sure array access is efficient. Thus the
>> limitation on only allowing fixnums as indicies
>
> This is an important point. People's expectations of arrays are that
> they will be fast. If array indexing involves consing bignums all
> over the place that is not likely to be true. People who are using
> machines with relatively small word sizes where (by some trick), a
> Lisp implementation can map much more memory than the machine can
> address (I suppose this might be the case on some architecture with
> long pointers but small integers) should still be able to assume that
> array access is quick. If they need much larger objects, they can use
> arrays of arrays, for instance.
>
> I can't imagine any case where the limitation of array indices to
> fixnums is a limitation: are there any current systems where an
> individual process can address more than a machine-word's worth of
> space? (Note, not where the system as a whole can have more physical
> memory than that - I think that was reasonably common on some later
> 32-bit systems wasn't it?)

AFAIK, there remains no such processors currently.

On the Intel processors before the 386 (you know, when 640 KB was
enough for anybody), where segmented addressing was the rule, arrays
were often limited to 64 KB, so a 16-bit fixnum was largely enough to
index them.

--
__Pascal Bourguignon__

From: Thomas A. Russ on 4 Dec 2009 17:04

Tim Bradshaw <tfb(a)cley.com> writes:

> I can't imagine any case where the limitation of array indices to
> fixnums is a limitation: are there any current systems where an
> individual process can address more than a machine-word's worth of
> space? (Note, not where the system as a whole can have more physical
> memory than that - I think that was reasonably common on some later
> 32-bit systems wasn't it?)

Well, we ran into some systems where the specfication LOWER limit on the
size of fixnums was actually realized. It was a 32-bit architecture
that had only 16-bit fixnums. That was a bit limiting.

(We ran into the problem with using our own hash codes, where we wanted
fixnums for speed, but ended up with some constants that turned out to
be too large for 16 bits....)

I think this was true for an early version of lispworks for Windows.

BTW, just for reference, here is the fixnum sizes for a number of lisps
that I happen to have easy access to.

Mac OS X:
bits implementation
---- --------------
29 LispWorks Personal 5.1 (32-bit)
60 ClozureCL x8664 (64-bit)
24 CLISP 2.47
29 CMUCL 19e
29 SBCL 1.0.30
31 ABCL 0.17.0

Red Hat Linux (32-bit)
bits implementation
---- --------------
29 Allegro CL 5.0, 6.2, 7.0
24 CLISP 2.41.1
29 CMUCL 19c

So, it looks like 29 bits is one of the most popular choices for 32-bit
architectures.

--
Thomas A. Russ, USC/Information Sciences Institute

From: Kaz Kylheku on 4 Dec 2009 20:42

On 2009-12-04, Pascal J. Bourguignon <pjb(a)informatimago.com> wrote:
> Kaz Kylheku <kkylheku(a)gmail.com> writes:
>> Why array-dimension-limit is an exclusive limit is probably for historic
>> reasons.
>>
>> Maybe when Lisp was standardized, implementations did not agree whether or not
>> this limit is inclusive or exclusive. If implementations don't agree about a
>> parameter like this, what can you recommend to programmers? Treat it as
>> exclusive, of course! That's the weaker, more portable assumption.
>>
>> Or maybe all of the implementations had the exclusive behavior, which
>> would have been codified as-is.
>
> And again, once you have filled the memory with your array, you need
> at least one word to store the only instruction that can make you
> program up, so there's no point in going to fixnum+1.

That's a separate issue from that of knowing whether array-dimension-limit
is exclusive or not. The actual limit may be as little as 1024, right,
or as large as most-positive-fixnum.

No matter how large or small the limit is, if you want to make arrays which
stretch to the limit, you have to know whether the limit is exclusive or
inclusive. If you are forced to guess, the safer guess is exclusive.

On CLISP, you have a most-positive-fixnum of 16.7 million. You can easily make
an array of 32 bit words on that order on the average desktop PC, without
coming anywhere near filling memory.

From: Kaz Kylheku on 4 Dec 2009 21:05

On 2009-12-04, Thomas A. Russ <tar(a)sevak.isi.edu> wrote:
> Tim Bradshaw <tfb(a)cley.com> writes:
>
>> I can't imagine any case where the limitation of array indices to
>> fixnums is a limitation: are there any current systems where an
>> individual process can address more than a machine-word's worth of
>> space? (Note, not where the system as a whole can have more physical
>> memory than that - I think that was reasonably common on some later
>> 32-bit systems wasn't it?)
>
> Well, we ran into some systems where the specfication LOWER limit on the
> size of fixnums was actually realized. It was a 32-bit architecture
> that had only 16-bit fixnums. That was a bit limiting.

I recall that the Win32 API uses a kind of fixnum scheme for distinguishing
resource name strings and resource ID's. In the FindResource function,
a resource is specified as a pointer to a name: a LPCTSTR lpName
parameter. But if this pointer is in the address range 0 to 65535,
the pointer is treated as a numeric resource ID, and not a pointer to a name.
You can convert an ID number n into the string pointer using
the MAKEINTRESOURCE(n) macro.

Could it be that this Lisp did a similar trick? If the value lands in the 16
bit range, it's not a pointer to something, but a fixnum value. This is a
possible valid alternative to reserving tag bits. The nice thing is that you
don't have to do shifting and masking---just do arithmetic directly on it---but
the range limitation sucks.

I can't imagine otherwise why a Lisp in 32 bit land would chew off 16 bits in
the representation of a fixnum; nobody could possibly need that many bits
for a type tag.

I thought about doing this kind of trick in txr, but I went for a two bit mask
instead. One little fly in that ointment is not just the restricted range, but
the fact that I want nil to correspond to the null pointer, which is all zero
bits in most C compilers, and so the range-based scheme would alias 0 and NIL,
which is repugnant. :) Choosing a different NIL is possible, but then all
that Lispy C code can no longer be written like
``if (foo_list) { /* foo_list is not empty ... */ }''. It would always
have to be ``if (!null(foo_list)) ...'' which is irritating and error
prone.

From: Kaz Kylheku on 4 Dec 2009 23:17

On 2009-12-04, Thomas A. Russ <tar(a)sevak.isi.edu> wrote:
> Kaz Kylheku <kkylheku(a)gmail.com> writes:
>
>> >> for ARRAY-DIMENSION-LIMIT says that it's a fixnum and "The upper
>> >> exclusive bound on each individual dimension of an array". Why not make
>> >> it an inclusive bound ?
> ...
>> Why array-dimension-limit is an exclusive limit is probably for historic
>> reasons.
>>
>> Maybe when Lisp was standardized, implementations did not agree whether or not
>> this limit is inclusive or exclusive. If implementations don't agree about a
>> parameter like this, what can you recommend to programmers? Treat it as
>> exclusive, of course! That's the weaker, more portable assumption.
>>
>> Or maybe all of the implementations had the exclusive behavior, which
>> would have been codified as-is.
>
> Well, my guess would be related to considering how one often uses a loop
> to access elements in an array. Typically, you would iterate starting
> at 0 and ending when the index is >= the dimension.

Thus, the dimension can be max-positive-bignum!

> So, by making the maximum dimension be max-positive-fixnum minus 1, you
> can safely use fixnums to iterate through the indices without worrying
> that the final increment will create a bignum.

I thought about this same thing at first yesterday, but concluded that
it can't be the correct rationale.

Here is why. Even if max-positive-fixnum is an /inclusive/ bound on
array size, there is no overflow.

If an array has max-positive-fixnum elements, they are numbered
from 0 to (1- max-positive-fixnum).

Everything is representable in fixnum: all of the indices,
the array size, and even the index which is one element
past the end array.

You can walk through the array like this:

(loop for x below (length array) ...)

and x never exceeds max-positive-fixnum.

Your `>= dimension' test is the same. It does not preclude dimension
from being fixnum.

You just have to ensure you don't do something silly, like try to
increment the index even in the case when the loop guard fails.

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Prev: Checking for declarations in macros
Next: scripting in CL -- please comment