From: Tamas K Papp on
Hi,

I am doing Markov-Chain Monte Carlo in CL. Specifically, I draw a
vector (of about 10^5 elements) from a distribution. I need about
10^4 draws. This makes a huge table --- I am not sure I would like to
fit that in memory, even if I could (single float would take 4e9
bytes, but 1e9 is not a fixnum any more, so plain vanilla Lisp arrays
would not work on my 32-bit platform).
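The arithmetic above can be checked at the REPL. This is a minimal sketch in portable Common Lisp; the variable and function names are made up for illustration, and the default of 4 bytes per element assumes single floats held in a specialized array:

```lisp
;; Illustrative names; the sizes are the ones quoted in the post.
(defparameter *vector-length* (expt 10 5)) ; elements per draw
(defparameter *draw-count* (expt 10 4))    ; number of draws

(defun table-elements ()
  "Total number of elements in the full table of draws."
  (* *vector-length* *draw-count*))

(defun table-bytes (&optional (bytes-per-element 4))
  "Raw storage for the table, assuming BYTES-PER-ELEMENT bytes per value."
  (* (table-elements) bytes-per-element))

(defun fits-in-one-array-p ()
  "True if this implementation allows one array with that many elements."
  (< (table-elements) array-total-size-limit))
```

Whether FITS-IN-ONE-ARRAY-P returns true depends on the implementation and word size, which is the limitation described above.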

I don't know much about these things, but I think that the best
solution would be a database of some kind. I am wondering what would
be the simplest and most hassle-free way to do this in CL (if that
matters, I am using SBCL).

If I think of this table as a matrix, I will save data along one
dimension (e.g. rows, each draw), but I will retrieve data along the
other (e.g. columns, multiple draws for each variable). The second step
will be done more often, so I want that to be fast. Does it matter
for speed which dimension I consider rows or columns?
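On the rows-versus-columns question, one relevant fact is that Common Lisp arrays are indexed in row-major order, so elements that differ only in the last index are adjacent. The following is a minimal in-memory sketch with tiny, made-up dimensions and hypothetical function names; for the full-size table the same layout idea would have to be carried over to whatever file or database holds the data:

```lisp
;; Toy dimensions for illustration only.
(defparameter *n-vars* 3)
(defparameter *n-draws* 4)

;; The draw index goes LAST, so all draws of one variable are adjacent.
(defparameter *table*
  (make-array (list *n-vars* *n-draws*)
              :element-type 'single-float
              :initial-element 0.0f0))

(defun store-draw (table draw-index draw)
  "Scatter one draw (a sequence of single floats) across the variables."
  (dotimes (var (length draw) table)
    (setf (aref table var draw-index) (elt draw var))))

(defun draws-of-variable (table var)
  "Return all draws of variable VAR as a vector sharing storage with TABLE."
  (make-array (array-dimension table 1)
              :element-type (array-element-type table)
              :displaced-to table
              :displaced-index-offset (array-row-major-index table var 0)))
```

With this layout the frequent operation (all draws of one variable) touches one contiguous run, while the less frequent one (storing a draw) does the scattered writes.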

I am totally clueless about these things (never used databases), so links
to examples, tutorials, CL code snippets etc would be welcome.

Thanks,

Tamas
From: Mario S. Mommer on

Hi Tamas,

IMO, a DB is overkill. Best to use the file system for that, and store
these large vectors in FASLs. Some provision has to be made to upgrade
the FASLs when SBCL's version changes, though. But a FASL really loads
fast and your vector is there already in the right form.

I'd also like to point out that the price for a 64 bit machine with
enough ram is not *that* high. Buying a new machine is probably by
far the least painful solution.

Regards,
Mario.
From: Tamas K Papp on
On Mon, 01 Feb 2010 13:50:43 +0100, Mario S. Mommer wrote:

> Hi Tamas,
>
> IMO, a DB is overkill. Best to use the file system for that, and store
> these large vectors in FASLs. Some provision has to be made to upgrade
> the FASLs when SBCL's version changes, though. But a FASL really loads
> fast and your vector is there already in the right form.

Hi Mario,

Thanks for your reply. I don't really get the solution with FASL.
How do you create them---save the vector to a file and compile it?
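One possible reading of the FASL suggestion, not necessarily what Mario had in mind, is to let the compiler dump the vector as a literal object. This is a sketch under several assumptions: everything is evaluated in the CL-USER package, *READ-EVAL* is true (the default), and the names SAVE-VECTOR-AS-FASL, *VECTOR-TO-DUMP* and *LOADED-DRAW* are hypothetical:

```lisp
;; Sketch only. Assumes the CL-USER package and a true *READ-EVAL*
;; at the time COMPILE-FILE runs.
(in-package :cl-user)

(defvar *vector-to-dump* nil
  "Bound to the vector being saved while COMPILE-FILE reads the stub file.")

(defun save-vector-as-fasl (vector source-path)
  "Write a small stub source file at SOURCE-PATH and compile it.
Returns the pathname of the resulting FASL."
  (with-open-file (out source-path :direction :output :if-exists :supersede)
    (write-line "(in-package :cl-user)" out)
    ;; #. splices the vector object itself into the form as a literal,
    ;; so the compiler dumps it into the FASL.
    (write-line "(defparameter *loaded-draw* '#.*vector-to-dump*)" out))
  (let ((*vector-to-dump* vector))
    (values (compile-file source-path))))
```

Loading the returned FASL with LOAD then sets *LOADED-DRAW* to a vector with the same contents and element type. Since every FASL written this way defines the same variable, a real version would need some naming scheme per draw or per block of draws.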

> I'd also like to point out that the price for a 64 bit machine with
> enough ram is not *that* high. Buying a new machine is probably by
> far the least painful solution.

I have a good 64-bit machine with tons of RAM, but in a momentary
lapse of reason, I installed 32-bit Ubuntu on it in the past. Maybe a
reinstall would be less hassle than a DB.

I notice that you are using SBCL. I posted the message below to the
SBCL list, but got no reply so far. I wonder if you could help me:

"Currently, I am using SBCL on 32-bit Ubuntu (x86). I ran into a
specific limitation (fixnum limits my array size), so am wondering
whether to switch to 64-bit SBCL. This would require a reinstall,
which is not a major issue but a minor PITA which would surely take a
few hours. Before I undertake this, I have a few questions:

- how big is ARRAY-TOTAL-SIZE-LIMIT on 64-bit SBCL? Will this allow
me to use larger arrays? Is there another limit (provided that I
take enough memory with a --dynamic-space-size)?

- Does 64-bit result in significantly higher memory consumption? I
understand that fixnums will now take twice the space, but does
anything else take up more memory?

- Does 64 vs 32 bit have any impact on speed (positively or
negatively)? Can single floats be unboxed in 64-bit?"
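For anyone who wants to compare installations, the limits asked about can be read from a running image. A small sketch using only standard Common Lisp; the function name ARRAY-LIMITS-REPORT is made up, and the values it returns depend entirely on the implementation and platform:

```lisp
(defun array-limits-report ()
  "Return a property list describing array-related limits of this Lisp image."
  (list :implementation (lisp-implementation-type)
        :version (lisp-implementation-version)
        :machine (machine-type)
        :array-total-size-limit array-total-size-limit
        :array-dimension-limit array-dimension-limit
        :most-positive-fixnum most-positive-fixnum
        ;; A result of SINGLE-FLOAT means arrays declared with that
        ;; element type are specialized rather than holding general objects.
        :single-float-array-type (upgraded-array-element-type 'single-float)))
```

Running it on both a 32-bit and a 64-bit image gives a direct side-by-side comparison.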

Thanks,

Tamas
From: Pascal J. Bourguignon on
m_mommer(a)yahoo.com (Mario S. Mommer) writes:

> Hi Tamas,
>
> IMO, a DB is overkill. Best to use the file system for that, and store
> these large vectors in FASLs. Some provision has to be made to upgrade
> the FASLs when SBCL's version changes, though. But a FASL really loads
> fast and your vector is there already in the right form.
>
> I'd also like to point out that the price for a 64 bit machine with
> enough ram is not *that* high. Buying a new machine is probably by
> far the least painful solution.

Err, it is also inevitable. Last weekend I went to buy the cheapest
new CPU I could find (to repair an old computer), and couldn't find
anything less than an Athlon 64 X2 Dual Core 6000+ with 2 GB DDR2 RAM.

The time to upgrade to a 64-bit OS will be more costly...


--
__Pascal Bourguignon__
From: Pascal J. Bourguignon on
Tamas K Papp <tkpapp(a)gmail.com> writes:

> - Does 64-bit result in significantly higher memory consumption? I
> understand that fixnums will now take twice the space, but does
> anything else take up more memory?

Basically, yes. (Strings won't take twice the memory, but they
already took 32 bits per character...). On the other hand, nowadays
fast memory is less than half the price it was a few years ago. So
plan to more than double the memory size, but be happy, you'll spend
less on memory than before.


> - Does 64 vs 32 bit have any impact on speed (positively or
> negatively)? Can single floats be unboxed in 64-bit?"

Depends on the application.


--
__Pascal Bourguignon__