From: Rune Allnor on
Hi all.

I am implementing an application for bathymetric terrain modeling.
The application basically takes a list of (x,y,z) data points and
manipulates them. Present-day sonar technologies deliver on the
order of 5-10 million data points per data set.

Emerging sonar technologies could well increase the spatial sampling
density such that near-future bathymetry data sets extend into some
100 million - 1 billion points/data set.

With 64-bit OSs, it seems that the HW to deal with such data
sets will be available in the near future, which means I have to plan
my software accordingly.

A 32-bit 'long int' can be used to index byte arrays up to 2 GB in
length; a 32-bit 'unsigned long int' can be used to index byte arrays
of 4 GB length. This is more than sufficient today; it may not
continue to be in, say, a 5-10 year perspective.

What would be the alternative ways to prepare for implementing
arrays of more than 2 billion elements under the emerging 64 bit OSs?
Use templates everywhere? Use 'long long int's? Something else?
What are the pros and cons of the different ways?

I do my programming on a Win XP laptop, but make efforts to
be able to use the same source code on other systems and OSs.
The code obviously has to be recompiled when ported, so
there is no need to make the executables portable. I am only
concerned about source code maintenance.

I want my program to run fast right now; at the same time, I do
not want to reimplement everything from scratch in three years,
when the program is ported to a larger computer system.

Rune


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Maxim Yegorushkin on
Rune Allnor wrote:

[]

> A 32-bit 'long int' can be used to index byte arrays up to 2 GB in
> length; a 32-bit 'unsigned long int' can be used to index byte arrays
> of 4 GB length. This is more than sufficient today; it may not
> continue to be in, say, a 5-10 year perspective.

On Linux and many Unixes, long has the size of void*, i.e. long is 32
bits on a 32-bit platform and 64 bits on a 64-bit platform. This is
not true on 64-bit Windows, where long remains 32 bits.

> What would be the alternative ways to prepare for implementing
> arrays of more than 2 billion elements under the emerging 64 bit OSs?
> Use templates everywhere? Use 'long long int's? Something else?
> What are the pros and cons of the different ways?

Use size_t.
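
A minimal sketch of what this could look like for the point data
described above. The names Point, depth_at and sum_depths are
hypothetical and only for illustration; nothing here is taken from
the original application. The index is std::size_t throughout, so
its width follows the platform rather than the width of long.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type for illustration; not from the original post.
struct Point {
    double x, y, z;
};

// Bounds-checked (in debug builds) access using a std::size_t index.
inline double depth_at(const std::vector<Point>& pts, std::size_t i) {
    assert(i < pts.size());
    return pts[i].z;
}

// Sum all z values. The loop counter has the same unsigned width as
// the value returned by size() on typical implementations, so it does
// not overflow before reaching the last element.
inline double sum_depths(const std::vector<Point>& pts) {
    double total = 0.0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        total += pts[i].z;
    }
    return total;
}
```

With this style, moving from a 32-bit to a 64-bit build should need
only a recompile, assuming no index is ever stored in an int or long
along the way.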



From: andrew_nuss@yahoo.com on

Maxim Yegorushkin wrote:
> Rune Allnor wrote:
>
>
> Use size_t.
>

Aside from using size_t for the index, and using a template-based
array, I would consider using a btree of depth 2 (a table of chunks),
with the first index being the high bits of size_t and the second
index being the low bits of size_t. This costs a shift, a mask and
2 array indexes versus 1 array index. I am doing this for my very
large arrays; though it is a little slower to index, it dramatically
reduces the problem of heap fragmentation.
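
The scheme above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the poster's actual code:
the class name ChunkedArray and the parameter CHUNK_BITS (here 2^16
elements per chunk) are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-level array: a top-level table of fixed-size chunks.
// CHUNK_BITS is an assumed tuning parameter, not from the original post.
template <typename T, std::size_t CHUNK_BITS = 16>
class ChunkedArray {
public:
    explicit ChunkedArray(std::size_t n) : size_(n) {
        // Round up so a partially used last chunk is still allocated.
        const std::size_t n_chunks = (n + CHUNK_SIZE - 1) >> CHUNK_BITS;
        chunks_.resize(n_chunks);
        for (std::size_t c = 0; c < n_chunks; ++c) {
            // Each chunk is a separate, moderately sized allocation,
            // so no single huge contiguous block is ever requested.
            chunks_[c].resize(CHUNK_SIZE);
        }
    }

    std::size_t size() const { return size_; }

    // High bits select the chunk, low bits select the slot within it.
    T& operator[](std::size_t i) {
        assert(i < size_);
        return chunks_[i >> CHUNK_BITS][i & CHUNK_MASK];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return chunks_[i >> CHUNK_BITS][i & CHUNK_MASK];
    }

private:
    static const std::size_t CHUNK_SIZE = std::size_t(1) << CHUNK_BITS;
    static const std::size_t CHUNK_MASK = CHUNK_SIZE - 1;
    std::size_t size_;
    std::vector<std::vector<T> > chunks_;
};
```

Note that the index is still a single size_t, so this layout addresses
fragmentation rather than the width of the index type. For simplicity
the sketch always allocates the last chunk at full size.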



From: Rune Allnor on

andrew_nuss(a)yahoo.com wrote:
> Maxim Yegorushkin wrote:
> > Rune Allnor wrote:
> >
> >
> > Use size_t.
> >
>
> Aside from using size_t for the index, and using a template-based
> array, I would consider using a btree of depth 2 (a table of chunks),
> with the first index being the high bits of size_t and the second
> index being the low bits of size_t. This costs a shift, a mask and
> 2 array indexes versus 1 array index. I am doing this for my very
> large arrays; though it is a little slower to index, it dramatically
> reduces the problem of heap fragmentation.

Interesting.

Could you elaborate, please? First, I can't really see how this
solves the 2G indexing limit (unless one can hide some
'long long long int' data types inside a template class)?
Second, I get horrible flashbacks to the times of MSDOS
and 64k segment size limits...

Rune



From: Jens Kilian on
"Maxim Yegorushkin" <maxim.yegorushkin(a)gmail.com> writes:
> Use size_t.

IIRC, the recommended index type for an overloaded operator[] is ptrdiff_t.
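
A small sketch of an operator[] that takes ptrdiff_t. The class
OffsetView is hypothetical and exists only to illustrate the idea.
One reason sometimes given for a signed index is that it mirrors
built-in pointer subscripting, where negative offsets are meaningful;
take that as an assumption rather than a statement of the rationale.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view whose origin may point into the middle
// of a buffer, so that negative indices are valid, as with a raw pointer.
template <typename T>
class OffsetView {
public:
    // Valid indices are the half-open range [lo, hi) relative to origin.
    OffsetView(T* origin, std::ptrdiff_t lo, std::ptrdiff_t hi)
        : origin_(origin), lo_(lo), hi_(hi) {}

    // Signed index, like built-in pointer subscripting.
    T& operator[](std::ptrdiff_t i) const {
        assert(i >= lo_ && i < hi_);
        return origin_[i];
    }

private:
    T* origin_;
    std::ptrdiff_t lo_;
    std::ptrdiff_t hi_;
};
```

The trade-off is range: ptrdiff_t is signed, so it covers half of
what size_t can express at the same width.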
--
mailto:jjk(a)acm.org         As the air to a bird, or the sea to a fish,
http://www.bawue.de/~jjk/   so is contempt to the contemptible. [Blake]
http://del.icio.us/jjk
