From: persres on
Hi,
I have some low-level questions, all about a user-space
scenario.

1) How do I get architecture specific info -
a) whether machine is SMP/ASMP/NUMA?
b) Number of caches and how they are shared across cores?
c) size of cacheline?

2) If I say _declspec(align(64)) will the data be cache aligned? It
seems MSFT specific, any platform independent ways?

3) How do I allocate cache-aligned memory on the heap? I believe it is
_aligned_malloc. Is there any C++ call?

Also, it is MSFT specific. Is there any platform-independent call for
aligned malloc?

4) Can we get page-aligned memory in user space? Can we ask for
non-cached memory in the user-space heap?

5) Is cacheline size 64 bytes on all x86 machines?

I hope I could get answers to some of those. Thanks very much.
Cheers
From: Francois PIETTE on
> I have some low level questions, all discussions in user space
> scenario.
>
> 1) How do I get architecture specific info -
> a) whether machine is SMP/ASMP/NUMA?
> b) Number of caches and how they are shared across cores?
> c) size of cacheline?

The System Information Development Kit is probably something useful for you.
See http://www.cpuid-pro.com/sysinfo.php

--
francois.piette(a)overbyte.be
The author of the freeware multi-tier middleware MidWare
The author of the freeware Internet Component Suite (ICS)
http://www.overbyte.be


From: Tim Roberts on

persres <persres(a)googlemail.com> wrote:
>
> I have some low level questions, all discussions in user space
>scenario.
>
>1) How do I get architecture specific info -
> a) whether machine is SMP/ASMP/NUMA?
> b) Number of caches and how they are shared across cores?
> c) size of cacheline?

You can get some of this information from WMI:
http://msdn.microsoft.com/en-us/library/aa394373.aspx
and you can get some of this information from the CPUID instruction.
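
As a rough sketch of reading the cache line size at run time without WMI or raw CPUID: the Win32 call GetLogicalProcessorInformation reports cache descriptors, and on Linux with glibc sysconf can report the L1 data line size. Neither call is mentioned above, so treat both as assumptions about your platform. The helper name cache_line_size is hypothetical, and it returns 0 when the value is not available.

```cpp
#include <cstddef>
#include <vector>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <unistd.h>
#endif

// Hypothetical helper: L1 data cache line size in bytes, or 0 if unknown.
std::size_t cache_line_size() {
#if defined(_WIN32)
    DWORD bytes = 0;
    // The first call is expected to fail and report the buffer size needed.
    GetLogicalProcessorInformation(nullptr, &bytes);
    if (bytes == 0) return 0;

    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
        bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (!GetLogicalProcessorInformation(info.data(), &bytes)) return 0;

    for (const auto& entry : info) {
        // Look for a level-1 cache that holds data (data or unified).
        if (entry.Relationship == RelationCache &&
            entry.Cache.Level == 1 &&
            entry.Cache.Type != CacheInstruction) {
            return entry.Cache.LineSize;
        }
    }
    return 0;
#elif defined(__linux__) && defined(_SC_LEVEL1_DCACHE_LINESIZE)
    // glibc extension; may report 0 or -1 where the value is not exposed.
    long size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    return size > 0 ? static_cast<std::size_t>(size) : 0;
#else
    return 0;  // No known query on this platform.
#endif
}
```

Callers should fall back to a conservative default (for example 64) when the function returns 0.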

>2) If I say _declspec(align(64)) will the data be cache aligned? It
>seems MSFT specific, any platform independent ways?

There is nothing platform-independent about caches. That's a very
implementation-specific detail. Indeed, you can't assume that every
processor even has a cache.

>3) how do I allocate cache-aligned memory in heap?. I believe it is
>_aligned_malloc?. Is there any C++ call?

_aligned_malloc works just fine in C++. You can always use placement "new"
with it:

MyObject *pobj = new (_aligned_malloc(sizeof(MyObject), 64)) MyObject;

Note, however, that cache-aligning an object is not a terribly useful thing
to do.
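
For anyone who wants the placement-new approach spelled out, here is a minimal sketch. It assumes a 64-byte alignment (taken from the question) and a stand-in MyObject type. The non-Windows branch uses POSIX posix_memalign, which is not mentioned in this thread and is only there so the sketch builds elsewhere. All helper names are hypothetical.

```cpp
#include <cstddef>
#include <new>

#if defined(_WIN32)
#include <malloc.h>
#else
#include <stdlib.h>
#endif

// Stand-in for the MyObject type from the post.
struct MyObject {
    int value = 42;
};

// Allocate `size` bytes aligned to `alignment` (a power of two).
// Returns nullptr on failure.
void* aligned_alloc_bytes(std::size_t size, std::size_t alignment) {
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);
#else
    void* p = nullptr;
    return posix_memalign(&p, alignment, size) == 0 ? p : nullptr;
#endif
}

// Release memory obtained from aligned_alloc_bytes.
void aligned_free_bytes(void* p) {
#if defined(_WIN32)
    _aligned_free(p);  // must pair with _aligned_malloc, not free()
#else
    ::free(p);
#endif
}

// Construct a MyObject in 64-byte-aligned storage.
MyObject* make_aligned_object() {
    void* raw = aligned_alloc_bytes(sizeof(MyObject), 64);
    return raw ? new (raw) MyObject : nullptr;
}

// Placement new has no matching delete expression: call the destructor
// explicitly, then return the storage to the aligned allocator.
void destroy_aligned_object(MyObject* obj) {
    if (!obj) return;
    obj->~MyObject();
    aligned_free_bytes(obj);
}
```

The point that is easy to miss is the teardown: calling plain delete on such a pointer would hand the memory to the wrong deallocator.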

> Also, it is MSFT specific. Any platform independent call for
>aligned malloc?

No. Again, caching is a very implementation-specific detail.

>4) Can we get page aligned memory in user space?

You can pass any alignment you want to _aligned_malloc. Why would you want
to?

You can also call the VirtualAlloc API to allocate pages directly.
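
A minimal sketch of the VirtualAlloc route, for illustration only. The helper names are hypothetical, and the non-Windows branch (mmap) is an assumption added so the sketch can be built elsewhere; it is not something suggested in this thread.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(_WIN32)
#include <windows.h>
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

// Size of one virtual-memory page in bytes.
std::size_t page_size() {
#if defined(_WIN32)
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return si.dwPageSize;
#else
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
#endif
}

// Allocate readable/writable pages directly from the OS.
// The returned address is page aligned; nullptr on failure.
void* alloc_pages(std::size_t bytes) {
#if defined(_WIN32)
    return VirtualAlloc(nullptr, bytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
#endif
}

// Release pages obtained from alloc_pages.
void free_pages(void* p, std::size_t bytes) {
#if defined(_WIN32)
    (void)bytes;  // MEM_RELEASE requires a size of 0
    VirtualFree(p, 0, MEM_RELEASE);
#else
    munmap(p, bytes);
#endif
}
```

Allocations made this way are rounded up to whole pages, so this only makes sense for large or page-granular buffers.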

>can we ask for non-cached memory in user space heap?

No. That requires a kernel module. Why would you want it?

>5) Is cacheline size 64 bytes on all x86 machines?

Most, but not all. Also remember that current x86 processors include 3
different levels of cache.
--
Tim Roberts, timr(a)probo.com
Providenza & Boekelheide, Inc.
From: Chris M. Thomasson on
"Tim Roberts" <timr(a)probo.com> wrote in message
news:nuovl5phr0rqo0ts9c5v32if30pg9bhpic(a)4ax.com...
>
> persres <persres(a)googlemail.com> wrote:
[...]
>>3) how do I allocate cache-aligned memory in heap?. I believe it is
>>_aligned_malloc?. Is there any C++ call?
>
> _aligned_malloc works just fine in C++. You can always use placement
> "new"
> with it:
>
> MyObject *pobj = new (_aligned_malloc(sizeof(MyObject), 64)) MyObject;
>
> Note, however, that cache-aligning an object is not a terribly useful
> thing
> to do.

I am curious as to what makes you say that? Padding critical data-structures
to L2 cache lines and aligning them in memory on cache line boundaries can
be __essential__ if you are interested in scalability and performance.
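
To illustrate the kind of padding I mean, here is a small sketch using C++11 alignas. The 64-byte line size is an assumption (common on x86, not guaranteed), and PaddedCounter and per_thread_counters are made-up names for the example.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; query it at run time if you need to be sure.
constexpr std::size_t kCacheLine = 64;

// alignas both aligns the object and rounds sizeof up to a multiple of the
// alignment, so each counter occupies a full cache line of its own.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(alignof(PaddedCounter) == kCacheLine, "counter must be line aligned");
static_assert(sizeof(PaddedCounter) == kCacheLine, "counter must fill one line");

// One counter per thread: adjacent elements never share a cache line, so
// a write by one thread does not invalidate a neighbour's line.
PaddedCounter per_thread_counters[4];
```

Without the alignas, four such counters would typically sit in a single line and every increment would bounce that line between cores.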

[...]

From: persres on
On 27 Jan, 09:42, "Chris M. Thomasson" <n...(a)spam.invalid> wrote:
> "Tim Roberts" <t...(a)probo.com> wrote in message
>
> news:nuovl5phr0rqo0ts9c5v32if30pg9bhpic(a)4ax.com...
>
>
>
> > persres <pers...(a)googlemail.com> wrote:
> [...]
> >>3) how do I allocate cache-aligned memory in heap?. I believe it is
> >>_aligned_malloc?. Is there any C++ call?
>
> > _aligned_malloc works just fine in C++.  You can always use placement
> > "new"
> > with it:
>
Thanks for all the responses.
> >   MyObject *pobj = new (_aligned_malloc(sizeof(MyObject), 64)) MyObject;
>
> > Note, however, that cache-aligning an object is not a terribly useful
> > thing
> > to do.
>
I have an array of 64 chars (bytes) all under the same lock. I think
making sure they are in the same cache line can speed things up.
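
Roughly what I have in mind, as a sketch only: it assumes a 64-byte line, and LineBuffer, Guarded and same_cache_line are names made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

// Assumed cache line size (64 bytes is typical on x86, not guaranteed).
constexpr std::size_t kCacheLine = 64;

// 64 bytes aligned on a 64-byte boundary: the array cannot straddle two lines.
struct alignas(kCacheLine) LineBuffer {
    char bytes[kCacheLine];
};

// The lock and the data it protects. Because LineBuffer is line aligned,
// the data starts on its own line after the mutex.
struct Guarded {
    std::mutex lock;
    LineBuffer data;
};

// True if both addresses fall in the same (assumed) cache line.
bool same_cache_line(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kCacheLine ==
           reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
}
```

With this layout the first and last byte of the array always land in one line, so taking the lock and touching any of the 64 bytes pulls in at most two lines.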

> I am curious as to what makes you say that? Padding critical data-structures
> to L2 cache lines and aligning them in memory on cache line boundaries can
> be __essential__ if you are interested in scalability and performance.
>
> [...]