From: Dmitry Torokhov on
On Tue, May 11, 2010 at 11:48:36PM +0200, Johannes Weiner wrote:
> On Tue, May 11, 2010 at 04:54:41PM -0400, Mike Frysinger wrote:
> > does the phrase "DMA safe buffer" imply cache alignment ?
>
> I guess that depends on the architectural requirements. On x86,
> apparently, not so much. On ARM, probably yes, as it's the
> requirement to properly maintain coherency.

It looks like ARM (and a few others) do this:

[dtor(a)hammer work]$ grep -r ARCH_KMALLOC_MINALIGN arch/
arch/sh/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
arch/frv/include/asm/mem-layout.h:#define ARCH_KMALLOC_MINALIGN 8
arch/powerpc/include/asm/page_32.h:#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
arch/arm/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
arch/microblaze/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES
arch/mips/include/asm/mach-tx49xx/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
arch/mips/include/asm/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
arch/mips/include/asm/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
arch/avr32/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES

--
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Cox on
> SPI transfer into a DMA safe buffer. what is the exact API to
> dynamically allocate memory for the structure with this buffer
> embedded in it such that the start of the structure is cache aligned?
> creating a dedicated kmem cache may work, but it isnt a scalable
> solution if every SPI driver needs to create its own cache.

If you are embedding structures then one solution is to cheat a bit and
make the structure, compiler and existing kernel compile abuse do the
work. Something like

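struct {
	void *except_where_prohibited;
	long boat;
	unsigned photograph;

	/* zero-length, cache-line-aligned tail: pads the struct out to a
	 * cache line boundary (the member form of the annotation is
	 * ____cacheline_aligned; __cacheline_aligned also sets a section) */
	u8 pad[0] ____cacheline_aligned;
};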

ought to do the trick providing you align the start of the object - which
should happen naturally with kmalloc or dma_* apis
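
A rough user-space sketch of the same idea, in case it helps to see the layout effect. Everything here is an assumption for illustration: the 64-byte line size (EXAMPLE_CACHE_BYTES), the struct and field names, the GCC/Clang aligned attribute standing in for the kernel's cache-line annotation, and aligned_alloc() standing in for an allocator that returns cache-aligned memory. Here the aligned member is the receive buffer itself rather than a zero-length pad.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache line size, for illustration only. */
#define EXAMPLE_CACHE_BYTES 64

/* Hypothetical driver state with an embedded receive buffer. */
struct example_dev {
	void *ctx;
	long count;
	unsigned flags;

	/* The aligned attribute pushes rx_buf to the next cache line
	 * boundary and raises the alignment of the whole struct, so
	 * sizeof() is rounded up to a multiple of the line size. */
	uint8_t rx_buf[4] __attribute__((aligned(EXAMPLE_CACHE_BYTES)));
};

/* The layout properties the trick relies on, checked at compile time. */
_Static_assert(offsetof(struct example_dev, rx_buf) % EXAMPLE_CACHE_BYTES == 0,
	       "rx_buf must start on a cache line boundary");
_Static_assert(sizeof(struct example_dev) % EXAMPLE_CACHE_BYTES == 0,
	       "struct must end on a cache line boundary");

/* Allocate the struct with its start on a cache line boundary.
 * aligned_alloc() needs a size that is a multiple of the alignment,
 * which the second static assertion guarantees. */
static struct example_dev *example_dev_alloc(void)
{
	struct example_dev *d =
		aligned_alloc(EXAMPLE_CACHE_BYTES, sizeof(struct example_dev));

	if (d)
		assert((uintptr_t)d % EXAMPLE_CACHE_BYTES == 0);
	return d;
}
```

The layout only helps if the start of the object is itself cache aligned, which is exactly the property of kmalloc being questioned in this thread.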


From: Mike Frysinger on
On Tue, May 11, 2010 at 17:53, Dmitry Torokhov wrote:
> On Tue, May 11, 2010 at 11:48:36PM +0200, Johannes Weiner wrote:
>> On Tue, May 11, 2010 at 04:54:41PM -0400, Mike Frysinger wrote:
>> > does the phrase "DMA safe buffer" imply cache alignment ?
>>
>> I guess that depends on the architectural requirements.  On x86,
>> apparently, not so much.  On ARM, probably yes, as it's the
>> requirement to properly maintain coherency.
>
> It looks like ARM (and a few others) do this:
>
> [dtor(a)hammer work]$ grep -r ARCH_KMALLOC_MINALIGN arch/
> arch/sh/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/frv/include/asm/mem-layout.h:#define       ARCH_KMALLOC_MINALIGN           8
> arch/powerpc/include/asm/page_32.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/arm/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN      L1_CACHE_BYTES
> arch/microblaze/include/asm/page.h:#define ARCH_KMALLOC_MINALIGN        L1_CACHE_BYTES
> arch/mips/include/asm/mach-tx49xx/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
> arch/mips/include/asm/mach-ip27/kmalloc.h: * All happy, no need to define ARCH_KMALLOC_MINALIGN
> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 32
> arch/mips/include/asm/mach-ip32/kmalloc.h:#define ARCH_KMALLOC_MINALIGN 128
> arch/mips/include/asm/mach-generic/kmalloc.h:#define ARCH_KMALLOC_MINALIGN      128
> arch/avr32/include/asm/cache.h:#define ARCH_KMALLOC_MINALIGN    L1_CACHE_BYTES

if ARCH_KMALLOC_MINALIGN is not defined, the current allocators default to:
slub - alignof(unsigned long long)
slab - alignof(unsigned long long)
slob - alignof(unsigned long)
which for many arches can mean an alignment of merely 4 or 8
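
To make the fallback concrete, here is a small user-space sketch. It only mimics the pattern described above and is not the allocator code: the macro is redefined locally, the helper names are made up, and _Alignof(unsigned long long) is used as the default as in the slab and slub cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mimics the pattern: an arch header may define the minimum kmalloc
 * alignment; if it does not, fall back to the alignment of
 * unsigned long long. */
#ifndef ARCH_KMALLOC_MINALIGN
#define ARCH_KMALLOC_MINALIGN _Alignof(unsigned long long)
#endif

/* Hypothetical helper: the minimum alignment an allocation would get. */
static size_t kmalloc_minalign(void)
{
	return ARCH_KMALLOC_MINALIGN;
}

/* True if every pointer aligned to minalign is also aligned to a
 * cache line of the given size. */
static bool minalign_covers_line(size_t minalign, size_t line)
{
	assert(minalign != 0 && line != 0);
	return minalign % line == 0;
}
```

With a default of 8 and a 32-byte line, minalign_covers_line(8, 32) is false, while an arch that sets the minimum to 128 passes for any line size dividing 128.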

lets look at the cacheline sizes for arches that dont set
ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES:
- alpha - 32 or 64
- frv - 32 or 64
- blackfin - 32
- parisc - 32 or 64
- mn10300 - 16
- s390 - 256
- score - 16
- sparc - 32
- xtensa - 16 or 32

assuming alpha and s390 handle cache coherency in hardware, it looks
to me like the proposed assumption (kmalloc returns cache-aligned
pointers when cache management is in software) does not hold true.

so should these other arches also be setting ARCH_KMALLOC_MINALIGN to
L1_CACHE_BYTES ?
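
to spell out why an alignment of 8 with a 32 byte cacheline is a problem, here is a minimal sketch of the check. the function names and the addresses used with them are made up for illustration, and the line size is assumed to be a power of two:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address of the first byte of the cache line containing addr.
 * The line size must be a power of two. */
static uintptr_t line_start(uintptr_t addr, size_t line)
{
	assert(line != 0 && (line & (line - 1)) == 0);
	return addr & ~((uintptr_t)line - 1);
}

/* True if the buffer [addr, addr + len) begins and ends on cache line
 * boundaries, i.e. no unrelated data shares a line with it and a
 * software invalidate of the buffer cannot clobber a neighbour. */
static bool owns_its_cache_lines(uintptr_t addr, size_t len, size_t line)
{
	assert(len > 0);
	return line_start(addr, line) == addr &&
	       line_start(addr + len, line) == addr + len;
}
```

a 16 byte buffer at an 8 byte aligned address such as 0x1008 fails this check with 32 byte lines, and so does a 16 byte buffer at 0x1000, since whatever sits in the other half of that line gets invalidated along with it.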
-mike
From: FUJITA Tomonori on
On Wed, 12 May 2010 00:01:02 +0300
Pekka Enberg <penberg(a)cs.helsinki.fi> wrote:

> Mike Frysinger wrote:
> > On Tue, May 11, 2010 at 16:46, Christoph Lameter wrote:
> >> On Tue, 11 May 2010, Mike Frysinger wrote:
> >>>> DMA. If the arch can only DMA into cacheline aligned objects then the
> >>>> correct method is to force kmalloc alignment to cacheline size.
> >>> these are SPI drivers and are usable on any arch that supports a SPI
> >>> bus (which is pretty much every arch). forget about "embedded"
> >>> arches.
> >>>
> >>> the issue here is simple: a SPI driver (AD7877) needs to do a receive
> >>> SPI transfer into a DMA safe buffer. what is the exact API to
> >>> dynamically allocate memory for the structure with this buffer
> >>> embedded in it such that the start of the structure is cache aligned?
> >>> creating a dedicated kmem cache may work, but it isnt a scalable
> >>> solution if every SPI driver needs to create its own cache.
> >> kmalloc returns a pointer to a DMA safe buffer. There is no requirement on
> >> the x86 hardware that the DMA buffers have to be cache aligned. Cachelines
> >> will be invalidated as needed.
> >
> > so this guarantee is made by the kmalloc() API ? and for arches where
> > the cacheline invalidation is handled in software rather than
> > hardware, they must declare a min alignment value for kmalloc to be at
> > least as big as their cache alignment ?
> >
> > does the phrase "DMA safe buffer" imply cache alignment ?
>
> Yes, you should be able to DMA into kmalloc'd memory. IIRC the block or
> the SCSI layer depends on that.

Yeah, SCSI subsystem and drivers, and block drivers depend on it.
From: Paul Mundt on
On Tue, May 11, 2010 at 11:47:20PM +0300, Pekka Enberg wrote:
> Mike Frysinger wrote:
> >On Tue, May 11, 2010 at 16:38, Pekka Enberg wrote:
> >>Mike Frysinger wrote:
> >>>that is a question for David/Grant. i'm not the SPI core maintainer,
> >>>i'm merely watching over some SPI drivers. however, this answer also
> >>>doesnt sound like it's thinking big enough because what you're
> >>>proposing isnt specific to the SPI bus -- any time a DMA safe buffer
> >>>is needed dynamically, this function could be used.
> >>Well, we have dma_alloc_coherent(), shouldn't you be using that instead?
> >
> >my understanding is that dma_alloc_coherent() gives you a buffer that
> >is always coherent. the SPI layers take care of flushing and such on
> >the fly which means allocating coherent memory is overkill and bad for
> >performance.
>
> OK, I'm out of my expert area here but if dma_alloc_coherent() doesn't
> work for you, you should probably extend the DMA API, not kmalloc().
>
Note that the DMA API already has dma_alloc_noncoherent() for these
sorts of cases. If the driver is taking care of cache maintenance then
dma_alloc_noncoherent() is certainly a reasonable way to go.

Most architectures today simply wrap dma_alloc_noncoherent() to
dma_alloc_coherent(), but if there were more users of the API then
that would quickly change.