swiotlb detection should be memory hotplug aware ? [Kernel]

From: FUJITA Tomonori on 28 Jul 2010 06:20

On Fri, 23 Jul 2010 17:23:33 +0200
Andi Kleen <ak(a)linux.intel.com> wrote:

>
> > I was thinking about this at some point. I think the first step is to
> > make SWIOTLB use the debugfs to actually print out how much of its
> > buffers are used - and see if the 64MB is a good fit.
>
> swiotlb is nearly always wrongly sized. For most systems it's far too much,
> but for some not enough. I have some systemtap scripts around to instrument it.

True, it's impossible to preallocate the best iotlb size statically.

> Also it depends on the IO load, so if you size it reasonably you
> risk overflow on large IO (however these days this very rarely happens
> because all "serious" IO devices don't need swiotlb anymore)

Yeah, nowadays it's pointless to try to get good performance with
swiotlb.

> The other problem is that using only two bits for the needed address
> space is also extremely inefficient (4GB and 16MB on x86). Really want
> masks everywhere and optimize for the actual requirements.

swiotlb doesn't allocate GFP_DMA memory. It handles only GFP_DMA32.

swiotlb doesn't work for drivers with an odd dma mask (non 32bit),
but we have lived with that, so I don't think it's a big issue.

I think supporting dynamic expansion of swiotlb is enough. The
default swiotlb size, 64MB, is too large for the majority.

I have a half-baked patch for it. I'll send it later.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Andi Kleen on 28 Jul 2010 10:20

FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> writes:
>
>> The other problem is that using only two bits for the needed address
>> space is also extremely inefficient (4GB and 16MB on x86). Really want
>> masks everywhere and optimize for the actual requirements.
>
> swiotlb doesn't allocate GFP_DMA memory. It handles only GFP_DMA32.

I was lumping GFP_DMA and swiotlb together here. The
pci_alloc_consistent() function uses both interchangeably.
They are effectively the same thing these days
and are separated only by historical accident.

> I have a half-baked patch for it. I'll send it later.

The problem is still the *_map users, which usually cannot sleep,
and then it's difficult to grow.
For *_alloc it's relatively easy and to some extent already
implemented.

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.

From: FUJITA Tomonori on 28 Jul 2010 10:30

On Wed, 28 Jul 2010 13:09:57 +0200
Andi Kleen <andi(a)firstfloor.org> wrote:

> FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> writes:
> >
> >> The other problem is that using only two bits for the needed address
> >> space is also extremely inefficient (4GB and 16MB on x86). Really want
> >> masks everywhere and optimize for the actual requirements.
> >
> > swiotlb doesn't allocate GFP_DMA memory. It handles only GFP_DMA32.
>
> I was lumping GFP_DMA and swiotlb together here. The
> pci_alloc_consistent() function uses both interchangeably.
> They are effectively the same thing these days
> and are separated only by historical accident.

Sorry, I meant ZONE_DMA.

You are talking about your dma mask allocation patchset, right?

I meant that swiotlb doesn't need to handle ZONE_DMA. It handles only
devices that can handle ZONE_DMA32.
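
As a rough illustration of the mask idea discussed above, the sketch below
checks whether a buffer is reachable under a given DMA mask. A 24-bit mask
corresponds to the 16MB limit and a 32-bit mask to the 4GB limit mentioned
in the thread; the 30-bit case stands for a driver with an "odd" mask. The
helper names dma_bit_mask() and dma_reachable() are hypothetical and exist
only for this example; this is plain userspace C, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the low n address bits, for 1 <= n <= 64. */
static uint64_t dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return n == 64 ? ~0ULL : (1ULL << n) - 1;
}

/*
 * True if every byte of [addr, addr + len) lies at or below the mask,
 * i.e. the device can address the whole buffer directly.
 */
static bool dma_reachable(uint64_t addr, uint64_t len, uint64_t mask)
{
    uint64_t last;

    assert(len > 0);
    last = addr + len - 1;
    if (last < addr)        /* address range wrapped around */
        return false;
    return last <= mask;
}
```

With only two fixed limits, a device with a 30-bit mask has to be served
from the 16MB range even though it could reach far more memory; a per-mask
check like the one above is what "masks everywhere" would allow.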

> > I have a half-baked patch for it. I'll send it later.
>
> The problem is still the *_map users, which usually cannot sleep,
> and then it's difficult to grow.

Why can't we use GFP_NOWAIT?

My approach is to start small (like 4MB) and grow io_tbl in
chunks of, say, 4MB.
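
The grow-by-chunk idea can be sketched in userspace C as below. This is
only a model under stated assumptions, not the half-baked patch mentioned
earlier: malloc() stands in for a non-sleeping allocation such as one made
with GFP_NOWAIT, struct bounce_pool and the pool_*() helpers are
hypothetical names, the 16-chunk cap is an arbitrary choice that matches
the 64MB default, and slots are never returned to the pool. A real swiotlb
would also need suitably placed, physically contiguous memory, which this
model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_BYTES (4u << 20)  /* 4MB growth step, as proposed above */
#define MAX_CHUNKS  16          /* hypothetical cap: 16 * 4MB = 64MB */

struct bounce_pool {
    void  *chunk[MAX_CHUNKS];   /* separately allocated chunks */
    size_t used[MAX_CHUNKS];    /* bytes handed out from each chunk */
    int    nr_chunks;
};

/*
 * Add one chunk. malloc() models an allocation that must not sleep;
 * it can fail, so the caller has to handle a negative return.
 */
static int pool_grow(struct bounce_pool *p)
{
    void *mem;

    if (p->nr_chunks == MAX_CHUNKS)
        return -1;
    mem = malloc(CHUNK_BYTES);
    if (!mem)
        return -1;
    p->chunk[p->nr_chunks] = mem;
    p->used[p->nr_chunks] = 0;
    p->nr_chunks++;
    return 0;
}

/*
 * Hand out len bytes from the first chunk with room, growing the pool
 * by one chunk when nothing fits. Returns NULL if growth fails or the
 * request is larger than a single chunk.
 */
static void *pool_alloc(struct bounce_pool *p, size_t len)
{
    int i;

    assert(len > 0);
    if (len > CHUNK_BYTES)
        return NULL;
    for (;;) {
        for (i = 0; i < p->nr_chunks; i++) {
            if (CHUNK_BYTES - p->used[i] >= len) {
                void *ret = (char *)p->chunk[i] + p->used[i];

                p->used[i] += len;
                return ret;
            }
        }
        if (pool_grow(p) < 0)
            return NULL;
    }
}

/* Release every chunk and leave the pool empty. */
static void pool_destroy(struct bounce_pool *p)
{
    while (p->nr_chunks > 0) {
        p->nr_chunks--;
        free(p->chunk[p->nr_chunks]);
    }
}
```

The point of the model is that a mapping request which finds no room
either gets a fresh chunk without sleeping or fails immediately, which is
the behaviour a *_map caller in atomic context would need.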
