From: Alok Kataria on
Hi,

Reviving a 4 month old thread.
I am still waiting for any clues on this question below.

>> 2. Instead of checking the max_pfn value in pci_swiotlb_detect, check
>> for max_hotpluggable_pfn (or some such) value. Though I don't see such a
>> value readily available. I could parse the SRAT and get hotplug memory
>> information but that will make swiotlb detection logic a little too
>> complex. A quick look around srat_xx.c files and the acpi_memhotplug
>> module didn't find any useful API that could be used directly either.
>> So was wondering if any of you are aware of an easy way to get such
>> information ?

Thanks,
Alok

On Wed, 2010-03-17 at 15:48 -0700, Alok Kataria wrote:
> On Tue, 2010-03-16 at 05:45 -0700, Konrad Rzeszutek Wilk wrote:
> > On Tue, Mar 16, 2010 at 10:33:20AM +0900, FUJITA Tomonori wrote:
> > > On Mon, 15 Mar 2010 20:51:40 -0400
> > > Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com> wrote:
> > >
> > > > On Fri, Mar 12, 2010 at 07:09:41PM -0800, Andi Kleen wrote:
> > > > > , Alok Kataria wrote:
> > > > >
> > > > > Hi Alok,
> > > > >
> > > > >> Hi,
> > > > >>
> > > > >> Looking at the current code swiotlb is initialized for 64bit kernels
> > > > >> only when the max_pfn value is greater than 4G (MAX_DMA32_PFN value).
> > > > >> So in cases when the initial memory is less than 4GB the kernel boots
> > > > >> without enabling swiotlb, when we hotadd memory to such a kernel and go
> > > > >> beyond the 4G limit, swiotlb is still disabled. As a result when any
> > > > >> 32bit devices start using this newly added memory beyond 4G, the kernel
> > > > >> starts spitting error messages like below or in some cases it causes
> > > > >> kernel panics.
> > > > >
> > > > > Yes seems like a real problem.
> > > > >
> > > > >>
> > > > >> 1. Enable swiotlb for all 64bit kernels which have memory hot-add
> > > > >> support.
> > > > >
> > > > > I don't think that's a good idea. It would enable it everywhere on
> > > > > distributions which compile with hotadd. Need (2)
> > > > >
> > > > >> 2. Instead of checking the max_pfn value in pci_swiotlb_detect, check
> > > > >> for max_hotpluggable_pfn (or some such) value. Though I don't see such a
> > > > >> value readily available. I could parse the SRAT and get hotplug memory
> > > > >> information but that will make swiotlb detection logic a little too
> > > > >> complex. A quick look around srat_xx.c files and the acpi_memhotplug
> > > > >> module didn't find any useful API that could be used directly either.
> > > > >> So was wondering if any of you are aware of an easy way to get such
> > > > >> information ?
> > > > >
> > > > > I have a patchkit to revamp the SRAT parsing to store the hotadd information
> > > >
>
> Andi...ping any pointers to the patchkit.

> > > > There is a late mechanism to kick off the SWIOTLB. Perhaps the hot-add
> > > > could use swiotlb_init_late and start up the SWIOTLB?
>
> I don't see why we need to do this via late_init; swiotlb detection,
> which happens through pci_swiotlb_detect, is already late enough that
> the SRAT has been parsed. Or am I missing something ?
> > >
> > > I guess that you are talking about
> > > swiotlb_late_init_with_default_size(), which IA64 uses. However, you
> > > can use swiotlb_late_init_with_default_size() only before we
> > > initialize devices. Making it work after initializing devices is not
> > > so easy, I think (that is, we need to change dma_ops).
>
> > That is a good point. Especially if we have some outstanding DMA pages
> > allocated via dma_alloc_coherent.
> >
> > I thought that machines that support hot-add memory have their
> > own fancy IOMMU. For example the IBM x3955 (and its family) utilize the
> > Calgary IOMMU. The HP boxes utilize the Intel VT-d (or the AMD
> > equivalent).
> > So is this mostly specialized in the areas of virtualized guests? (Xen
> > PV guests with PCI passthrough suffer the same problem, btw).
>
>
> I am assuming that there were Intel based servers which supported memory
> hot-add before VT-d too. So, IMO this is not specialized to
> virtualization, though might be hard to prove if there are actual
> physical machines out there which have similar constraints (no HWIOMMU +
> MEMHOT add support)
>
> Thanks,
> Alok

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: FUJITA Tomonori on
On Tue, 20 Jul 2010 15:14:57 -0700
Alok Kataria <akataria(a)vmware.com> wrote:

> Reviving a 4 month old thread.
> I am still waiting for any clues on this question below.

Basically, you want to add hot-plug memory and enable swiotlb, right?

We can't start swiotlb reliably after a system starts.

See dma32_reserve_bootmem() and dma32_free_bootmem(). What we do is
reserve a large chunk of memory in the DMA32 zone for swiotlb and
release it if we find that we don't need swiotlb. We can't find enough
memory for swiotlb in DMA32 after a system starts.
From: Alok Kataria on

On Tue, 2010-07-20 at 21:58 -0700, FUJITA Tomonori wrote:
> On Tue, 20 Jul 2010 15:14:57 -0700
> Alok Kataria <akataria(a)vmware.com> wrote:
>
> > Reviving a 4 month old thread.
> > I am still waiting for any clues on this question below.
>
> Basically, you want to add hot-plug memory and enable swiotlb, right?

Not really, I am planning to do something like this,

@@ -52,7 +52,7 @@ int __init pci_swiotlb_detect(void)

 	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
 #ifdef CONFIG_X86_64
-	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
+	if (!no_iommu && (max_pfn > MAX_DMA32_PFN || hotplug_possible()))
 		swiotlb = 1;
 #endif
 	if (swiotlb_force)

BUT, I don't know what that hotplug_possible function will look like or
if such an interface already exists in the kernel (my search didn't turn
up any) ?

IMO, it should be possible to go read the SRAT to see if this system has
support for hotplug memory and then enable swiotlb if it does.

Sounds right ?
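
To make the idea concrete, here is a rough userspace sketch of what such a
check might compute. It is not kernel code: struct mem_range and both
function signatures are made up for illustration, and it assumes the SRAT
memory affinity entries have already been collected into an array. The
flag bits follow the ACPI definition of the memory affinity structure
(bit 0 enabled, bit 1 hot pluggable). The check only fires when a
hot-pluggable range ends above the 4G boundary, so a small box with no
such range would not pay for swiotlb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits of an SRAT memory affinity entry, per the ACPI specification. */
#define SRAT_MEM_ENABLED       (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE (1u << 1)

/* 4 GiB boundary in 4 KiB page frames (what MAX_DMA32_PFN is on x86_64). */
#define DMA32_LIMIT_PFN (1ULL << (32 - 12))

/* Hypothetical, simplified view of one SRAT memory affinity entry. */
struct mem_range {
	uint64_t base;   /* physical base address in bytes */
	uint64_t length; /* length of the range in bytes */
	uint32_t flags;  /* SRAT_MEM_* bits */
};

/* Highest page frame number that hot-add could ever make usable. */
static uint64_t max_hotpluggable_pfn(const struct mem_range *r, size_t n)
{
	const uint32_t want = SRAT_MEM_ENABLED | SRAT_MEM_HOT_PLUGGABLE;
	uint64_t max = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t end_pfn;

		/* Skip disabled, non-hot-pluggable, or empty ranges. */
		if ((r[i].flags & want) != want || r[i].length == 0)
			continue;
		end_pfn = (r[i].base + r[i].length) >> 12;
		if (end_pfn > max)
			max = end_pfn;
	}
	return max;
}

/* True if hot-added memory could end up above the 32-bit DMA limit. */
static bool hotplug_possible(const struct mem_range *r, size_t n)
{
	return max_hotpluggable_pfn(r, n) > DMA32_LIMIT_PFN;
}
```

In the kernel the ranges would have to come from the SRAT parsing code
rather than from a caller-supplied array, which is the part I have not
found an interface for.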

Thanks,
Alok

>
> We can't start swiotlb reliably after a system starts.
>
> See dma32_reserve_bootmem() and dma32_free_bootmem(). What we do is
> reserve a large chunk of memory in the DMA32 zone for swiotlb and
> release it if we find that we don't need swiotlb. We can't find enough
> memory for swiotlb in DMA32 after a system starts.

From: FUJITA Tomonori on
On Wed, 21 Jul 2010 10:13:34 -0700
Alok Kataria <akataria(a)vmware.com> wrote:

> > Basically, you want to add hot-plug memory and enable swiotlb, right?
>
> Not really, I am planning to do something like this,
>
> @@ -52,7 +52,7 @@ int __init pci_swiotlb_detect(void)
>
> /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> #ifdef CONFIG_X86_64
> - if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> + if (!no_iommu && (max_pfn > MAX_DMA32_PFN || hotplug_possible()))
> swiotlb = 1;

Always enable swiotlb with memory hotplug enabled? Wasting 64MB on an
x86_64 system with 128MB doesn't look like a good idea. I don't think
that there is an easy solution for this issue though.
From: FUJITA Tomonori on
On Thu, 22 Jul 2010 08:44:42 +0900
FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:

> On Wed, 21 Jul 2010 10:13:34 -0700
> Alok Kataria <akataria(a)vmware.com> wrote:
>
> > > Basically, you want to add hot-plug memory and enable swiotlb, right?
> >
> > Not really, I am planning to do something like this,
> >
> > @@ -52,7 +52,7 @@ int __init pci_swiotlb_detect(void)
> >
> > /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> > #ifdef CONFIG_X86_64
> > - if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > + if (!no_iommu && (max_pfn > MAX_DMA32_PFN || hotplug_possible()))
> > swiotlb = 1;
>
> Always enable swiotlb with memory hotplug enabled? Wasting 64MB on a
> x86_64 system with 128MB doesn't look to be a good idea. I don't think
> that there is an easy solution for this issue though.

btw, you need more work to enable switching on the fly.

You need to change the dma_ops pointer (see get_dma_ops()). It means
that you need to track outstanding dma operations per device, locking,
etc.
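
A toy userspace illustration of the bookkeeping that implies, not a
proposal for the real code: every type and function below (toy_dma_ops,
toy_device, toy_map, toy_unmap, toy_switch_ops) is invented for the
example. The ops pointer is swapped only when the device has no live
mappings. Locking is left out here; in a real implementation the check
and the store would have to happen under a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device DMA operations table. */
struct toy_dma_ops {
	const char *name;
};

/* Hypothetical device: the ops pointer plus a count of live mappings. */
struct toy_device {
	const struct toy_dma_ops *ops;
	unsigned int outstanding; /* mappings created but not yet unmapped */
};

/* Record that a mapping was created through the current ops. */
static void toy_map(struct toy_device *dev)
{
	assert(dev != NULL);
	dev->outstanding++;
}

/* Record that a mapping was torn down. */
static void toy_unmap(struct toy_device *dev)
{
	assert(dev != NULL);
	if (dev->outstanding)
		dev->outstanding--;
}

/*
 * Swap the ops table only when nothing is mapped; otherwise a buffer
 * mapped through the old ops would later be unmapped through the new ones.
 * Returns true if the switch happened.
 */
static bool toy_switch_ops(struct toy_device *dev,
			   const struct toy_dma_ops *new_ops)
{
	assert(dev != NULL && new_ops != NULL);
	if (dev->outstanding != 0)
		return false;
	dev->ops = new_ops;
	return true;
}
```

Long-lived coherent allocations would keep the count above zero for the
lifetime of the driver, so the switch could be refused indefinitely.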