From: H. Peter Anvin on
On 07/27/2010 10:00 AM, Konrad Rzeszutek Wilk wrote:
> It is paramount that we call pci_xen_swiotlb_detect before
> pci_swiotlb_detect as both implementations use the 'swiotlb'
> and 'swiotlb_force' flags. The pci-xen_swiotlb_detect inhibits
> the swiotlb_force and swiotlb flag so that the native SWIOTLB
> implementation is not enabled when running under Xen.
>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge(a)citrix.com>
> Cc: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
> Cc: Albert Herranz <albert_herranz(a)yahoo.es>
> Cc: Ian Campbell <Ian.Campbell(a)citrix.com>
> Cc: Thomas Gleixner <tglx(a)linutronix.de>
> Cc: "H. Peter Anvin" <hpa(a)zytor.com>
> Cc: x86(a)kernel.org
> Cc: Jesse Barnes <jbarnes(a)virtuousgeek.org>
> ---
> arch/x86/kernel/pci-dma.c | 7 +++++--
> 1 files changed, 5 insertions(+), 2 deletions(-)
>

Is there any way we can abstract this out a bit more instead of crapping
on generic code?

-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Konrad Rzeszutek Wilk on
On Tue, Jul 27, 2010 at 12:03:56PM -0700, H. Peter Anvin wrote:
> On 07/27/2010 10:00 AM, Konrad Rzeszutek Wilk wrote:
> > It is paramount that we call pci_xen_swiotlb_detect before
> > pci_swiotlb_detect as both implementations use the 'swiotlb'
> > and 'swiotlb_force' flags. The pci-xen_swiotlb_detect inhibits
> > the swiotlb_force and swiotlb flag so that the native SWIOTLB
> > implementation is not enabled when running under Xen.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
> > Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge(a)citrix.com>
> > Cc: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
> > Cc: Albert Herranz <albert_herranz(a)yahoo.es>
> > Cc: Ian Campbell <Ian.Campbell(a)citrix.com>
> > Cc: Thomas Gleixner <tglx(a)linutronix.de>
> > Cc: "H. Peter Anvin" <hpa(a)zytor.com>
> > Cc: x86(a)kernel.org
> > Cc: Jesse Barnes <jbarnes(a)virtuousgeek.org>
> > ---
> > arch/x86/kernel/pci-dma.c | 7 +++++--
> > 1 files changed, 5 insertions(+), 2 deletions(-)
> >
>
> Is there any way we can abstract this out a bit more instead of crapping
> on generic code?

I was toying with something like this:

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 9f07cfc..e0cd388 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -45,6 +45,25 @@ int iommu_detected __read_mostly = 0;
*/
int iommu_pass_through __read_mostly;

+initcall_t __swiotlb_initcall_detect[] =
+ {pci_xen_swiotlb_detect,
+ pci_swiotlb_detect,
+ NULL};
+
+initcall_t __swiotlb_initcall_init[] = {
+ pci_xen_swiotlb_init,
+ pci_swiotlb_init,
+ NULL};
+
+
+initcall_t __iommu_initcall_detect[] = {
+ gart_iommu_hole_init,
+ detect_calgary,
+ detect_intel_iommu,
+ /* needs to be called after gart_iommu_hole_init */
+ amd_iommu_detect,
+ NULL};
+
/* Dummy device used for NULL arguments (normally ISA). */
struct device x86_dma_fallback_dev = {
.init_name = "fallback device",
@@ -130,24 +149,22 @@ static void __init dma32_free_bootmem(void)

void __init pci_iommu_alloc(void)
{
+ initcall_t *fn;
+
/* free the range so iommu could get some range less than 4G */
dma32_free_bootmem();

- if (pci_xen_swiotlb_detect() || pci_swiotlb_detect())
- goto out;
-
- gart_iommu_hole_init();
-
- detect_calgary();
+ /* First do the SWIOTLB - if they work, skip the IOMMUs. */
+ for (fn = __swiotlb_initcall_detect; fn != NULL; fn++)
+ if ((*fn)())
+ goto swiotlb_init;

- detect_intel_iommu();
-
- /* needs to be called after gart_iommu_hole_init */
- amd_iommu_detect();
-out:
- pci_xen_swiotlb_init();
+ for (fn = __iommu_initcall_detect; fn != NULL; fn++)
+ (*fn)();

- pci_swiotlb_init();
+swiotlb_init:
+ for (fn = __swiotlb_initcall_init; fn != NULL; fn++)
+ (*fn)();
}

void *dma_generic_alloc_coherent(struct device *dev, size_t size,


(compiles with warnings and has not yet been completly flushed), but
Fujita mentioned that it might the right choice to use this
when we have more than just these two swiotlb-implentors.


>
> -hpa
> --
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel. I don't speak on their behalf.
>
> _______________________________________________
> iommu mailing list
> iommu(a)lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/iommu
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: FUJITA Tomonori on
On Tue, 27 Jul 2010 15:41:05 -0400
Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com> wrote:

> On Tue, Jul 27, 2010 at 12:03:56PM -0700, H. Peter Anvin wrote:
> > On 07/27/2010 10:00 AM, Konrad Rzeszutek Wilk wrote:
> > > It is paramount that we call pci_xen_swiotlb_detect before
> > > pci_swiotlb_detect as both implementations use the 'swiotlb'
> > > and 'swiotlb_force' flags. The pci-xen_swiotlb_detect inhibits
> > > the swiotlb_force and swiotlb flag so that the native SWIOTLB
> > > implementation is not enabled when running under Xen.
> > >
> > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
> > > Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge(a)citrix.com>
> > > Cc: FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp>
> > > Cc: Albert Herranz <albert_herranz(a)yahoo.es>
> > > Cc: Ian Campbell <Ian.Campbell(a)citrix.com>
> > > Cc: Thomas Gleixner <tglx(a)linutronix.de>
> > > Cc: "H. Peter Anvin" <hpa(a)zytor.com>
> > > Cc: x86(a)kernel.org
> > > Cc: Jesse Barnes <jbarnes(a)virtuousgeek.org>
> > > ---
> > > arch/x86/kernel/pci-dma.c | 7 +++++--
> > > 1 files changed, 5 insertions(+), 2 deletions(-)
> > >
> >
> > Is there any way we can abstract this out a bit more instead of crapping
> > on generic code?

I don't like this change much too, however I think that this is the
most simple and straightforward.

Basically, Xen's swiotlb works like a new IOMMU implementation so we
need to initialize it like other IOMMU implementations (call the
detect and init functions in order).


> I was toying with something like this:
>
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index 9f07cfc..e0cd388 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -45,6 +45,25 @@ int iommu_detected __read_mostly = 0;
> */
> int iommu_pass_through __read_mostly;
>
> +initcall_t __swiotlb_initcall_detect[] =
> + {pci_xen_swiotlb_detect,
> + pci_swiotlb_detect,
> + NULL};
> +
> +initcall_t __swiotlb_initcall_init[] = {
> + pci_xen_swiotlb_init,
> + pci_swiotlb_init,
> + NULL};
> +
> +
> +initcall_t __iommu_initcall_detect[] = {
> + gart_iommu_hole_init,
> + detect_calgary,
> + detect_intel_iommu,
> + /* needs to be called after gart_iommu_hole_init */
> + amd_iommu_detect,
> + NULL};

I really don't think that this makes the code better. I prefer the
current simple (dumb) code.


> /* Dummy device used for NULL arguments (normally ISA). */
> struct device x86_dma_fallback_dev = {
> .init_name = "fallback device",
> @@ -130,24 +149,22 @@ static void __init dma32_free_bootmem(void)
>
> void __init pci_iommu_alloc(void)
> {
> + initcall_t *fn;
> +
> /* free the range so iommu could get some range less than 4G */
> dma32_free_bootmem();
>
> - if (pci_xen_swiotlb_detect() || pci_swiotlb_detect())
> - goto out;
> -
> - gart_iommu_hole_init();
> -
> - detect_calgary();
> + /* First do the SWIOTLB - if they work, skip the IOMMUs. */

btw, this comment is wrong. We check if we are forced to use SWIOTLB
by kernel command line here.

Even if SWIOTLB works, we see if hardware IOMMU is available. SWIOTLB
is a last resort. We prefer hardware IOMMU.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 07/27/2010 04:36 PM, FUJITA Tomonori wrote:
>>>
>>> Is there any way we can abstract this out a bit more instead of crapping
>>> on generic code?
>
> I don't like this change much too, however I think that this is the
> most simple and straightforward.
>
> Basically, Xen's swiotlb works like a new IOMMU implementation so we
> need to initialize it like other IOMMU implementations (call the
> detect and init functions in order).
>

Even mentioning "xen" in generic code should be considered a bug. I
think we *do* need to driverize the iommu stuff, and yes, Xen's swiotlb
should just be handled like one in the list.

>
>> I was toying with something like this:
>>
>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index 9f07cfc..e0cd388 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -45,6 +45,25 @@ int iommu_detected __read_mostly = 0;
>> */
>> int iommu_pass_through __read_mostly;
>>
>> +initcall_t __swiotlb_initcall_detect[] =
>> + {pci_xen_swiotlb_detect,
>> + pci_swiotlb_detect,
>> + NULL};
>> +
>> +initcall_t __swiotlb_initcall_init[] = {
>> + pci_xen_swiotlb_init,
>> + pci_swiotlb_init,
>> + NULL};
>> +
>> +
>> +initcall_t __iommu_initcall_detect[] = {
>> + gart_iommu_hole_init,
>> + detect_calgary,
>> + detect_intel_iommu,
>> + /* needs to be called after gart_iommu_hole_init */
>> + amd_iommu_detect,
>> + NULL};
>
> I really don't think that this makes the code better. I prefer the
> current simple (dumb) code.
>

The special handling of swiotlb here really looks wrong, but otherwise I
think it's the right idea.

> btw, this comment is wrong. We check if we are forced to use SWIOTLB
> by kernel command line here.
>
> Even if SWIOTLB works, we see if hardware IOMMU is available. SWIOTLB
> is a last resort. We prefer hardware IOMMU.

Any reason to not just handle swiotlb like any of the other iommus, at
the bottom of the list?

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: FUJITA Tomonori on
On Tue, 27 Jul 2010 17:19:41 -0700
"H. Peter Anvin" <hpa(a)zytor.com> wrote:

> On 07/27/2010 04:36 PM, FUJITA Tomonori wrote:
> >>>
> >>> Is there any way we can abstract this out a bit more instead of crapping
> >>> on generic code?
> >
> > I don't like this change much too, however I think that this is the
> > most simple and straightforward.
> >
> > Basically, Xen's swiotlb works like a new IOMMU implementation so we
> > need to initialize it like other IOMMU implementations (call the
> > detect and init functions in order).
> >
>
> Even mentioning "xen" in generic code should be considered a bug. I
> think we *do* need to driverize the iommu stuff, and yes, Xen's swiotlb
> should just be handled like one in the list.
>
> >
> >> I was toying with something like this:
> >>
> >> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> >> index 9f07cfc..e0cd388 100644
> >> --- a/arch/x86/kernel/pci-dma.c
> >> +++ b/arch/x86/kernel/pci-dma.c
> >> @@ -45,6 +45,25 @@ int iommu_detected __read_mostly = 0;
> >> */
> >> int iommu_pass_through __read_mostly;
> >>
> >> +initcall_t __swiotlb_initcall_detect[] =
> >> + {pci_xen_swiotlb_detect,
> >> + pci_swiotlb_detect,
> >> + NULL};
> >> +
> >> +initcall_t __swiotlb_initcall_init[] = {
> >> + pci_xen_swiotlb_init,
> >> + pci_swiotlb_init,
> >> + NULL};
> >> +
> >> +
> >> +initcall_t __iommu_initcall_detect[] = {
> >> + gart_iommu_hole_init,
> >> + detect_calgary,
> >> + detect_intel_iommu,
> >> + /* needs to be called after gart_iommu_hole_init */
> >> + amd_iommu_detect,
> >> + NULL};
> >
> > I really don't think that this makes the code better. I prefer the
> > current simple (dumb) code.
> >
>
> The special handling of swiotlb here really looks wrong, but otherwise I
> think it's the right idea.
>
> > btw, this comment is wrong. We check if we are forced to use SWIOTLB
> > by kernel command line here.
> >
> > Even if SWIOTLB works, we see if hardware IOMMU is available. SWIOTLB
> > is a last resort. We prefer hardware IOMMU.
>
> Any reason to not just handle swiotlb like any of the other iommus, at
> the bottom of the list?

we need to check if swiotlb usage is forced by the command line since:

- we skip hardware IOMMU initialization if so.


We also need swiotlb initialization after all hardware IOMMU
initialization since:

- if all hardware IOMMU initialization fails, we might need to
initialize swiotlb.

- even if hardware IOMMU initialization is successful, we might need
to initialize swiotlb (even if a system has hardware IOMMU, some
devices are not connected to hardware IOMMU).

- swiotlb initialization must be after GART initialization. We reserve
some DMA32 memory for broken bios with GART. The order must be freeing
the memory, initializing GART, then initializing swiotlb.
Initializing swiotlb before GART steals the reserved memory. It breaks
GART.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/