From: Jeremy Fitzhardinge on
On 06/18/2010 04:22 AM, Kenji Kaneshige wrote:
> The current x86 ioremap() doesn't handle physical addresses higher than
> 32 bits properly in X86_32 PAE mode. When a physical address higher than
> 32 bits is passed to ioremap(), the upper 32 bits of the physical address
> are wrongly cleared. Due to this bug, ioremap() can map the wrong address
> into the linear address space.
>
> In my case, a 64-bit MMIO region was assigned to a PCI device (ioat
> device) on my system. Because of this ioremap() bug, the wrong physical
> address (instead of the MMIO region) was mapped into the linear address
> space. Because of this, loading the ioatdma driver caused unexpected
> behavior (kernel panic, kernel hangup, ...).
>
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>
> ---
> arch/x86/mm/ioremap.c   |   12 +++++-------
> include/linux/io.h      |    4 ++--
> include/linux/vmalloc.h |    2 +-
> lib/ioremap.c           |   10 +++++-----
> mm/vmalloc.c            |    2 +-
> 5 files changed, 14 insertions(+), 16 deletions(-)
>
> Index: linux-2.6.34/arch/x86/mm/ioremap.c
> ===================================================================
> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c
> +++ linux-2.6.34/arch/x86/mm/ioremap.c
> @@ -62,8 +62,8 @@ int ioremap_change_attr(unsigned long va
> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> unsigned long size, unsigned long prot_val, void *caller)
> {
> - unsigned long pfn, offset, vaddr;
> - resource_size_t last_addr;
> + unsigned long offset, vaddr;
> + resource_size_t pfn, last_pfn, last_addr;
>

Why is pfn resource_size_t here? Is it to avoid casting, or does it
actually need to hold more than 32 bits? I don't see any use of pfn
aside from the page_is_ram loop, and I don't think that can go beyond 32
bits. If you're worried about boundary conditions at the 2^44 limit,
then you can make last_pfn inclusive, or compute num_pages and use that
for the loop condition.
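[Editorial sketch, not part of the original mail: one way to phrase the
num_pages variant suggested above, assuming last_addr = phys_addr + size - 1
as computed earlier in __ioremap_caller(). Illustrative only, not the
submitted patch.]

/*
 * With num_pages driving the loop, the iterator never has to hold a
 * value wider than unsigned long, even for physical addresses
 * approaching the 2^44 limit (pfn < 2^32 when PAGE_SHIFT == 12).
 */
unsigned long first_pfn = phys_addr >> PAGE_SHIFT;
unsigned long num_pages = (last_addr >> PAGE_SHIFT) - first_pfn + 1;
unsigned long i;

for (i = 0; i < num_pages; i++) {
	if (page_is_ram(first_pfn + i))
		return NULL;	/* illustrative: refuse to remap normal RAM */
}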

> const resource_size_t unaligned_phys_addr = phys_addr;
> const unsigned long unaligned_size = size;
> struct vm_struct *area;
> @@ -100,10 +100,8 @@ static void __iomem *__ioremap_caller(re
> /*
> * Don't allow anybody to remap normal RAM that we're using..
> */
> - for (pfn = phys_addr >> PAGE_SHIFT;
> - (pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
> - pfn++) {
> -
> + last_pfn = last_addr >> PAGE_SHIFT;
>

If last_addr can be non-page aligned, should it be rounding up to the
next pfn rather than rounding down? Ah, looks like you fix it in the
second patch.
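[Editorial sketch, not the actual [PATCH 2/2]: the two bound conventions
being discussed here, again assuming last_addr = phys_addr + size - 1.]

/* Inclusive bound: last_addr already points into the final page of the
 * region, so rounding down is enough and the RAM check runs with
 * "pfn <= last_pfn". */
last_pfn = last_addr >> PAGE_SHIFT;

/* Exclusive bound: round up past the end of the region instead and run
 * the RAM check with "pfn < last_pfn", so a partially covered final
 * page is still inspected. */
last_pfn = PFN_UP(last_addr + 1);	/* == (last_addr + PAGE_SIZE) >> PAGE_SHIFT */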

J
From: Kenji Kaneshige on
(2010/06/18 20:07), Jeremy Fitzhardinge wrote:
> On 06/18/2010 04:22 AM, Kenji Kaneshige wrote:
>> The current x86 ioremap() doesn't handle physical addresses higher than
>> 32 bits properly in X86_32 PAE mode. When a physical address higher than
>> 32 bits is passed to ioremap(), the upper 32 bits of the physical address
>> are wrongly cleared. Due to this bug, ioremap() can map the wrong address
>> into the linear address space.
>>
>> In my case, a 64-bit MMIO region was assigned to a PCI device (ioat
>> device) on my system. Because of this ioremap() bug, the wrong physical
>> address (instead of the MMIO region) was mapped into the linear address
>> space. Because of this, loading the ioatdma driver caused unexpected
>> behavior (kernel panic, kernel hangup, ...).
>>
>> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>
>> ---
>> arch/x86/mm/ioremap.c   |   12 +++++-------
>> include/linux/io.h      |    4 ++--
>> include/linux/vmalloc.h |    2 +-
>> lib/ioremap.c           |   10 +++++-----
>> mm/vmalloc.c            |    2 +-
>> 5 files changed, 14 insertions(+), 16 deletions(-)
>>
>> Index: linux-2.6.34/arch/x86/mm/ioremap.c
>> ===================================================================
>> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c
>> +++ linux-2.6.34/arch/x86/mm/ioremap.c
>> @@ -62,8 +62,8 @@ int ioremap_change_attr(unsigned long va
>> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>> unsigned long size, unsigned long prot_val, void *caller)
>> {
>> - unsigned long pfn, offset, vaddr;
>> - resource_size_t last_addr;
>> + unsigned long offset, vaddr;
>> + resource_size_t pfn, last_pfn, last_addr;
>>
>
> Why is pfn resource_size_t here? Is it to avoid casting, or does it
> actually need to hold more than 32 bits? I don't see any use of pfn
> aside from the page_is_ram loop, and I don't think that can go beyond 32
> bits. If you're worried about boundary conditions at the 2^44 limit,
> then you can make last_pfn inclusive, or compute num_pages and use that
> for the loop condition.
>

The reason I changed this was that phys_addr might be higher than 2^44.
After the discussion, I realized there are probably many other places in
the code that cannot handle a PFN wider than 32 bits, and those would
cause problems even if I changed ioremap() itself to handle such PFNs.
So I decided to focus on making 44-bit physical addresses work properly
this time. But I didn't find any reason to change it back to unsigned
long, so it is still resource_size_t in v3. Is there any problem with
this change? Also, I don't understand why the pfn can't go beyond 32
bits. Could you tell me why?
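[Editorial illustration, not part of the original mail: with 4 KiB pages
(PAGE_SHIFT == 12), a pfn that fits in a 32-bit unsigned long can only
describe physical addresses below 2^32 * 2^12 = 2^44, which is the
__PHYSICAL_MASK_SHIFT == 44 limit discussed in this thread. That is the
sense in which a 32-bit pfn "can't go beyond" 2^44, and why addresses
above it would need a wider pfn type throughout the kernel.]

/* Standalone userspace illustration of the 2^44 limit (4 KiB pages). */
#include <stdio.h>

#define PAGE_SHIFT	12

int main(void)
{
	unsigned long long max_pfn32 = 0xffffffffULL;	/* largest 32-bit pfn */
	unsigned long long max_pa =
		(max_pfn32 << PAGE_SHIFT) | ((1ULL << PAGE_SHIFT) - 1);

	/* prints 0xfffffffffff, i.e. 2^44 - 1 */
	printf("highest addressable byte: 0x%llx\n", max_pa);
	return 0;
}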


>> const resource_size_t unaligned_phys_addr = phys_addr;
>> const unsigned long unaligned_size = size;
>> struct vm_struct *area;
>> @@ -100,10 +100,8 @@ static void __iomem *__ioremap_caller(re
>> /*
>> * Don't allow anybody to remap normal RAM that we're using..
>> */
>> - for (pfn = phys_addr >> PAGE_SHIFT;
>> - (pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
>> - pfn++) {
>> -
>> + last_pfn = last_addr >> PAGE_SHIFT;
>>
>
> If last_addr can be non-page aligned, should it be rounding up to the
> next pfn rather than rounding down? Ah, looks like you fix it in the
> second patch.
>

Yes, I fixed it in [PATCH 2/2].

Thanks,
Kenji Kaneshige


From: Simon Horman on
On Thu, Jun 17, 2010 at 10:35:19AM +0100, Jeremy Fitzhardinge wrote:
> On 06/17/2010 07:03 AM, H. Peter Anvin wrote:
> > On 06/16/2010 09:55 PM, Kenji Kaneshige wrote:

[snip]

> >> greater value when the 44-bit physical address limit is eliminated. And
> >> maybe we need to change phys_addr_valid() to return an error if the
> >> physical address is above (1 << __PHYSICAL_MASK_SHIFT)?
> >>
> > The real question is how much we can fix without an unreasonable cost.
> >
>
> I think it would be a pretty large change. From Xen's perspective,
> any machine even approximately approaching the 2^44 limit will be
> capable of running Xen guests in HVM mode, so PV isn't really a concern.
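[Editorial sketch regarding the phys_addr_valid() suggestion quoted above:
illustrative only; the actual helper and its callers in the kernel may
differ. The idea is simply to reject any physical address that the page
tables cannot express.]

static inline int phys_addr_valid(resource_size_t addr)
{
	/* __PHYSICAL_MASK_SHIFT is 44 on x86_32 with PAE */
	return !(addr >> __PHYSICAL_MASK_SHIFT);
}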

Hi Jeremy,

Is the implication of that statement that HVM is preferred where
supported by HW?
From: Jeremy Fitzhardinge on
On 07/08/2010 09:24 PM, Simon Horman wrote:
>> I think it would be a pretty large change. From Xen's perspective,
>> any machine even approximately approaching the 2^44 limit will be
>> capable of running Xen guests in HVM mode, so PV isn't really a concern.
>>
> Hi Jeremy,
>
> Is the implication of that statement that HVM is preferred where
> supported by HW?
>

I wouldn't go that far; the PV vs HVM choice is pretty complex, and
depends on what your workload is and what hardware you have available.
All I meant was what I said: that if you're running on a machine with a
large amount of memory, then you should run your 32-bit domains as HVM
rather than PV. Though Xen could easily keep domains limited to memory
that they can actually use (it already does this, in fact).

J
From: Simon Horman on
On Thu, Jul 08, 2010 at 10:33:08PM -0700, Jeremy Fitzhardinge wrote:
> On 07/08/2010 09:24 PM, Simon Horman wrote:
> >> I think it would be a pretty large change. From Xen's perspective,
> >> any machine even approximately approaching the 2^44 limit will be
> >> capable of running Xen guests in HVM mode, so PV isn't really a concern.
> >>
> > Hi Jeremy,
> >
> > Is the implication of that statement that HVM is preferred where
> > supported by HW?
> >
>
> I wouldn't go that far; the PV vs HVM choice is pretty complex, and
> depends on what your workload is and what hardware you have available.
> All I meant was what I said: that if you're running on a machine with a
> large amount of memory, then you should run your 32-bit domains as HVM
> rather than PV. Though Xen could easily keep domains limited to memory
> that they can actually use (it already does this, in fact).

Hi Jeremy,

thanks for the clarification.