From: Jeremy Fitzhardinge on 18 Jun 2010 07:10

On 06/18/2010 04:22 AM, Kenji Kaneshige wrote:
> The current x86 ioremap() doesn't handle physical addresses higher
> than 32 bits properly in X86_32 PAE mode. When a physical address
> higher than 32 bits is passed to ioremap(), the upper 32 bits of the
> physical address are wrongly cleared. Due to this bug, ioremap() can
> map the wrong address into the linear address space.
>
> In my case, a 64-bit MMIO region was assigned to a PCI device (an
> ioat device) on my system. Because of the ioremap() bug, the wrong
> physical address (instead of the MMIO region) was mapped into the
> linear address space. Because of this, loading the ioatdma driver
> caused unexpected behavior (kernel panic, kernel hang, ...).
>
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji(a)jp.fujitsu.com>
>
> ---
>  arch/x86/mm/ioremap.c   |   12 +++++-------
>  include/linux/io.h      |    4 ++--
>  include/linux/vmalloc.h |    2 +-
>  lib/ioremap.c           |   10 +++++-----
>  mm/vmalloc.c            |    2 +-
>  5 files changed, 14 insertions(+), 16 deletions(-)
>
> Index: linux-2.6.34/arch/x86/mm/ioremap.c
> ===================================================================
> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c
> +++ linux-2.6.34/arch/x86/mm/ioremap.c
> @@ -62,8 +62,8 @@ int ioremap_change_attr(unsigned long va
>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>  		unsigned long size, unsigned long prot_val, void *caller)
>  {
> -	unsigned long pfn, offset, vaddr;
> -	resource_size_t last_addr;
> +	unsigned long offset, vaddr;
> +	resource_size_t pfn, last_pfn, last_addr;

Why is pfn resource_size_t here? Is it to avoid casting, or does it
actually need to hold more than 32 bits? I don't see any use of pfn
aside from the page_is_ram loop, and I don't think that can go beyond
32 bits. If you're worried about boundary conditions at the 2^44
limit, then you can make last_pfn inclusive, or compute num_pages and
use that for the loop condition.

>  	const resource_size_t unaligned_phys_addr = phys_addr;
>  	const unsigned long unaligned_size = size;
>  	struct vm_struct *area;
> @@ -100,10 +100,8 @@ static void __iomem *__ioremap_caller(re
>  	/*
>  	 * Don't allow anybody to remap normal RAM that we're using..
>  	 */
> -	for (pfn = phys_addr >> PAGE_SHIFT;
> -	     (pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
> -	     pfn++) {
> -
> +	last_pfn = last_addr >> PAGE_SHIFT;

If last_addr can be non-page-aligned, should it be rounded up to the
next pfn rather than rounded down? Ah, looks like you fix it in the
second patch.

    J
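The truncation behind this bug, and the inclusive-last_pfn loop Jeremy
suggests, can be sketched in a few lines of stand-alone C. This is a
user-space model rather than the kernel code: uint32_t stands in for
x86_32's 32-bit unsigned long, PAGE_SHIFT is hard-coded to 12, and the
MMIO address is invented.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12

    int main(void)
    {
        /* A physical address above 4GB, like the ioat BAR in Kenji's
         * report (the exact value is made up). */
        uint64_t phys_addr = 0x380000000ULL;

        /* With a 32-bit type, the pfn itself still fits (0x380000),
         * but shifting it back into an address truncates to the low
         * 32 bits -- the "upper 32 bits are cleared" symptom. */
        uint32_t pfn32  = (uint32_t)(phys_addr >> PAGE_SHIFT);
        uint32_t addr32 = (uint32_t)((uint64_t)pfn32 << PAGE_SHIFT);
        printf("%#" PRIx64 " -> pfn %#" PRIx32 " -> %#" PRIx32 "\n",
               phys_addr, pfn32, addr32);   /* ... -> 0x80000000 */

        /* An inclusive last_pfn, as Jeremy suggests: the loop never
         * shifts a pfn back into an address, so nothing overflows
         * even at the very top of the physical address space. */
        uint64_t last_addr = phys_addr + (1ULL << 20) - 1;  /* 1MB */
        uint64_t last_pfn  = last_addr >> PAGE_SHIFT;
        uint64_t pfn, npages = 0;

        for (pfn = phys_addr >> PAGE_SHIFT; pfn <= last_pfn; pfn++)
            npages++;       /* stands in for the page_is_ram() check */

        printf("pages checked: %" PRIu64 "\n", npages);     /* 256 */
        return 0;
    }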
From: Kenji Kaneshige on 20 Jun 2010 21:50

(2010/06/18 20:07), Jeremy Fitzhardinge wrote:
> On 06/18/2010 04:22 AM, Kenji Kaneshige wrote:
>> The current x86 ioremap() doesn't handle physical addresses higher
>> than 32 bits properly in X86_32 PAE mode. When a physical address
>> higher than 32 bits is passed to ioremap(), the upper 32 bits of the
>> physical address are wrongly cleared. Due to this bug, ioremap() can
>> map the wrong address into the linear address space.
>>
>> In my case, a 64-bit MMIO region was assigned to a PCI device (an
>> ioat device) on my system. Because of the ioremap() bug, the wrong
>> physical address (instead of the MMIO region) was mapped into the
>> linear address space. Because of this, loading the ioatdma driver
>> caused unexpected behavior (kernel panic, kernel hang, ...).
>>
>> Signed-off-by: Kenji Kaneshige <kaneshige.kenji(a)jp.fujitsu.com>
>>
>> ---
>>  arch/x86/mm/ioremap.c   |   12 +++++-------
>>  include/linux/io.h      |    4 ++--
>>  include/linux/vmalloc.h |    2 +-
>>  lib/ioremap.c           |   10 +++++-----
>>  mm/vmalloc.c            |    2 +-
>>  5 files changed, 14 insertions(+), 16 deletions(-)
>>
>> Index: linux-2.6.34/arch/x86/mm/ioremap.c
>> ===================================================================
>> --- linux-2.6.34.orig/arch/x86/mm/ioremap.c
>> +++ linux-2.6.34/arch/x86/mm/ioremap.c
>> @@ -62,8 +62,8 @@ int ioremap_change_attr(unsigned long va
>>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>  		unsigned long size, unsigned long prot_val, void *caller)
>>  {
>> -	unsigned long pfn, offset, vaddr;
>> -	resource_size_t last_addr;
>> +	unsigned long offset, vaddr;
>> +	resource_size_t pfn, last_pfn, last_addr;
>
> Why is pfn resource_size_t here? Is it to avoid casting, or does it
> actually need to hold more than 32 bits? I don't see any use of pfn
> aside from the page_is_ram loop, and I don't think that can go beyond
> 32 bits. If you're worried about boundary conditions at the 2^44
> limit, then you can make last_pfn inclusive, or compute num_pages and
> use that for the loop condition.

The reason I changed it here was that phys_addr might be higher than
2^44. After the discussion, I realized there would probably be many
other places in the code that cannot handle a pfn wider than 32 bits,
and this would cause problems even if I changed ioremap() to handle
such pfns. So I decided to focus on making 44-bit physical addresses
work properly this time. But I didn't find any reason to change pfn
back to unsigned long, so I still make it resource_size_t even in v3.
Is there any problem with this change?

And I don't understand why the pfn can't go beyond 32 bits. Could you
tell me why?

>>  	const resource_size_t unaligned_phys_addr = phys_addr;
>>  	const unsigned long unaligned_size = size;
>>  	struct vm_struct *area;
>> @@ -100,10 +100,8 @@ static void __iomem *__ioremap_caller(re
>>  	/*
>>  	 * Don't allow anybody to remap normal RAM that we're using..
>>  	 */
>> -	for (pfn = phys_addr >> PAGE_SHIFT;
>> -	     (pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK);
>> -	     pfn++) {
>> -
>> +	last_pfn = last_addr >> PAGE_SHIFT;
>
> If last_addr can be non-page-aligned, should it be rounded up to the
> next pfn rather than rounded down? Ah, looks like you fix it in the
> second patch.

Yes, I fixed it in [PATCH 2/2].

Thanks,
Kenji Kaneshige
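The rounding question resolves differently depending on whether the
end bound is inclusive or exclusive. In the sketch below, PFN_DOWN and
PFN_UP mirror the helpers of the same names in include/linux/pfn.h;
the addresses are invented, and this shows only the bookkeeping behind
Jeremy's question, not the actual [PATCH 2/2].

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT  12
    #define PAGE_SIZE   (1ULL << PAGE_SHIFT)
    #define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
    #define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

    int main(void)
    {
        uint64_t phys_addr = 0x380000100ULL; /* deliberately unaligned */
        uint64_t size      = 0x2000;         /* two pages' worth of bytes */

        /* last_addr = phys_addr + size - 1 is an inclusive bound: it
         * is the last byte of the region, so rounding DOWN yields the
         * pfn of the last page, and the loop must use "<=". */
        uint64_t last_addr = phys_addr + size - 1;
        uint64_t first_pfn = PFN_DOWN(phys_addr);    /* 0x380000 */
        uint64_t last_pfn  = PFN_DOWN(last_addr);    /* 0x380002 */

        /* An exclusive one-past-the-end bound must round UP instead,
         * and the loop uses "<". Both cover the same three pages. */
        uint64_t end_pfn = PFN_UP(phys_addr + size); /* 0x380003 */

        printf("inclusive: %llu pages\n",
               (unsigned long long)(last_pfn - first_pfn + 1));
        printf("exclusive: %llu pages\n",
               (unsigned long long)(end_pfn - first_pfn));
        return 0;
    }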
From: Simon Horman on 9 Jul 2010 00:30

On Thu, Jun 17, 2010 at 10:35:19AM +0100, Jeremy Fitzhardinge wrote:
> On 06/17/2010 07:03 AM, H. Peter Anvin wrote:
> > On 06/16/2010 09:55 PM, Kenji Kaneshige wrote:

[snip]

> >> greater value when the 44-bit physical address limit is
> >> eliminated. And maybe we need to change phys_addr_valid() to
> >> return an error if the physical address is above
> >> (1 << __PHYSICAL_MASK_SHIFT)?
> >
> > The real question is how much we can fix without an unreasonable
> > cost.
>
> I think it would be a pretty large change. From Xen's perspective,
> any machine even approximately approaching the 2^44 limit will be
> capable of running Xen guests in HVM mode, so PV isn't really a
> concern.

Hi Jeremy,

Is the implication of that statement that HVM is preferred where
supported by HW?
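The phys_addr_valid() change floated in the quoted text amounts to a
range check against the physical address mask. A rough sketch of that
idea, assuming the 44-bit PAE limit under discussion; this is
illustrative, not the actual kernel implementation:

    #include <stdint.h>
    #include <stdio.h>

    #define __PHYSICAL_MASK_SHIFT 44  /* the PAE limit under discussion */

    /* Hypothetical check: reject any physical address the page tables
     * cannot represent, rather than silently truncating it later. The
     * shift form sidesteps overflowing a "1 << 44" constant. */
    static int phys_addr_valid(uint64_t phys_addr)
    {
        return (phys_addr >> __PHYSICAL_MASK_SHIFT) == 0;
    }

    int main(void)
    {
        printf("%d\n", phys_addr_valid(0x380000000ULL));    /* 1: ok */
        printf("%d\n", phys_addr_valid(0x100000000000ULL)); /* 0: 2^44 */
        return 0;
    }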
From: Jeremy Fitzhardinge on 9 Jul 2010 01:40

On 07/08/2010 09:24 PM, Simon Horman wrote:
>> I think it would be a pretty large change. From Xen's perspective,
>> any machine even approximately approaching the 2^44 limit will be
>> capable of running Xen guests in HVM mode, so PV isn't really a
>> concern.
>
> Hi Jeremy,
>
> Is the implication of that statement that HVM is preferred where
> supported by HW?

I wouldn't go that far; the PV vs HVM choice is pretty complex, and
depends on what your workload is and what hardware you have available.
All I meant was what I said: that if you're running on a machine with
a large amount of memory, then you should run your 32-bit domains as
HVM rather than PV. Though Xen could easily keep domains limited to
memory they can actually use (it already does this, in fact).

    J
From: Simon Horman on 9 Jul 2010 02:20
On Thu, Jul 08, 2010 at 10:33:08PM -0700, Jeremy Fitzhardinge wrote:
> On 07/08/2010 09:24 PM, Simon Horman wrote:
> >> I think it would be a pretty large change. From Xen's perspective,
> >> any machine even approximately approaching the 2^44 limit will be
> >> capable of running Xen guests in HVM mode, so PV isn't really a
> >> concern.
> >
> > Hi Jeremy,
> >
> > Is the implication of that statement that HVM is preferred where
> > supported by HW?
>
> I wouldn't go that far; the PV vs HVM choice is pretty complex, and
> depends on what your workload is and what hardware you have
> available. All I meant was what I said: that if you're running on a
> machine with a large amount of memory, then you should run your
> 32-bit domains as HVM rather than PV. Though Xen could easily keep
> domains limited to memory they can actually use (it already does
> this, in fact).

Hi Jeremy,

thanks for the clarification.