From: Hugh Dickins on
On Thu, 29 Jul 2010, dann frazier wrote:
> On Wed, Jul 28, 2010 at 08:50:18PM -0700, Hugh Dickins wrote:
> >
> > Let's note that gdb's gcore is building up its own version of a
> > coredump, not going through the get_dump_page() code I was wondering
> > about. If I read gcore correctly (possibly not!), it will be reading
> > selected areas from /proc/<pid>/mem i.e. using access_process_vm().
>
> This appears to be correct. I was able to collect the following
> stacktrace using INIT:
>
> [ 2535.074197] Backtrace of pid 4605 (gdb)
> [ 2535.074197]
> [ 2535.074197] Call Trace:
> [ 2535.074197] [<a00000010000bb00>] ia64_native_leave_kernel+0x0/0x270
> [ 2535.074197] sp=e000004081c77c40 bsp=e000004081c71018
> [ 2535.074197] [<a000000100334720>] __copy_user+0x160/0x960
> [ 2535.074197] sp=e000004081c77e10 bsp=e000004081c71018
> [ 2535.074197] [<a000000100176b00>] access_process_vm+0x2c0/0x380
> [ 2535.074197] sp=e000004081c77e10 bsp=e000004081c70f60

Thanks a lot, dann. But it was the [vdso] line in foo's /proc/<pid>/maps
which you sent me privately, that set me thinking on the right track.
Here's what I believe is the appropriate patch: please give it a try
and let us know...

[PATCH] mm: fix ia64 crash when gcore reads gate area

Debian's ia64 autobuilders have been seeing kernel freeze or reboot
when running the gdb testsuite (Debian bug 588574): dannf bisected to
2.6.32 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1 "mm: ZERO_PAGE without
PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target.

I'd missed updating the gate_vma handling in __get_user_pages(): that
happens to use vm_normal_page() (nowadays failing on the zero page),
yet reported success even when it failed to get a page - boom when
access_process_vm() tried to copy that to its intermediate buffer.

Fix this, resisting cleanups: in particular, leave it for now reporting
success when not asked to get any pages - very probably safe to change,
but let's not risk it without testing exposure.

Why did ia64 crash with 16kB pages, but succeed with 64kB pages?
Because setup_gate() pads each 64kB of its gate area with zero pages.

Reported-by: Andreas Barth <aba(a)not.so.argh.org>
Bisected-by: dann frazier <dannf(a)debian.org>
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: stable(a)kernel.org
---

mm/memory.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

--- 2.6.35-rc6/mm/memory.c 2010-05-30 17:58:57.000000000 -0700
+++ linux/mm/memory.c 2010-07-29 17:57:29.000000000 -0700
@@ -1394,10 +1394,20 @@ int __get_user_pages(struct task_struct
return i ? : -EFAULT;
}
if (pages) {
- struct page *page = vm_normal_page(gate_vma, start, *pte);
+ struct page *page;
+
+ page = vm_normal_page(gate_vma, start, *pte);
+ if (!page) {
+ if (!(gup_flags & FOLL_DUMP) &&
+ is_zero_pfn(pte_pfn(*pte)))
+ page = pte_page(*pte);
+ else {
+ pte_unmap(pte);
+ return i ? : -EFAULT;
+ }
+ }
pages[i] = page;
- if (page)
- get_page(page);
+ get_page(page);
}
pte_unmap(pte);
if (vmas)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: dann frazier on
On Thu, Jul 29, 2010 at 07:01:56PM -0700, Hugh Dickins wrote:
> On Thu, 29 Jul 2010, dann frazier wrote:
> > On Wed, Jul 28, 2010 at 08:50:18PM -0700, Hugh Dickins wrote:
> > >
> > > Let's note that gdb's gcore is building up its own version of a
> > > coredump, not going through the get_dump_page() code I was wondering
> > > about. If I read gcore correctly (possibly not!), it will be reading
> > > selected areas from /proc/<pid>/mem i.e. using access_process_vm().
> >
> > This appears to be correct. I was able to collect the following
> > stacktrace using INIT:
> >
> > [ 2535.074197] Backtrace of pid 4605 (gdb)
> > [ 2535.074197]
> > [ 2535.074197] Call Trace:
> > [ 2535.074197] [<a00000010000bb00>] ia64_native_leave_kernel+0x0/0x270
> > [ 2535.074197] sp=e000004081c77c40 bsp=e000004081c71018
> > [ 2535.074197] [<a000000100334720>] __copy_user+0x160/0x960
> > [ 2535.074197] sp=e000004081c77e10 bsp=e000004081c71018
> > [ 2535.074197] [<a000000100176b00>] access_process_vm+0x2c0/0x380
> > [ 2535.074197] sp=e000004081c77e10 bsp=e000004081c70f60
>
> Thanks a lot, dann. But it was the [vdso] line in foo's /proc/<pid>/maps
> which you sent me privately, that set me thinking on the right track.
> Here's what I believe is the appropriate patch: please give it a try
> and let us know...

dannf(a)rx2600:~> gdb foo
GNU gdb (GDB) SUSE (7.0-0.4.16)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "ia64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/dannf/foo...done.
(gdb) break leaf
Breakpoint 1 at 0x4000000000000401: file foo.c, line 2.
(gdb) run
Starting program: /home/dannf/foo

Breakpoint 1, leaf () at foo.c:2
2 return 0;
(gdb) gcore
Saved corefile core.3952
(gdb)

good work Hugh!

-dann

>
> [PATCH] mm: fix ia64 crash when gcore reads gate area
>
> Debian's ia64 autobuilders have been seeing kernel freeze or reboot
> when running the gdb testsuite (Debian bug 588574): dannf bisected to
> 2.6.32 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1 "mm: ZERO_PAGE without
> PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target.
>
> I'd missed updating the gate_vma handling in __get_user_pages(): that
> happens to use vm_normal_page() (nowadays failing on the zero page),
> yet reported success even when it failed to get a page - boom when
> access_process_vm() tried to copy that to its intermediate buffer.
>
> Fix this, resisting cleanups: in particular, leave it for now reporting
> success when not asked to get any pages - very probably safe to change,
> but let's not risk it without testing exposure.
>
> Why did ia64 crash with 16kB pages, but succeed with 64kB pages?
> Because setup_gate() pads each 64kB of its gate area with zero pages.
>
> Reported-by: Andreas Barth <aba(a)not.so.argh.org>
> Bisected-by: dann frazier <dannf(a)debian.org>
> Signed-off-by: Hugh Dickins <hughd(a)google.com>
> Cc: stable(a)kernel.org
> ---
>
> mm/memory.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> --- 2.6.35-rc6/mm/memory.c 2010-05-30 17:58:57.000000000 -0700
> +++ linux/mm/memory.c 2010-07-29 17:57:29.000000000 -0700
> @@ -1394,10 +1394,20 @@ int __get_user_pages(struct task_struct
> return i ? : -EFAULT;
> }
> if (pages) {
> - struct page *page = vm_normal_page(gate_vma, start, *pte);
> + struct page *page;
> +
> + page = vm_normal_page(gate_vma, start, *pte);
> + if (!page) {
> + if (!(gup_flags & FOLL_DUMP) &&
> + is_zero_pfn(pte_pfn(*pte)))
> + page = pte_page(*pte);
> + else {
> + pte_unmap(pte);
> + return i ? : -EFAULT;
> + }
> + }
> pages[i] = page;
> - if (page)
> - get_page(page);
> + get_page(page);
> }
> pte_unmap(pte);
> if (vmas)
>

--
dann frazier

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Hugh Dickins on
On Thu, 29 Jul 2010, dann frazier wrote:
>
> dannf(a)rx2600:~> gdb foo
> GNU gdb (GDB) SUSE (7.0-0.4.16)
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "ia64-suse-linux".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /home/dannf/foo...done.
> (gdb) break leaf
> Breakpoint 1 at 0x4000000000000401: file foo.c, line 2.
> (gdb) run
> Starting program: /home/dannf/foo
>
> Breakpoint 1, leaf () at foo.c:2
> 2 return 0;
> (gdb) gcore
> Saved corefile core.3952
> (gdb)

Many thanks for pursuing this and reporting back, dann.
Patch to Linus follows in a few moments.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/