From: Minchan Kim on
On Fri, Nov 6, 2009 at 5:37 AM, Jody Belka <jody+lkml(a)jj79.org> wrote:
> Norbert Preining <preining <at> logic.at> writes:
>> Don't ask me why, please, and I don't have a serial/net console so that
>> I can tell you more, but the booting hangs badly at:
>
> <snip>
>
>>
>> > diff --git a/mm/memory.c b/mm/memory.c
>> > index 7e91b5f..47e4b15 100644
>> > --- a/mm/memory.c
>> > +++ b/mm/memory.c
>> > @@ -2713,7 +2713,11 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>> >        vmf.page = NULL;
>> >
>> >        ret = vma->vm_ops->fault(vma, &vmf);
>> > -       if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))
>> > +       if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
>> > +               printk(KERN_DEBUG "vma->vm_ops->fault : 0x%lx\n", vma->vm_ops->fault);
>> > +               WARN_ON(1);
>> > +
>> > +       }
>> >                return ret;
>> >
>> >        if (unlikely(PageHWPoison(vmf.page))) {
>>
>
> Erm, could it not be due to the "return ret;" line being moved outside of the
> if(), so that it always executes?

Right. Sorry, it's my fault.
I must have been blind.
'return ret' should be included inside the debug block.
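
To make the mistake concrete, here is a minimal userspace sketch (not kernel code). The flag values, CONTINUED, buggy_check() and fixed_check() are stand-ins invented for illustration; only the brace placement mirrors the quoted patch.

```c
/* Illustrative stand-ins for the kernel's VM_FAULT_* bits (values assumed). */
#define VM_FAULT_OOM     0x0001
#define VM_FAULT_SIGBUS  0x0002
#define VM_FAULT_NOPAGE  0x0100
#define VM_FAULT_ERROR   (VM_FAULT_OOM | VM_FAULT_SIGBUS)

/* Hypothetical sentinel meaning "the rest of the fault path would run". */
#define CONTINUED (-1)

/* Shape of the first debug patch: the return ended up after the closing brace. */
int buggy_check(int ret)
{
	if (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)) {
		/* debug output would go here */
	}
	return ret;		/* executes for every value of ret */

	return CONTINUED;	/* never reached */
}

/* Intended shape: return early only when an error or NOPAGE bit is set. */
int fixed_check(int ret)
{
	if (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)) {
		/* debug output would go here */
		return ret;
	}
	return CONTINUED;
}
```

With ret == 0 (no bits set), buggy_check() still returns immediately, so in the real function the remainder of __do_fault would never run, which would explain the hang; fixed_check() falls through as intended.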

>
>
> J
>
> ps, sending this through gmane, don't know if it'll keep cc's or not, so
> apologies if not. please cc me on any replies
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo(a)kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont(a)kvack.org"> email(a)kvack.org </a>
>



--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Norbert Preining on
On Fr, 06 Nov 2009, Minchan Kim wrote:
> > Erm, could it not be due to the "return ret;" line being moved outside of the
> > if(), so that it always executes?
>
> Right. Sorry, it's my fault.
> I must have been blind.
> 'return ret' should be included inside the debug block.


Bummer, I'm blind too. That was in fact obvious, since the code flow
was changed. I could have seen that myself, sorry.

Recompiling already and trying to recreate the oom-killer boom.

Best wishes

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining Associate Professor
JAIST Japan Advanced Institute of Science and Technology preining(a)jaist.ac.jp
Vienna University of Technology preining(a)logic.at
Debian Developer (Debian TeX Task Force) preining(a)debian.org
gpg DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
`I'm going to have a look.'
He glanced round at the others.
`Is no one going to say, "No you can't possibly, let me go
instead"?'
They all shook their heads.
`Oh well.'
--- Ford attempting to be heroic whilst being seiged by
--- Shooty and Bangbang.
--- Douglas Adams, The Hitchhikers Guide to the Galaxy
From: Norbert Preining on
Hi Kim,

On Fr, 06 Nov 2009, preining wrote:
> Recompiling already and trying to recreate the oom-killer boom.

Well, after rebooting into that kernel I get *loads*, every few seconds,
of warnings in the log. Hard to sort out what is real. Is that expected?

Excerpt from the log:
[ 2077.753841] vma->vm_ops->fault : 0xffffffff811df4bd
[ 2077.753842] ------------[ cut here ]------------
[ 2077.753845] WARNING: at mm/memory.c:2722 __do_fault+0x89/0x382()
[ 2077.753847] Hardware name: VGN-Z11VN_B
....
[ 2077.753880] Pid: 4892, comm: Xorg Tainted: G W 2.6.32-rc6 #5
[ 2077.753881] Call Trace:
[ 2077.753884] [<ffffffff8108c6cc>] ? __do_fault+0x89/0x382
[ 2077.753887] [<ffffffff8108c6cc>] ? __do_fault+0x89/0x382
[ 2077.753889] [<ffffffff8103ae54>] ? warn_slowpath_common+0x77/0xa3
[ 2077.753892] [<ffffffff8108c6cc>] ? __do_fault+0x89/0x382
[ 2077.753895] [<ffffffff81341a82>] ? _spin_unlock+0x23/0x2f
[ 2077.753898] [<ffffffff8108e5d0>] ? handle_mm_fault+0x2b9/0x608
[ 2077.753900] [<ffffffff810af792>] ? do_vfs_ioctl+0x443/0x47b
[ 2077.753903] [<ffffffff81026759>] ? do_page_fault+0x25f/0x27b
[ 2077.753906] [<ffffffff81341e8f>] ? page_fault+0x1f/0x30
[ 2077.753908] ---[ end trace d3324ef5061f0136 ]---

hundreds/thousands of them.

And even without starting anything else. Is that what you want?
My syslog file has grown to some hundred megabytes ...
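
A note for reading these entries: in a line such as `__do_fault+0x89/0x382`, the first number is the offset into the function and the second is the function's size; a leading `?` marks an address found on the stack that the unwinder could not confirm. The sketch below splits such an entry in userspace. struct trace_sym and parse_trace_sym() are hypothetical names, not anything from the kernel.

```c
#include <stdio.h>

/* One "symbol+0xoffset/0xsize" entry from a call trace (hypothetical type). */
struct trace_sym {
	char name[64];
	unsigned long offset;	/* bytes from the start of the function */
	unsigned long size;	/* total size of the function in bytes */
};

/* Parse e.g. "__do_fault+0x89/0x382". Returns 0 on success, -1 otherwise. */
int parse_trace_sym(const char *s, struct trace_sym *out)
{
	if (sscanf(s, "%63[^+]+0x%lx/0x%lx",
		   out->name, &out->offset, &out->size) != 3)
		return -1;
	return 0;
}
```

For the entry above this gives name "__do_fault", offset 0x89 and size 0x382.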


Best wishes

Norbert

From: Minchan Kim on
On Fri, Nov 6, 2009 at 10:38 PM, Norbert Preining <preining(a)logic.at> wrote:
> Hi Kim,
>
> On Fr, 06 Nov 2009, preining wrote:
>> Recompiling already and trying to recreate the oom-killer boom.
>
> Well, after rebooting into that kernel I get *loads*, every few seconds,
> of warnings in the log. Hard to sort out what is real. Is that expected?

I guess it is VM_FAULT_NOPAGE from i915_gem or something similar.
That is not our concern; only VM_FAULT_OOM is.
I didn't expect that, so let's change the debug patch as follows.

The most important question is "who returns VM_FAULT_OOM?".
If a fault handler returns VM_FAULT_OOM, the OOM killer will kill a process
that has a high score. In your case, it was 'X'.

If you didn't see it up to 2.6.32-rc5, it should be a regression somewhere.
Once we know where, we can pass the problem on to its maintainer.

Could you try again with the patch below?
If you reproduce it, you can match the function address in the log against
the function addresses in your System.map. Please let me know the result. :)

diff --git a/mm/memory.c b/mm/memory.c
index 7e91b5f..97a6fcb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2713,8 +2713,13 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	vmf.page = NULL;
 
 	ret = vma->vm_ops->fault(vma, &vmf);
-	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))
+	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
+		if (ret & VM_FAULT_OOM) {
+			printk(KERN_DEBUG "fault handler : 0x%lx\n", vma->vm_ops->fault);
+
+		}
 		return ret;
+	}
 
 	if (unlikely(PageHWPoison(vmf.page))) {
 		if (ret & VM_FAULT_LOCKED)
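
For the System.map step: each line of System.map has the form "address type name", sorted by address, and a logged address belongs to the last symbol whose address is less than or equal to it. A minimal userspace sketch of that lookup follows, assuming the map has already been loaded into a sorted array and that unsigned long is wide enough for kernel addresses (true on 64-bit Linux). struct map_sym and map_lookup() are hypothetical names.

```c
#include <stddef.h>

/* One System.map entry after parsing (hypothetical type). */
struct map_sym {
	unsigned long addr;
	const char *name;
};

/*
 * Binary search over a table sorted by ascending address.
 * Returns the entry with the greatest addr <= target,
 * or NULL if target lies below the first symbol.
 */
const struct map_sym *map_lookup(const struct map_sym *tab, size_t n,
				 unsigned long target)
{
	const struct map_sym *best = NULL;
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].addr <= target) {
			best = &tab[mid];	/* candidate; look for a closer one */
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return best;
}
```

The difference between the logged address and the returned entry's addr is the offset into that function, the same figure a call trace prints after the `+`.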



>
> Excerpt from the log:
> [ 2077.753841] vma->vm_ops->fault : 0xffffffff811df4bd
> [ 2077.753842] ------------[ cut here ]------------
> [ 2077.753845] WARNING: at mm/memory.c:2722 __do_fault+0x89/0x382()
> [ 2077.753847] Hardware name: VGN-Z11VN_B
> ...
> [ 2077.753880] Pid: 4892, comm: Xorg Tainted: G        W  2.6.32-rc6 #5
> [ 2077.753881] Call Trace:
> [ 2077.753884]  [<ffffffff8108c6cc>] ? __do_fault+0x89/0x382
> [ 2077.753887]  [<ffffffff8108c6cc>] ? __do_fault+0x89/0x382
> [ 2077.753889]  [<ffffffff8103ae54>] ? warn_slowpath_common+0x77/0xa3
> [ 2077.753892]  [<ffffffff8108c6cc>] ? __do_fault+0x89/0x382
> [ 2077.753895]  [<ffffffff81341a82>] ? _spin_unlock+0x23/0x2f
> [ 2077.753898]  [<ffffffff8108e5d0>] ? handle_mm_fault+0x2b9/0x608
> [ 2077.753900]  [<ffffffff810af792>] ? do_vfs_ioctl+0x443/0x47b
> [ 2077.753903]  [<ffffffff81026759>] ? do_page_fault+0x25f/0x27b
> [ 2077.753906]  [<ffffffff81341e8f>] ? page_fault+0x1f/0x30
> [ 2077.753908] ---[ end trace d3324ef5061f0136 ]---
>
> hundreds/thousands of them.
>
> And even without starting anything else. Is that what you want?
> My syslog file has grown to some hundred megabytes ...
>
>
> Best wishes
>
> Norbert
>
>



--
Kind regards,
Minchan Kim
From: Norbert Preining on
recompiling and retrying ...

On Sa, 07 Nov 2009, Minchan Kim wrote:
> + printk(KERN_DEBUG "fault handler : 0x%lx\n", vma->vm_ops->fault);

BTW:
m/memory.c:2722: warning: format '%lx' expects type 'long unsigned int', but argument 2 has type 'int (* const)(struct vm_area_struct *, struct vm_fault *)'
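
The warning arises because `%lx` expects an unsigned long while the argument is a function pointer. One way to quiet it is an explicit cast; in the kernel itself, `%p` (or, I believe, `%pS` to get a symbol name) would be the more usual choice. Below is a userspace sketch of the cast approach only. fault_fn is a simplified, hypothetical signature, and format_handler() and dummy_fault() are names made up for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the ->fault handler type (hypothetical). */
typedef int (*fault_fn)(void *vma, void *vmf);

/*
 * Format a handler address without tripping the format check:
 * convert the function pointer to an integer type explicitly.
 * Returns what snprintf() returns.
 */
int format_handler(char *buf, size_t len, fault_fn fn)
{
	return snprintf(buf, len, "fault handler : 0x%lx\n",
			(unsigned long)(uintptr_t)fn);
}

/* Placeholder handler so the helper has something to print. */
int dummy_fault(void *vma, void *vmf)
{
	(void)vma;
	(void)vmf;
	return 0;
}
```

Calling format_handler(buf, sizeof buf, dummy_fault) fills buf with a line in the same shape as the debug output in the patch; the address itself depends on where the binary is loaded.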

Best wishes

Norbert
