From: Rafael J. Wysocki on
On Tuesday 25 May 2010, Rafael J. Wysocki wrote:
> Hi,
>
> Recent -git kernels crash for me quite often during boot, reporting that kernel
> paging request could not be satisfied at load_module+0x18e4, which
> corresponds to line 2483 of kernel/module.c for the kernel in question (however
> the same call trace says IP is at precpu_modfree+0x1 at the moment of the
> crash, which seems to point to line 2481 of the same file). It doesn't happen
> every time, but it does happen often enough to be annoying and the call
> trace is always exactly the same.
>
> I'm going to revert post-2.6.34 commits that touch kernel/module.c directly
> and see what happens.

I'm not able to reproduce the issue with the following commit reverted:

commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
Author: Rusty Russell <rusty(a)rustcorp.com.au>
Date: Wed May 19 17:33:39 2010 -0600

module: drop the lock while waiting for module to complete initialization.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Wed, 26 May 2010, Rafael J. Wysocki wrote:
>
> I'm not able to reproduce the issue with the following commit reverted:
>
> commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> Author: Rusty Russell <rusty(a)rustcorp.com.au>
> Date: Wed May 19 17:33:39 2010 -0600
>
> module: drop the lock while waiting for module to complete initialization.

Hmm. That does seem to be buggy. We can't just drop and re-take the lock:
that may make sense _internally_ as far as resolve_symbol() itself is
concerned, but the caller will its own local variables, and some of those
will no longer be valid if the lock was dropped.

That commit also changes the return value semantics of "use_module()",
which is an exported interface where the only users seem to be
out-of-kernel (the only in-kernel use is in kernel/module.c itself). That
seems like a really really bad idea too.

So I think reverting it is definitely the right thing to do. The commit
seems fundamentally broken. And having modules do request_module() in
their init functions has always been invalid anyway, so that excuse
doesn't really seem to be a reason to do anything crazy like this either.

Rewriting the logic to
- not drop the lock
- not change the return semantics of an exported interface
- just make 'resolve_symbol()' fail if the module isn't fully loaded
would seem to be a more reasonable approach, no?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
>
> On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> >
> > I'm not able to reproduce the issue with the following commit reverted:
> >
> > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > Author: Rusty Russell <rusty(a)rustcorp.com.au>
> > Date: Wed May 19 17:33:39 2010 -0600
> >
> > module: drop the lock while waiting for module to complete initialization.
>
> Hmm. That does seem to be buggy. We can't just drop and re-take the lock:
> that may make sense _internally_ as far as resolve_symbol() itself is
> concerned, but the caller will its own local variables, and some of those
> will no longer be valid if the lock was dropped.

Well, yes, obviously I missed something :( I'll look at it tonight after
Arabella is asleep.

> That commit also changes the return value semantics of "use_module()",
> which is an exported interface where the only users seem to be
> out-of-kernel (the only in-kernel use is in kernel/module.c itself). That
> seems like a really really bad idea too.

The kprobes guys: they were cc'd about the change.

> So I think reverting it is definitely the right thing to do. The commit
> seems fundamentally broken.

> And having modules do request_module() in
> their init functions has always been invalid anyway, so that excuse
> doesn't really seem to be a reason to do anything crazy like this either.

I'd have to look back through the pre-git history, but we've dropped the
lock around the initfn for a long time now because people wanted to do odd
things (ISTR it sucked when modules oopsed on load, too).

So then we have the problem that crc32 is finished its init and needs the
lock back, and bnx2x which needs crc32 is waiting for it. We could just
fail bnx2x; and in fact, we did prior to this patch (we timeout) and it breaks
network on booting on some box according to Brandon:

http://www.mail-archive.com/linux-crypto(a)vger.kernel.org/msg04331.html

This *used* not to be a problem, because userspace placed locks on
modules and so it would never try to load bnx2x until crc32 was loaded.
ISTR a mention that Jon removed that...

> Rewriting the logic to
> - not drop the lock
> - not change the return semantics of an exported interface
> - just make 'resolve_symbol()' fail if the module isn't fully loaded
> would seem to be a more reasonable approach, no?

Sure, then userspace needs to change :(

Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Wed, 26 May 2010 05:30:58 pm Rusty Russell wrote:
> On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
> >
> > On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> > >
> > > I'm not able to reproduce the issue with the following commit reverted:
> > >
> > > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > > Author: Rusty Russell <rusty(a)rustcorp.com.au>
> > > Date: Wed May 19 17:33:39 2010 -0600
> > >
> > > module: drop the lock while waiting for module to complete initialization.
> >
> > Hmm. That does seem to be buggy. We can't just drop and re-take the lock:
> > that may make sense _internally_ as far as resolve_symbol() itself is
> > concerned, but the caller will its own local variables, and some of those
> > will no longer be valid if the lock was dropped.
>
> Well, yes, obviously I missed something :( I'll look at it tonight after
> Arabella is asleep.

See if you can spot it (I acked the patch, so I can't point fingers):

free_core:
module_free(mod, mod->module_core);
/* mod will be freed with core. Don't access it beyond this line! */
free_percpu:
percpu_modfree(mod);

Only a year after Masami fixed that and added the comment, too :(

I suspect that the increased parallelism enabled by this patch uncovered this
bug. Does this fix it?

(Side note: the locking should be simplified. No code before simplify_symbols
actually needs the lock, so we should grab it just for that, then again at the
end. We use kobjects to protect us from multiple loads as a side-effect, but
we should move that registration to the end).

Subject: module: fix reference to mod->percpu after freeing module.

The comment about the mod being freed is self-explanatory, but neither
Tejun nor I read it. This bug was introduced in 259354deaa, after it
had previously been fixed in 6e2b75740b. How embarrassing.

Signed-off-by: Rusty Russell <rusty(a)rustcorp.com.au>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Masami Hiramatsu <mhiramat(a)redhat.com>

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2031,6 +2031,7 @@ static noinline struct module *load_modu
long err = 0;
void *ptr = NULL; /* Stops spurious gcc warning */
unsigned long symoffs, stroffs, *strmap;
+ void __percpu *percpu;

mm_segment_t old_fs;

@@ -2175,6 +2176,8 @@ static noinline struct module *load_modu
goto free_mod;
sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
}
+ /* Keep this around for failure path. */
+ percpu = mod_percpu(mod);

/* Determine total sizes, and put offsets in sh_entsize. For now
this is done generically; there doesn't appear to be any
@@ -2480,7 +2483,7 @@ static noinline struct module *load_modu
module_free(mod, mod->module_core);
/* mod will be freed with core. Don't access it beyond this line! */
free_percpu:
- percpu_modfree(mod);
+ free_percpu(percpu);
free_mod:
kfree(args);
kfree(strmap);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Wed, 26 May 2010, Rusty Russell wrote:
>
> So then we have the problem that crc32 is finished its init and needs the
> lock back, and bnx2x which needs crc32 is waiting for it. We could just
> fail bnx2x; and in fact, we did prior to this patch (we timeout) and it breaks
> network on booting on some box according to Brandon:

So the sane thing to do seems to simply not get that mutex_lock after a
successful init. Which probably implies having a new lock for the actual
low-level symbol table stuff.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/