From: Rafael J. Wysocki on
On Wednesday 26 May 2010, Rusty Russell wrote:
> On Wed, 26 May 2010 05:30:58 pm Rusty Russell wrote:
> > On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
> > >
> > > On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> > > >
> > > > I'm not able to reproduce the issue with the following commit reverted:
> > > >
> > > > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > > > Author: Rusty Russell <rusty(a)rustcorp.com.au>
> > > > Date: Wed May 19 17:33:39 2010 -0600
> > > >
> > > > module: drop the lock while waiting for module to complete initialization.
> > >
> > > Hmm. That does seem to be buggy. We can't just drop and re-take the lock:
> > > that may make sense _internally_ as far as resolve_symbol() itself is
> > > concerned, but the caller will its own local variables, and some of those
> > > will no longer be valid if the lock was dropped.
> >
> > Well, yes, obviously I missed something :( I'll look at it tonight after
> > Arabella is asleep.
>
> See if you can spot it (I acked the patch, so I can't point fingers):
>
> free_core:
> module_free(mod, mod->module_core);
> /* mod will be freed with core. Don't access it beyond this line! */
> free_percpu:
> percpu_modfree(mod);
>
> Only a year after Masami fixed that and added the comment, too :(
>
> I suspect that the increased parallelism enabled by this patch uncovered this
> bug. Does this fix it?

Since the commit has been reverted, do you still want me to test this patch?
Quite frankly I'd prefer to test a complete replacement for that commit on top
of current -git.

Rafael


> (Side note: the locking should be simplified. No code before simplify_symbols
> actually needs the lock, so we should grab it just for that, then again at the
> end. We use kobjects to protect us from multiple loads as a side-effect, but
> we should move that registration to the end).
>
> Subject: module: fix reference to mod->percpu after freeing module.
>
> The comment about the mod being freed is self-explanatory, but neither
> Tejun nor I read it. This bug was introduced in 259354deaa, after it
> had previously been fixed in 6e2b75740b. How embarrassing.
>
> Signed-off-by: Rusty Russell <rusty(a)rustcorp.com.au>
> Cc: Tejun Heo <tj(a)kernel.org>
> Cc: Masami Hiramatsu <mhiramat(a)redhat.com>
>
> diff --git a/kernel/module.c b/kernel/module.c
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2031,6 +2031,7 @@ static noinline struct module *load_modu
> long err = 0;
> void *ptr = NULL; /* Stops spurious gcc warning */
> unsigned long symoffs, stroffs, *strmap;
> + void __percpu *percpu;
>
> mm_segment_t old_fs;
>
> @@ -2175,6 +2176,8 @@ static noinline struct module *load_modu
> goto free_mod;
> sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
> }
> + /* Keep this around for failure path. */
> + percpu = mod_percpu(mod);
>
> /* Determine total sizes, and put offsets in sh_entsize. For now
> this is done generically; there doesn't appear to be any
> @@ -2480,7 +2483,7 @@ static noinline struct module *load_modu
> module_free(mod, mod->module_core);
> /* mod will be freed with core. Don't access it beyond this line! */
> free_percpu:
> - percpu_modfree(mod);
> + free_percpu(percpu);
> free_mod:
> kfree(args);
> kfree(strmap);
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Thu, 27 May 2010, Rafael J. Wysocki wrote:
>
> Since the commit has been reverted, do you still want me to test this patch?
> Quite frankly I'd prefer to test a complete replacement for that commit on top
> of current -git.

I'm not re-applying it with the pointless semantic changes that are
visible to modules. It doesn't matter if they were informed, if it means
that they'll then just have some odd version dependency and add crazy
"#if LINUX_VERSION" tests that aren't even exact.

I also wonder exactly what that module_mutex() actually ends up
protecting. 99% of load_module() seems to be stuff that is purely about
local issues. Maybe we could tighten the actual lock section to just the
parts that actually expose the vmalloc'ed area to others?

It's generally pointless releasing a lock in the middle - that just makes
the lock even less valid. If it's valid to just release the lock (without
some retry logic starting everything from the beginning), it likely the
lock shouldn't have been held there in the first place.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> On Wednesday 26 May 2010, Rusty Russell wrote:
> > I suspect that the increased parallelism enabled by this patch uncovered this
> > bug. Does this fix it?
>
> Since the commit has been reverted, do you still want me to test this patch?
> Quite frankly I'd prefer to test a complete replacement for that commit on top
> of current -git.

OK, combo meal deal below, against Linus' latest. I'd really appreciate
a report, since AFAIK you're the only one hitting it, and only when that
other (now reverted) patch was applied.

As an side to Brandon: I can see how my patch fixed an explicit request_module
inside module_init (that's how I tested it). I can't see how we have a
problem with an implicit dependency such as bne2x->crc32. Modules go into
the live state without retaking the lock. If you can still reproduce that
now Linus has reverted, I'm afraid we need to dig deeper...

Thanks,
Rusty.

diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -563,33 +563,26 @@ int use_module(struct module *a, struct
struct module_use *use;
int no_warn, err;

- if (b == NULL || already_uses(a, b)) return 1;
+ if (b == NULL || already_uses(a, b))
+ return 0;

/* If we're interrupted or time out, we fail. */
- if (wait_event_interruptible_timeout(
- module_wq, (err = strong_try_module_get(b)) != -EBUSY,
- 30 * HZ) <= 0) {
- printk("%s: gave up waiting for init of module %s.\n",
- a->name, b->name);
- return 0;
- }
-
- /* If strong_try_module_get() returned a different error, we fail. */
+ err = strong_try_module_get(b);
if (err)
- return 0;
+ return err;

DEBUGP("Allocating new usage for %s.\n", a->name);
use = kmalloc(sizeof(*use), GFP_ATOMIC);
if (!use) {
printk("%s: out of memory loading\n", a->name);
module_put(b);
- return 0;
+ return -ENOMEM;
}

use->module_which_uses = a;
list_add(&use->list, &b->modules_which_use_me);
no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
- return 1;
+ return 0;
}
EXPORT_SYMBOL_GPL(use_module);

@@ -882,7 +875,7 @@ static inline void module_unload_free(st

int use_module(struct module *a, struct module *b)
{
- return strong_try_module_get(b) == 0;
+ return strong_try_module_get(b);
}
EXPORT_SYMBOL_GPL(use_module);

@@ -1053,17 +1046,39 @@ static const struct kernel_symbol *resol
struct module *owner;
const struct kernel_symbol *sym;
const unsigned long *crc;
-
+ DEFINE_WAIT(wait);
+ int err;
+ long timeleft = 30 * HZ;
+
+again:
sym = find_symbol(name, &owner, &crc,
!(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
- /* use_module can fail due to OOM,
- or module initialization or unloading */
- if (sym) {
- if (!check_version(sechdrs, versindex, name, mod, crc, owner)
- || !use_module(mod, owner))
- sym = NULL;
+ if (!sym)
+ return NULL;
+
+ if (!check_version(sechdrs, versindex, name, mod, crc, owner))
+ return NULL;
+
+ prepare_to_wait(&module_wq, &wait, TASK_INTERRUPTIBLE);
+ err = use_module(mod, owner);
+ if (likely(!err) || err != -EBUSY || signal_pending(current)) {
+ finish_wait(&module_wq, &wait);
+ return err ? NULL : sym;
}
- return sym;
+
+ /* Module is still loading. Drop lock and wait. */
+ mutex_unlock(&module_mutex);
+ timeleft = schedule_timeout(timeleft);
+ mutex_lock(&module_mutex);
+ finish_wait(&module_wq, &wait);
+
+ /* Module might be gone entirely, or replaced. Re-lookup. */
+ if (timeleft)
+ goto again;
+
+ printk(KERN_WARNING "%s: gave up waiting for init of module %s.\n",
+ mod->name, owner->name);
+ return NULL;
}

/*
@@ -2014,6 +2029,7 @@ static noinline struct module *load_modu
long err = 0;
void *ptr = NULL; /* Stops spurious gcc warning */
unsigned long symoffs, stroffs, *strmap;
+ void __percpu *percpu;

mm_segment_t old_fs;

@@ -2158,6 +2174,8 @@ static noinline struct module *load_modu
goto free_mod;
sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
}
+ /* Keep this around for failure path. */
+ percpu = mod_percpu(mod);

/* Determine total sizes, and put offsets in sh_entsize. For now
this is done generically; there doesn't appear to be any
@@ -2463,7 +2481,7 @@ static noinline struct module *load_modu
module_free(mod, mod->module_core);
/* mod will be freed with core. Don't access it beyond this line! */
free_percpu:
- percpu_modfree(mod);
+ free_percpu(percpu);
free_mod:
kfree(args);
kfree(strmap);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Brandon Philips on
On 14:56 Thu 27 May 2010, Rusty Russell wrote:
> On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > bug. Does this fix it?
> >
> > Since the commit has been reverted, do you still want me to test this patch?
> > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > of current -git.
>
> OK, combo meal deal below, against Linus' latest. I'd really appreciate
> a report, since AFAIK you're the only one hitting it, and only when that
> other (now reverted) patch was applied.

I tested this patch on my machine on top of Linus's latest and it
fixes the issue. Without the patch and using Linus's latest I
reproduce the original issue:

[ 60.836022] bnx2: gave up waiting for init of module libcrc32c.
[ 60.847997] bnx2: Unknown symbol crc32c

Note: Again, since I don't have bnx2x hardware I forced bnx2.ko to
depend on libcrc32c as bnx2x does:
http://ifup.org/~philips/review/bnx2-hack-to-use-libcrc32c.patch

> As an side to Brandon: I can see how my patch fixed an explicit
> request_module inside module_init (that's how I tested it). I can't
> see how we have a problem with an implicit dependency such as
> bne2x->crc32. Modules go into the live state without retaking the
> lock.

libcrc32c is doing an explicit request_module inside of its
module_init. Follow the call chain in libcrc32c_mod_init()

libcrc32c_mod_init
crypto_alloc_shash
crypto_alloc_tfm
crypto_find_alg
crypto_alg_mod_lookup
crypto_larval_lookup
request_module

Cheers,

Brandon
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rafael J. Wysocki on
On Thursday 27 May 2010, Rusty Russell wrote:
> On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > bug. Does this fix it?
> >
> > Since the commit has been reverted, do you still want me to test this patch?
> > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > of current -git.
>
> OK, combo meal deal below, against Linus' latest. I'd really appreciate
> a report, since AFAIK you're the only one hitting it, and only when that
> other (now reverted) patch was applied.

I cannot reproduce the crash with the patch below.

Thanks,
Rafael


> As an side to Brandon: I can see how my patch fixed an explicit request_module
> inside module_init (that's how I tested it). I can't see how we have a
> problem with an implicit dependency such as bne2x->crc32. Modules go into
> the live state without retaking the lock. If you can still reproduce that
> now Linus has reverted, I'm afraid we need to dig deeper...
>
> Thanks,
> Rusty.
>
> diff --git a/kernel/module.c b/kernel/module.c
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -563,33 +563,26 @@ int use_module(struct module *a, struct
> struct module_use *use;
> int no_warn, err;
>
> - if (b == NULL || already_uses(a, b)) return 1;
> + if (b == NULL || already_uses(a, b))
> + return 0;
>
> /* If we're interrupted or time out, we fail. */
> - if (wait_event_interruptible_timeout(
> - module_wq, (err = strong_try_module_get(b)) != -EBUSY,
> - 30 * HZ) <= 0) {
> - printk("%s: gave up waiting for init of module %s.\n",
> - a->name, b->name);
> - return 0;
> - }
> -
> - /* If strong_try_module_get() returned a different error, we fail. */
> + err = strong_try_module_get(b);
> if (err)
> - return 0;
> + return err;
>
> DEBUGP("Allocating new usage for %s.\n", a->name);
> use = kmalloc(sizeof(*use), GFP_ATOMIC);
> if (!use) {
> printk("%s: out of memory loading\n", a->name);
> module_put(b);
> - return 0;
> + return -ENOMEM;
> }
>
> use->module_which_uses = a;
> list_add(&use->list, &b->modules_which_use_me);
> no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
> - return 1;
> + return 0;
> }
> EXPORT_SYMBOL_GPL(use_module);
>
> @@ -882,7 +875,7 @@ static inline void module_unload_free(st
>
> int use_module(struct module *a, struct module *b)
> {
> - return strong_try_module_get(b) == 0;
> + return strong_try_module_get(b);
> }
> EXPORT_SYMBOL_GPL(use_module);
>
> @@ -1053,17 +1046,39 @@ static const struct kernel_symbol *resol
> struct module *owner;
> const struct kernel_symbol *sym;
> const unsigned long *crc;
> -
> + DEFINE_WAIT(wait);
> + int err;
> + long timeleft = 30 * HZ;
> +
> +again:
> sym = find_symbol(name, &owner, &crc,
> !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
> - /* use_module can fail due to OOM,
> - or module initialization or unloading */
> - if (sym) {
> - if (!check_version(sechdrs, versindex, name, mod, crc, owner)
> - || !use_module(mod, owner))
> - sym = NULL;
> + if (!sym)
> + return NULL;
> +
> + if (!check_version(sechdrs, versindex, name, mod, crc, owner))
> + return NULL;
> +
> + prepare_to_wait(&module_wq, &wait, TASK_INTERRUPTIBLE);
> + err = use_module(mod, owner);
> + if (likely(!err) || err != -EBUSY || signal_pending(current)) {
> + finish_wait(&module_wq, &wait);
> + return err ? NULL : sym;
> }
> - return sym;
> +
> + /* Module is still loading. Drop lock and wait. */
> + mutex_unlock(&module_mutex);
> + timeleft = schedule_timeout(timeleft);
> + mutex_lock(&module_mutex);
> + finish_wait(&module_wq, &wait);
> +
> + /* Module might be gone entirely, or replaced. Re-lookup. */
> + if (timeleft)
> + goto again;
> +
> + printk(KERN_WARNING "%s: gave up waiting for init of module %s.\n",
> + mod->name, owner->name);
> + return NULL;
> }
>
> /*
> @@ -2014,6 +2029,7 @@ static noinline struct module *load_modu
> long err = 0;
> void *ptr = NULL; /* Stops spurious gcc warning */
> unsigned long symoffs, stroffs, *strmap;
> + void __percpu *percpu;
>
> mm_segment_t old_fs;
>
> @@ -2158,6 +2174,8 @@ static noinline struct module *load_modu
> goto free_mod;
> sechdrs[pcpuindex].sh_flags &= ~(unsigned long)SHF_ALLOC;
> }
> + /* Keep this around for failure path. */
> + percpu = mod_percpu(mod);
>
> /* Determine total sizes, and put offsets in sh_entsize. For now
> this is done generically; there doesn't appear to be any
> @@ -2463,7 +2481,7 @@ static noinline struct module *load_modu
> module_free(mod, mod->module_core);
> /* mod will be freed with core. Don't access it beyond this line! */
> free_percpu:
> - percpu_modfree(mod);
> + free_percpu(percpu);
> free_mod:
> kfree(args);
> kfree(strmap);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/