From: Linus Torvalds on


On Mon, 31 May 2010, Linus Torvalds wrote:
>
> It's entirely possible that an interim fix (if we can't just fix the
> locking) is to _not_ use "strong_try_module_get()" at all, but instead
> just use "try_module_get()", and then after we've dropped the
> module_mutex, but _before_ we call the "init" function for the module, we
> wait for all the modules that this module depends on.
>
> IOW, we'd link to other modules _before_ they are necessarily initialized
> (their symbol tables will be as initialized as they are going to be), but
> then before we call our own initialization routines we make sure that the
> modules we linked to have finished theirs.
>
> Doesn't that sound like the logical thing to do? And it wouldn't change
> any locking.

Ok, this is a two-patch series to do exactly that. It's totally untested,
although I _have_ booted the result, and tested that module loading and
unloading works. But I haven't tested the race condition, obviously.

It looks sane, and quite frankly, I think it's a much better design than
what we have now. I also note that the old code was broken for the case of
not having CONFIG_MODULE_UNLOAD - I didn't fix it, but it should be easy
to do (basically, we should do the whole module dependency list regardless
of whether we can unload modules or not - we _do_ need it in order to get
that whole case of waiting for them to load right).

The first patch doesn't really change anything, it just cleans things up
and introduces the two-way module usage list that the second patch needs
in order to wait for modules that we use to initialize.

Comments? Brandon - does this work for you? I may well have missed
something.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Tue, 1 Jun 2010, Rusty Russell wrote:

> On Tue, 1 Jun 2010 05:45:37 am Linus Torvalds wrote:
> >
> > basically, we should do the whole module dependency list regardless
> > of whether we can unload modules or not
>
> Why?

Because the current non-CONFIG_MODULE_UNLOAD code is currently broken, and
it was broken exactly because the code had two totally different paths and
totally different logic. And one part simply missed the case.

We'd be much better off having as much of the logic shared as possible.
No?

Your 2/2 actually fixed that, because it moved the broken
wait_event_interruptible_timeout() out of the (non-shared) use_module()
into the (shared) resolve_symbol_wait(). But even that seemed to be almost
accidental, and seemed to be more about the fact that now the locking
rules required it (if you wanted to wait without holding the lock), rather
than anything else.

In contrast, wouldn't it be nice if we just made the waiting be a separate
event entirely? And have the !MODULE_UNLOAD case also able to share the
/proc/module format, for example?

You do realize - although maybe you don't - that right now, even with your
2/2 case, the !MODULE_UNLOAD case seems to get the module ref-counts
wrong. I didn't look through it all, but it _seems_ to increment the
ref-count for every symbol it resolves.

What happens if the module load fails in the middle (say, because some
symbols referred to module 'a' that was loaded successfully, and others
refer to module 'b' that had some probelsm)? I think the !MODULE_UNLOAD
case is _broken_ again.

Why?

Take a small guess. It's broken - even with your 2/2 - exactly because the
!MODULE_UNLOAD case does something fundamentally different from the
regular case. It increments the module counts for each symbol it looks up,
exactly BECAUSE IT DOESN'T TRACK MODULE DEPENDENCIES!

Which means that it then cannot clean up properly afterwards if an error
happens in the middle.

So I'd suggest we should just track those module dependencies, and share
more of the code and the logic. Because it looks to me like not sharing it
continually results in bugs.

No?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Mon, 31 May 2010, Linus Torvalds wrote:
>
> Which means that it then cannot clean up properly afterwards if an error
> happens in the middle.

Oh, I guess it doesn't matter. For the non-unload case, we don't even do
any module counts, so the fixup ends up being a no-op. So with the error
handling fixed (which your 2/2 did), I guess it's actually ok. I still
think it would be nice to have /proc/modules show the module dependencies
regardless of whether they can be unloaded or not, but that's really just
an oddity, maybe not a "bug" per se.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Tue, 1 Jun 2010, Linus Torvalds wrote:
>
> So I do agree that your later two-patch series is the way to go.

Oh, btw, did we ever get that version tested by the only person who seems
to see the problem? Brandon?

I'd like to have a Tested-By: line too.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Tue, 1 Jun 2010, Brandon Philips wrote:
>
> When I tested a Kernel with Rusty's modules branch pulled onto
> 2.6.35-rc1 I got duplicate sysfs path errors:

Hmm. Yeah, the module_mutex used to be held across the whole "find -> add"
state, but isn't any more.

> To fix this we need to make sure that we only have one copy of a module
> going through load_module at a time. Patch suggestion follows which
> boots without errors. I am sure there is a better way to do what it does
> ;)

I think Rusty may have made the lock a bit _too_ finegrained there, and
didn't add it to some places that needed it. It looks, for example, like
PATCH 1/2 actually drops the lock in places where it's needed
("find_module()" is documented to need it, but now load_module() didn't
hold it at all when it did the find_module()).

Rather than adding a new "module_loading" list, I think we should be able
to just use the existing "modules" list, and just fix up the locking a
bit.

In fact, maybe we could just move the "look up existing module" a bit
later - optimistically assuming that the module doesn't exist, and then
just undoing the work if it turns out that we were wrong, just before
adding ourselves to the list.

A patch something like the appended (TOTALLY UNTESTED!)

Linus
---
kernel/module.c | 13 ++++++++-----
1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index a1f46a5..21f7ffa 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2198,11 +2198,6 @@ static noinline struct module *load_module(void __user *umod,
goto free_mod;
}

- if (find_module(mod->name)) {
- err = -EEXIST;
- goto free_mod;
- }
-
mod->state = MODULE_STATE_COMING;

/* Allow arches to frob section contents and sizes. */
@@ -2486,6 +2481,13 @@ static noinline struct module *load_module(void __user *umod,
* The mutex protects against concurrent writers.
*/
mutex_lock(&module_mutex);
+
+ if (find_module(mod->name)) {
+ err = -EEXIST;
+ /* This will also unlock the mutex */
+ goto already_exists;
+ }
+
list_add_rcu(&mod->list, &modules);
mutex_unlock(&module_mutex);

@@ -2511,6 +2513,7 @@ static noinline struct module *load_module(void __user *umod,
mutex_lock(&module_mutex);
/* Unlink carefully: kallsyms could be walking list. */
list_del_rcu(&mod->list);
+ already_exists:
mutex_unlock(&module_mutex);
synchronize_sched();
module_arch_cleanup(mod);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/