From: Rusty Russell on
On Tue, 1 Jun 2010 02:18:34 am Andrew Morton wrote:
> On Mon, 31 May 2010 21:32:27 +0930 Rusty Russell <rusty(a)rustcorp.com.au> wrote:
>
> > Problem: it's hard to avoid an init routine stumbling over a
> > request_module these days. And it's not clear it's always a bad idea:
> > for example, a module like kvm with dynamic dependencies on kvm-intel
> > or kvm-amd would be neater if it could simply request_module the right
> > one.
> >
> > In this particular case, it's libcrc32c:
> >
> > libcrc32c_mod_init
> > crypto_alloc_shash
> > crypto_alloc_tfm
> > crypto_find_alg
> > crypto_alg_mod_lookup
> > crypto_larval_lookup
> > request_module
> >
> > If another module is waiting for libcrc32c to finish initializing
> > (ie. bne2 depends on libcrc32c) then it does so holding the module
> > lock, and our request_module() can't make progress until that is
> > released.
> >
> > Waiting without the lock isn't all that hard: we just need to pass the
> > -EBUSY up the call chain so we can sleep where we don't hold the lock.
> > Error reporting is a bit trickier: we need to copy the name of the
> > unfinished module before releasing the lock.
>
> Who's returning -EBUSY? request_module()? If so, are you requiring
> that all code which might call request_module() be correctly
> propagating error codes back? Please spell this all out?

Sorry if I was unclear. It's all inside module.c. Here's the updated
commentry:

If another module is waiting inside resolve_symbol() for libcrc32c to
finish initializing (ie. bne2 depends on libcrc32c) then it does so
holding the module lock, and our request_module() can't make progress
until that is released.

Waiting inside resolve_symbol() without the lock isn't all that hard:
we just need to pass the -EBUSY up the call chain so we can sleep
where we don't hold the lock. Error reporting is a bit trickier: we
need to copy the name of the unfinished module before releasing the
lock.

Hope that helps,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Tue, 1 Jun 2010 05:45:37 am Linus Torvalds wrote:
>
> basically, we should do the whole module dependency list regardless
> of whether we can unload modules or not

Why?

Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Wed, 2 Jun 2010 11:40:29 am Brandon Philips wrote:
> On 16:51 Tue 01 Jun 2010, Linus Torvalds wrote:
> > On Tue, 1 Jun 2010, Brandon Philips wrote:
> > > When I tested a Kernel with Rusty's modules branch pulled onto
> > > 2.6.35-rc1 I got duplicate sysfs path errors:
> > Hmm. Yeah, the module_mutex used to be held across the whole "find -> add"
> > state, but isn't any more.
>
> Right.
>
> > > To fix this we need to make sure that we only have one copy of a
> > > module going through load_module at a time. Patch suggestion
> > > follows which boots without errors. I am sure there is a better
> > > way to do what it does ;)
> >
> > I think Rusty may have made the lock a bit _too_ finegrained there,
> > and didn't add it to some places that needed it. It looks, for
> > example, like PATCH 1/2 actually drops the lock in places where it's
> > needed ("find_module()" is documented to need it, but now
> > load_module() didn't hold it at all when it did the find_module()).
>
> Right, I noticed that too and held the lock in the patch I sent.
>
> > Rather than adding a new "module_loading" list, I think we should be
> > able to just use the existing "modules" list, and just fix up the
> > locking a bit.
> >
> > In fact, maybe we could just move the "look up existing module" a
> > bit later - optimistically assuming that the module doesn't exist,
> > and then just undoing the work if it turns out that we were wrong,
> > just before adding ourselves to the list.
> >
> > A patch something like the appended (TOTALLY UNTESTED!)
>
> FWIW, I tried this same idea initially and it breaks because the
> kobject EEXIST is coming from mod_sysfs_init() which happens further
> up in load_module() before the list_add_rcu().

I'll take a look. Linus was right that my lock reduction was overzealous.

The kobject error should be harmless.

I have a simplified version of your problem (see testing patch); I'll
see what I can fix...

Thanks!
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Wed, 2 Jun 2010 02:26:52 pm Linus Torvalds wrote:
>
> On Wed, 2 Jun 2010, Rusty Russell wrote:
> >
> > OK, this might be worthwhile. Take the very first mail you sent:
> >
> > Hmm. That does seem to be buggy. We can't just drop and re-take the lock:
> > that may make sense internally as far as resolve_symbol() itself is
> > concerned, but the caller will its own local variables, and some of those
> > will no longer be valid if the lock was dropped.
> >
> > The implication here is that that I don't know locking, and that this was
> > done without thought or care. In fact, I did worry about it and decided it
> > was less risky than a locking reduction given my limited cycles (indeed,
> > it has been slightly).
>
> Well, part of the context here is that the commit had been bisected as
> being buggy. It turns out the bug was a different issue, but at the same
> time, it very much looked like the locking was simply known a-priori to be
> buggy. No?

Hmm, you're still missing it. Let me try again.

You weren't the best person to make the call. That didn't occur to you, did
it?

Subsystem maintainers aren't just patch monkeys, we collect all the metadata
which informs these decisions. Without that, you wandered in, came to the
obvious conclusion, and were wrong.

Really grating is the arrogant lack of respect for that knowledge combined
with your multiple mistakes since then due to your lack of it. See below.

> > That commit also changes the return value semantics of "use_module()",
> > which is an exported interface where the only users seem to be
> > out-of-kernel (the only in-kernel use is in kernel/module.c itself). That
> > seems like a really really bad idea too.
> >
> > Again with the implication that this was done without consideration.
>
> No. The implication was that it's wrong to do and a bad idea. Anything
> else is just you reading things into it.
>
> It's an exported interface, and you changed it with no real reason.
> Whether you then talked to users or not is immaterial.
>
> You could easily have just changed the _internal_ thing, and exported the
> unchanged interface.

I *genuinely* thought our policy with out-of-tree code was to smother
out-of-tree code with absolute apathy so they'll get sick of riding the
churn.

Still, easy to avoid in this case.

> Even your second version is just very confused. It renames it, but then
> exports the renamed version. What does that help?

This *almost* looks like an honest "hey, I don't understand this".

But no, "very confused" already assumes you're right. You do ask if it
helps...

> It still means that the external module - for no good reason - needs to
> have basically a source-level version number check.

....and go on to assume it doesn't. I must be really confused, right?

Or maybe, I know something you don't about the code I maintain.

We have infrastructure to look up symbols dynamically: symbol_get().

Sure, they have to adapt their code, but now they can maintain their own
damn wrapper, *and* they'll discover the change as soon as they try to
build their module against the new kernel.

I'd go so far as to say it's clever...
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Wed, 2 Jun 2010 02:14:37 pm Linus Torvalds wrote:
> (That function _really_ should be peeled like an onion, and split into
> many smaller functions, so that we don't have ten error cases needing
> unwinding. I like "goto error", but at some point you can't see the
> unwinding any more, and that function has passed that point a long time
> ago, I think)

Agreed. Feel free to take a stab at it if you're bored. Last I tried,
there wasn't an obvious split point which actually reduced the size of the
function.

And here's the tree I'm testing now (it compiles, currently rebasing to
dodge some unrelated -rc1 fs/block weirdness):

The following changes since commit 67a3e12b05e055c0415c556a315a3d3eb637e29e:
Linus Torvalds (1):
Linux 2.6.35-rc1

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6 modules

Eric Dumazet (1):
module: module_unload_init() cleanup

Linus Torvalds (2):
module: Make the 'usage' lists be two-way
module: move find_module check to end

Rusty Russell (6):
module: fix reference to mod->percpu after freeing module.
module: fix kdb's illicit use of struct module_use.
module: move sysfs exposure to end of load_module
module: Make module sysfs functions private.
module: make locking more fine-grained.
module: fix bne2 "gave up waiting for init of module libcrc32c"

include/linux/module.h | 44 ++-----
kernel/debug/kdb/kdb_main.c | 12 +--
kernel/module.c | 325 ++++++++++++++++++++++++++++---------------
3 files changed, 226 insertions(+), 155 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/