From: Thomas Renninger on
Hi,

I can confirm that this patch fixes the issue for me.

On Thursday 08 July 2010 23:53:00 Andrew Morton wrote:
> On Thu, 8 Jul 2010 17:39:28 -0400
> Jason Baron <jbaron(a)redhat.com> wrote:
>
> > Make sure we properly call ddebug_remove_module() when a module fails to
> > load. In addition, pass the pointer to the "debug table", to both
> > ddebug_add_module(), and ddebug_remove_module() so that we can uniquely
> > identify each set of debug statements. In this way even modules with the
> > same name can be properly identified and removed.
> >
> >
> > Signed-off-by: Jason Baron <jbaron(a)redhat.com>
>
> It'd be nice to track the Reported-by:s. And the Tested-by:s if/when
> they arrive. SighIllDoIt.
>
> The patch (almost) applies to 2.6.34. So are we missing a Cc:stable tag
> as well?
I'll resubmit with some more meta info and will include stable(a)kernel.org.

Could it be that this isn't a regression, but a bug that was always present,
but only gets exposed if you add modules with a specific implementation,
e.g. specific declarations of functions missing, etc.?

I tried to patch this into a 2.6.32.X kernel. While some hunks did not
succeed, it looks like an adjusted patch should get submitted for older
stable kernels as well?:

/dev/shm/linux-2.6.32> patch --dry-run -p1 -i ../linux-2.6.35-rc3/dynamic_debug_broken.patch
patching file include/linux/dynamic_debug.h
Hunk #2 succeeded at 76 with fuzz 2 (offset 3 lines).
patching file include/linux/module.h
Hunk #1 succeeded at 376 (offset -11 lines).
patching file kernel/module.c
Hunk #1 FAILED at 787.
Hunk #2 succeeded at 1596 with fuzz 2 (offset 47 lines).
Hunk #3 succeeded at 2098 (offset 44 lines).
Hunk #4 succeeded at 2548 (offset 62 lines).
Hunk #5 succeeded at 2638 with fuzz 2 (offset 73 lines).
1 out of 5 hunks FAILED -- saving rejects to file kernel/module.c.rej
patching file lib/dynamic_debug.c
Hunk #1 succeeded at 691 (offset -1 lines).
Hunk #2 succeeded at 702 (offset -1 lines).

Thanks Jason for this quick fix!

Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jason Baron on
On Fri, Jul 09, 2010 at 01:03:08PM +0200, Thomas Renninger wrote:
> Hi,
>
> I can confirm that this patch fixes the issue for me.
>
> On Thursday 08 July 2010 23:53:00 Andrew Morton wrote:
> > On Thu, 8 Jul 2010 17:39:28 -0400
> > Jason Baron <jbaron(a)redhat.com> wrote:
> >
> > > Make sure we properly call ddebug_remove_module() when a module fails to
> > > load. In addition, pass the pointer to the "debug table", to both
> > > ddebug_add_module(), and ddebug_remove_module() so that we can uniquely
> > > identify each set of debug statements. In this way even modules with the
> > > same name can be properly identified and removed.
> > >
> > >
> > > Signed-off-by: Jason Baron <jbaron(a)redhat.com>
> >
> > It'd be nice to track the Reported-by:s. And the Tested-by:s if/when
> > they arrive. SighIllDoIt.
> >
> > The patch (almost) applies to 2.6.34. So are we missing a Cc:stable tag
> > as well?
> I'll resubmit with some more meta info and will include stable(a)kernel.org.
>
> Could it be that this isn't a regression, but a bug that was always present,
> but only gets exposed if you add modules with a specific implementation,
> e.g. specific declarations of functions missing, etc.?
>

Hi Thomas,

yes, this race has likely been present for a while (i'd have to look at
specific kernel versions to verify). I suspect its getting exposed now
due to more usage of this feature, and the proliferation of kernel
modules...

> I tried to patch this into a 2.6.32.X kernel. While some hunks did not
> succeed, it looks like an adjusted patch should get submitted for older
> stable kernels as well?:
>

if nobody else has done the 2.6.32 stable patch, I can do it, just let
me know.

thanks again for reporting this to me.

thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Thomas Renninger on
Hi,

it's this one:
commit ff49d74ad383f54041378144ca1a229ee9aeaa59
Author: Yehuda Sadeh <yehuda(a)hq.newdream.net>
Date: Sat Jul 3 13:07:35 2010 +1000

which touches same code than Jason's fix does.
Possibly this patch also addresses (only parts of?) this problem?
Jason: Do you mind having a look at the latest git version and review
Yehuda's and adjust your patch if still necessary.
If Yehuda's patch is fixing this already, we still need it backported for
2.6.34 stable kernel and further?

One question about dynamic debug (unrelated to the mem corruption
issue):
Would it make sense to initialize dynamic debug earlier, somewhen shortly
after __setup is called.
Then a boot param ddebug_enable="xy" could be added.
The param could be in /sys/../control format or just be "all"?
My idea is to be able to track all the pr_debug calls (as) early (as possible)
at boot up.
One example is ec.c. Currently it is not possible to see the pr_debug messages
when EC accesses are done when the ACPI interpreter is started, there
is no userspace and no sysfs yet.
Same for PCI related pr_debug messages at early PCI(e) initialization time?
Would that be possible or do I miss something?

Thanks,

Thomas

On Friday 09 July 2010 03:30:19 pm Jason Baron wrote:
> On Fri, Jul 09, 2010 at 01:03:08PM +0200, Thomas Renninger wrote:
> > Hi,
> >
> > I can confirm that this patch fixes the issue for me.
> >
> > On Thursday 08 July 2010 23:53:00 Andrew Morton wrote:
> > > On Thu, 8 Jul 2010 17:39:28 -0400
> > >
> > > Jason Baron <jbaron(a)redhat.com> wrote:
> > > > Make sure we properly call ddebug_remove_module() when a module fails
> > > > to load. In addition, pass the pointer to the "debug table", to both
> > > > ddebug_add_module(), and ddebug_remove_module() so that we can
> > > > uniquely identify each set of debug statements. In this way even
> > > > modules with the same name can be properly identified and removed.
> > > >
> > > >
> > > > Signed-off-by: Jason Baron <jbaron(a)redhat.com>
> > >
> > > It'd be nice to track the Reported-by:s. And the Tested-by:s if/when
> > > they arrive. SighIllDoIt.
> > >
> > > The patch (almost) applies to 2.6.34. So are we missing a Cc:stable
> > > tag as well?
> >
> > I'll resubmit with some more meta info and will include
> > stable(a)kernel.org.
> >
> > Could it be that this isn't a regression, but a bug that was always
> > present, but only gets exposed if you add modules with a specific
> > implementation, e.g. specific declarations of functions missing, etc.?
>
> Hi Thomas,
>
> yes, this race has likely been present for a while (i'd have to look at
> specific kernel versions to verify). I suspect its getting exposed now
> due to more usage of this feature, and the proliferation of kernel
> modules...
>
> > I tried to patch this into a 2.6.32.X kernel. While some hunks did not
> > succeed, it looks like an adjusted patch should get submitted for older
> > stable kernels as well?:
>
> if nobody else has done the 2.6.32 stable patch, I can do it, just let
> me know.
>
> thanks again for reporting this to me.
>
> thanks,
>
> -Jason


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jason Baron on
On Mon, Jul 12, 2010 at 04:24:16PM +0200, Thomas Renninger wrote:
> Hi,
>
> it's this one:
> commit ff49d74ad383f54041378144ca1a229ee9aeaa59
> Author: Yehuda Sadeh <yehuda(a)hq.newdream.net>
> Date: Sat Jul 3 13:07:35 2010 +1000
>
> which touches same code than Jason's fix does.
> Possibly this patch also addresses (only parts of?) this problem?
> Jason: Do you mind having a look at the latest git version and review
> Yehuda's and adjust your patch if still necessary.

So Yehuda's patch is meant to address this issue. I would still move the
remove code to free_module(), for the case that load_module() fails as
follows (on top of Yehuda's patch):

Signed-off-by: Jason Baron <jbaron(a)redhat.com>
---
kernel/module.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 5d2d281..6c56282 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -787,7 +787,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,

/* Store the name of the last unloaded module for diagnostic purposes */
strlcpy(last_unloaded_module, mod->name, sizeof(last_unloaded_module));
- ddebug_remove_module(mod->name);

free_module(mod);
return 0;
@@ -1550,6 +1549,9 @@ static void free_module(struct module *mod)
remove_sect_attrs(mod);
mod_kobject_remove(mod);

+ /* Remove dynamic debug info */
+ ddebug_remove_module(mod->name);
+
/* Arch-specific cleanup. */
module_arch_cleanup(mod);

--
1.7.1

Also, the patch i wrote addressed the issue of duplicate "module" names.
Although you can't have duplicate names of loadable modules. Dynamic
debug also refers to built in vmlinux sections by using
"KBUILD_MODNAME". These are not necessarily globally unique. Thus, we
probably should be using a better globally unique identifier in the code.
However, that improvement can probably wait...

I will re-post the above as a separate thread. Also, I should probably
post a patch to add myself to the MAINTAINERS file, since I didn't see
Yehuda's patch until now...

> If Yehuda's patch is fixing this already, we still need it backported for
> 2.6.34 stable kernel and further?
>

Yes, patch might need to be adjusted a bit but is still relevant for
2.6.34 stable. The situation was exacerbated by:

commit 3bafeb6247042dcbb72b0141ec7c7107de9f0b99
Author: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Sat Jun 5 11:17:36 2010 -0600

module: move find_module check to end

during the 2.6.35 time frame.

> One question about dynamic debug (unrelated to the mem corruption
> issue):
> Would it make sense to initialize dynamic debug earlier, somewhen shortly
> after __setup is called.
> Then a boot param ddebug_enable="xy" could be added.
> The param could be in /sys/../control format or just be "all"?
> My idea is to be able to track all the pr_debug calls (as) early (as possible)
> at boot up.
> One example is ec.c. Currently it is not possible to see the pr_debug messages
> when EC accesses are done when the ACPI interpreter is started, there
> is no userspace and no sysfs yet.
> Same for PCI related pr_debug messages at early PCI(e) initialization time?
> Would that be possible or do I miss something?
>
> Thanks,
>
> Thomas

definitely a good idea, and something I just had not yet done
yet...after we resolve this issue, I'd be more than happy to review
and/or write a patch to do this. I'm glad ppl are interested in this :)

thanks,

-Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yehuda Sadeh on


On Mon, 12 Jul 2010, Jason Baron wrote:
....
> I will re-post the above as a separate thread. Also, I should probably
> post a patch to add myself to the MAINTAINERS file, since I didn't see
> Yehuda's patch until now...

Yeah, I wasn't sure who was responsible for this module. Wasn't clear
from the log.

>
>> If Yehuda's patch is fixing this already, we still need it backported for
>> 2.6.34 stable kernel and further?
>>
>
> Yes, patch might need to be adjusted a bit but is still relevant for
> 2.6.34 stable. The situation was exacerbated by:
>
> commit 3bafeb6247042dcbb72b0141ec7c7107de9f0b99
> Author: Linus Torvalds <torvalds(a)linux-foundation.org>
> Date: Sat Jun 5 11:17:36 2010 -0600
>
> module: move find_module check to end
>
> during the 2.6.35 time frame.
>

Right. Looking at the code, pre .35-rc2 loads the dynamic debug code after
verifying that the module is loadable. So it's a bit harder to hit
it, although it can still fail at the following module_finalize() or at
the parse_args(), so the problem is there.

Yehuda
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/