From: Rusty Russell on
On Fri, 28 May 2010 07:27:28 am Rafael J. Wysocki wrote:
> On Thursday 27 May 2010, Rusty Russell wrote:
> > On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > > bug. Does this fix it?
> > >
> > > Since the commit has been reverted, do you still want me to test this patch?
> > > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > > of current -git.
> >
> > OK, combo meal deal below, against Linus' latest. I'd really appreciate
> > a report, since AFAIK you're the only one hitting it, and only when that
> > other (now reverted) patch was applied.
>
> I cannot reproduce the crash with the patch below.
>
> Thanks,
> Rafael

Great, thanks for testing! Though the patch which uncovered this has been
reverted, it could still bite someone.

Linus, I'll send the fix separately, then rework the other fix with clearer
locking and more love for the kprobes guys...

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Fri, 28 May 2010 04:16:02 am Brandon Philips wrote:
> On 14:56 Thu 27 May 2010, Rusty Russell wrote:
> > On Thu, 27 May 2010 08:26:25 am Rafael J. Wysocki wrote:
> > > On Wednesday 26 May 2010, Rusty Russell wrote:
> > > > I suspect that the increased parallelism enabled by this patch uncovered this
> > > > bug. Does this fix it?
> > >
> > > Since the commit has been reverted, do you still want me to test this patch?
> > > Quite frankly I'd prefer to test a complete replacement for that commit on top
> > > of current -git.
> >
> > OK, combo meal deal below, against Linus' latest. I'd really appreciate
> > a report, since AFAIK you're the only one hitting it, and only when that
> > other (now reverted) patch was applied.
>
> I tested this patch on my machine on top of Linus's latest and it
> fixes the issue. Without the patch and using Linus's latest I
> reproduce the original issue:
>
> [ 60.836022] bnx2: gave up waiting for init of module libcrc32c.
> [ 60.847997] bnx2: Unknown symbol crc32c
>
> Note: Again, since I don't have bnx2x hardware I forced bnx2.ko to
> depend on libcrc32c as bnx2x does:
> http://ifup.org/~philips/review/bnx2-hack-to-use-libcrc32c.patch
>
> > As an side to Brandon: I can see how my patch fixed an explicit
> > request_module inside module_init (that's how I tested it). I can't
> > see how we have a problem with an implicit dependency such as
> > bne2x->crc32. Modules go into the live state without retaking the
> > lock.
>
> libcrc32c is doing an explicit request_module inside of its
> module_init. Follow the call chain in libcrc32c_mod_init()

Thanks for confirmation, I figured that must be the case as I pondered
it on the weekend after sending my query.

Linus didn't like dropping the lock, so I'll create a more ambitious patch
which reduces the lock coverage to those places which really need it.

Cheers,
Rusty.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/