From: Ted Ts'o on
On Thu, Jul 22, 2010 at 01:49:57AM -0400, Christoph Hellwig wrote:
>
> I think ext4 is simply using an incorrectly typed tracepoint here.
> If you want it to be useful in any way it needs a sb paramter and
> an optional inode paramter, not the allocation context.

I agree; this is the patch that I had whipped up to fix the problem.
(See below)

> Also the whole ext4_mb_release_group_pa function seems to be a bit
> misdesigned. The code using ac is a totally separate block at the
> end of the function and does work that's unrelated to the rest
> of the function. Just making it a separate helper can calling it
> only from those places that have the allocation context would make
> the code more clear.

I need to look more closely at this. If I had time there would be a
lot of things that I'd be refactoring and cleaning up in mballoc.c....

- Ted


From 52f9a0d80ccdb1b23e364221167bb55b2886cc18 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso(a)mit.edu>
Date: Thu, 22 Jul 2010 21:09:44 -0400
Subject: [PATCH] ext4: fix potential NULL dereference while tracing

The allocation_context pointer can be NULL.

Signed-off-by: "Theodore Ts'o" <tytso(a)mit.edu>
---
fs/ext4/mballoc.c | 4 ++--
include/trace/events/ext4.h | 20 ++++++++++++--------
2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 3dfad95..8b3b934 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3575,7 +3575,7 @@ ext4_mb_release_inode_pa(struct ext4_buddy *e4b, struct buffer_head *bitmap_bh,
trace_ext4_mballoc_discard(ac);
}

- trace_ext4_mb_release_inode_pa(ac, pa, grp_blk_start + bit,
+ trace_ext4_mb_release_inode_pa(sb, ac, pa, grp_blk_start + bit,
next - bit);
mb_free_blocks(pa->pa_inode, e4b, bit, next - bit);
bit = next + 1;
@@ -3606,7 +3606,7 @@ ext4_mb_release_group_pa(struct ext4_buddy *e4b,
ext4_group_t group;
ext4_grpblk_t bit;

- trace_ext4_mb_release_group_pa(ac, pa);
+ trace_ext4_mb_release_group_pa(sb, ac, pa);
BUG_ON(pa->pa_deleted == 0);
ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, &bit);
BUG_ON(group != e4b->bd_group && pa->pa_len != 0);
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index f3865c7..01e9e00 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -395,11 +395,12 @@ DEFINE_EVENT(ext4__mb_new_pa, ext4_mb_new_group_pa,
);

TRACE_EVENT(ext4_mb_release_inode_pa,
- TP_PROTO(struct ext4_allocation_context *ac,
+ TP_PROTO(struct super_block *sb,
+ struct ext4_allocation_context *ac,
struct ext4_prealloc_space *pa,
unsigned long long block, unsigned int count),

- TP_ARGS(ac, pa, block, count),
+ TP_ARGS(sb, ac, pa, block, count),

TP_STRUCT__entry(
__field( dev_t, dev )
@@ -410,8 +411,9 @@ TRACE_EVENT(ext4_mb_release_inode_pa,
),

TP_fast_assign(
- __entry->dev = ac->ac_sb->s_dev;
- __entry->ino = ac->ac_inode->i_ino;
+ __entry->dev = sb->s_dev;
+ __entry->ino = (ac && ac->ac_inode) ?
+ ac->ac_inode->i_ino : 0;
__entry->block = block;
__entry->count = count;
),
@@ -422,10 +424,11 @@ TRACE_EVENT(ext4_mb_release_inode_pa,
);

TRACE_EVENT(ext4_mb_release_group_pa,
- TP_PROTO(struct ext4_allocation_context *ac,
+ TP_PROTO(struct super_block *sb,
+ struct ext4_allocation_context *ac,
struct ext4_prealloc_space *pa),

- TP_ARGS(ac, pa),
+ TP_ARGS(sb, ac, pa),

TP_STRUCT__entry(
__field( dev_t, dev )
@@ -436,8 +439,9 @@ TRACE_EVENT(ext4_mb_release_group_pa,
),

TP_fast_assign(
- __entry->dev = ac->ac_sb->s_dev;
- __entry->ino = ac->ac_inode->i_ino;
+ __entry->dev = sb->s_dev;
+ __entry->ino = (ac && ac->ac_inode) ?
+ ac->ac_inode->i_ino : 0;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
),
--
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KOSAKI Motohiro on
> On Thu, Jul 22, 2010 at 01:49:57AM -0400, Christoph Hellwig wrote:
> >
> > I think ext4 is simply using an incorrectly typed tracepoint here.
> > If you want it to be useful in any way it needs a sb paramter and
> > an optional inode paramter, not the allocation context.
>
> I agree; this is the patch that I had whipped up to fix the problem.
> (See below)

I'm not familiar ext4 so much. but you patch seems very nice!
thank you :)

>
> > Also the whole ext4_mb_release_group_pa function seems to be a bit
> > misdesigned. The code using ac is a totally separate block at the
> > end of the function and does work that's unrelated to the rest
> > of the function. Just making it a separate helper can calling it
> > only from those places that have the allocation context would make
> > the code more clear.
>
> I need to look more closely at this. If I had time there would be a
> lot of things that I'd be refactoring and cleaning up in mballoc.c....


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Theodore Tso on

On Jul 23, 2010, at 1:47 AM, KOSAKI Motohiro wrote:

>
> I'm not familiar ext4 so much. but you patch seems very nice!
> thank you :)

So, can someone confirm that this fixes the NULL pointer dereference? The patch itself makes sense, and it doesn't change the tracepoint output, so I'm inclined to include it, but I haven't been able to reproduce the failure myself using the reproduction recipe given at the beginning of this thread --- and I'd feel much better if someone who has been able to reproduce the crash can confirm this fixes things for them.

Thanks,

-- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Li Zefan on
Theodore Tso wrote:
> On Jul 23, 2010, at 1:47 AM, KOSAKI Motohiro wrote:
>
>> I'm not familiar ext4 so much. but you patch seems very nice!
>> thank you :)
>
> So, can someone confirm that this fixes the NULL pointer dereference? The patch itself makes sense, and it doesn't change the tracepoint output, so I'm inclined to include it, but I haven't been able to reproduce the failure myself using the reproduction recipe given at the beginning of this thread --- and I'd feel much better if someone who has been able to reproduce the crash can confirm this fixes things for them.
>

Thanks for the fix. I'll test the patch after the weekend.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Li Zefan on
Theodore Tso wrote:
> On Jul 23, 2010, at 1:47 AM, KOSAKI Motohiro wrote:
>
>> I'm not familiar ext4 so much. but you patch seems very nice!
>> thank you :)
>
> So, can someone confirm that this fixes the NULL pointer dereference? The patch itself makes sense, and it doesn't change the tracepoint output, so I'm inclined to include it, but I haven't been able to reproduce the failure myself using the reproduction recipe given at the beginning of this thread --- and I'd feel much better if someone who has been able to reproduce the crash can confirm this fixes things for them.
>

The patch works!

Tested-by: Li Zefan <lizf(a)cn.fujitsu.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/