From: KOSAKI Motohiro on
Hi Steven,

> create file: file_108.dat
> # sync
> (panic)
>
>
> Seems ac->ac_inode can be NULL:
>
> DECLARE_EVENT_CLASS(ext4__mballoc,
> ...
> TP_fast_assign(
> __entry->dev = ac->ac_inode->i_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> ...
> ),
> ...
> );

Can you teach us proper tracepint writing way?


ext4_mb_release_group_pa() has a concern when ac is NULL.

============================================================
static noinline_for_stack int
ext4_mb_release_group_pa(struct ext4_buddy *e4b,
struct ext4_prealloc_space *pa,
struct ext4_allocation_context *ac)
{
struct super_block *sb = e4b->bd_sb;
ext4_group_t group;
ext4_grpblk_t bit;

trace_ext4_mb_release_group_pa(ac, pa);
BUG_ON(pa->pa_deleted == 0);
ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, &bit);
BUG_ON(group != e4b->bd_group && pa->pa_len != 0);
mb_free_blocks(pa->pa_inode, e4b, bit, pa->pa_len);
atomic_add(pa->pa_len, &EXT4_SB(sb)->s_mb_discarded);

if (ac) { // here
ac->ac_sb = sb;
ac->ac_inode = NULL;
ac->ac_b_ex.fe_group = group;
ac->ac_b_ex.fe_start = bit;
ac->ac_b_ex.fe_len = pa->pa_len;
ac->ac_b_ex.fe_logical = 0;
trace_ext4_mballoc_discard(ac);
}

return 0;
}
===================================================

but trace_ext4_mb_release_group_pa() doesn't care it.
=====================================================
TRACE_EVENT(ext4_mb_release_group_pa,
TP_PROTO(struct ext4_allocation_context *ac,
struct ext4_prealloc_space *pa),

TP_ARGS(ac, pa),

TP_STRUCT__entry(
__field( dev_t, dev )
__field( ino_t, ino )
__field( __u64, pa_pstart )
__field( __u32, pa_len )

),

TP_fast_assign(
__entry->dev = ac->ac_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
),

TP_printk("dev %s pstart %llu len %u",
jbd2_dev_to_name(__entry->dev), __entry->pa_pstart, __entry->pa_len)
);
=====================================================

So, adding following branch should fix this issue.

if (ac)
trace_ext4_mb_release_group_pa(ac, pa);

But, I don't think this is proper fix because we don't want any overhead
if the tracepoint is disabled.

So, How do we check NULL in TP_fast_assign()?

Thanks.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steven Rostedt on
On Wed, 2010-07-21 at 22:31 +0900, KOSAKI Motohiro wrote:
> Hi Steven,

> if (ac)
> trace_ext4_mb_release_group_pa(ac, pa);
>
> But, I don't think this is proper fix because we don't want any overhead
> if the tracepoint is disabled.
>
> So, How do we check NULL in TP_fast_assign()?

You could do:

TP_fast_assign(
if (ac) {
__entry->dev = ac->ac_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
}
),

But this just makes the __entry null and wastes the ring buffer.

I may be able to add a __discard_entry that may help. Then we could do
something like this:

if (ac) {
__entry->dev = ac->ac_sb->s_dev;
__entry->ino = ac->ac_inode->i_ino;
__entry->pa_pstart = pa->pa_pstart;
__entry->pa_len = pa->pa_len;
} else
__discard_entry;

Does this seem reasonable?

But for now, the wasting the entry seems to be the only choice we have,
or to do as you suggested and have the "if (ac) trace_...", but I don't
like that.

-- Steve



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on
On Wed, Jul 21, 2010 at 10:16:06AM -0400, Steven Rostedt wrote:
> On Wed, 2010-07-21 at 22:31 +0900, KOSAKI Motohiro wrote:
> > Hi Steven,
>
> > if (ac)
> > trace_ext4_mb_release_group_pa(ac, pa);
> >
> > But, I don't think this is proper fix because we don't want any overhead
> > if the tracepoint is disabled.
> >
> > So, How do we check NULL in TP_fast_assign()?
>
> You could do:
>
> TP_fast_assign(
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> }
> ),
>
> But this just makes the __entry null and wastes the ring buffer.
>
> I may be able to add a __discard_entry that may help. Then we could do
> something like this:
>
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> } else
> __discard_entry;
>
> Does this seem reasonable?
>
> But for now, the wasting the entry seems to be the only choice we have,
> or to do as you suggested and have the "if (ac) trace_...", but I don't
> like that.
>
> -- Steve


Is there no already existing branch in ext4 you could reuse in order to
send the trace only if (ac) ?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Li Zefan on
Steven Rostedt wrote:
> On Wed, 2010-07-21 at 22:31 +0900, KOSAKI Motohiro wrote:
>> Hi Steven,
>
>> if (ac)
>> trace_ext4_mb_release_group_pa(ac, pa);
>>
>> But, I don't think this is proper fix because we don't want any overhead
>> if the tracepoint is disabled.
>>
>> So, How do we check NULL in TP_fast_assign()?
>
> You could do:
>
> TP_fast_assign(
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> }

This leaves __entry->dev etc as arbitrary value, since the entry returned
by the ring buffer is not zeroed, so I think better add an else branch to
zero those values.

> ),
>
> But this just makes the __entry null and wastes the ring buffer.
>
> I may be able to add a __discard_entry that may help. Then we could do
> something like this:
>
> if (ac) {
> __entry->dev = ac->ac_sb->s_dev;
> __entry->ino = ac->ac_inode->i_ino;
> __entry->pa_pstart = pa->pa_pstart;
> __entry->pa_len = pa->pa_len;
> } else
> __discard_entry;
>
> Does this seem reasonable?
>
> But for now, the wasting the entry seems to be the only choice we have,
> or to do as you suggested and have the "if (ac) trace_...", but I don't
> like that.
>

As I was (and still am) not sure what is the best fix, I decided
to send out a bug report instead of a patch..
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Hellwig on
On Wed, Jul 21, 2010 at 10:31:20PM +0900, KOSAKI Motohiro wrote:
> But, I don't think this is proper fix because we don't want any overhead
> if the tracepoint is disabled.
>
> So, How do we check NULL in TP_fast_assign()?

I think ext4 is simply using an incorrectly typed tracepoint here.
If you want it to be useful in any way it needs a sb paramter and
an optional inode paramter, not the allocation context.

Also the whole ext4_mb_release_group_pa function seems to be a bit
misdesigned. The code using ac is a totally separate block at the
end of the function and does work that's unrelated to the rest
of the function. Just making it a separate helper can calling it
only from those places that have the allocation context would make
the code more clear.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/