From: Hitoshi Mitake on

Hi Arnaldo,

I suffered a mysterious floating point exception when executing perf
on current tip/master.

For example,

% sudo ./perf lock record ./perf bench sched messaging
# Running sched/messaging benchmark...
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

Total time: 1.375 [sec]
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 160.002 MB perf.data (~6990599 samples) ]
[1] 26348 floating point exception sudo ./perf lock record ./perf
bench sched messaging

dmesg says that

[10212.752490] perf[26348] trap divide error ip:7fade8564310
sp:7fff46d926e8 error:0 in libnewt.so.0.52.2[7fade8556000+15000]

Example is of perf lock, but this is not a special phenomenon of perf lock,
other users of perf record, like perf sched, caused the same problem.

I found that this might be caused by the line 283 of perf.c

if (fclose(stdout)) # <- closing stdout!
die("close failed on standard output: %s", strerror(errno));

I don't know well about libnewt, do you know something like cause?

The version of libnewt I'm using is 0.52.2 (default of Debian lenny)

Thanks,
Hitoshi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arnaldo Carvalho de Melo on
Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
> I suffered a mysterious floating point exception when executing perf
> on current tip/master.
>
> For example,
>
> % sudo ./perf lock record ./perf bench sched messaging
> # Running sched/messaging benchmark...
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 1.375 [sec]
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 160.002 MB perf.data (~6990599 samples) ]
> [1] 26348 floating point exception sudo ./perf lock record ./perf
> bench sched messaging
>
> dmesg says that
>
> [10212.752490] perf[26348] trap divide error ip:7fade8564310
> sp:7fff46d926e8 error:0 in libnewt.so.0.52.2[7fade8556000+15000]
>
> Example is of perf lock, but this is not a special phenomenon of perf lock,
> other users of perf record, like perf sched, caused the same problem.
>
> I found that this might be caused by the line 283 of perf.c
>
> if (fclose(stdout)) # <- closing stdout!
> die("close failed on standard output: %s", strerror(errno));
>
> I don't know well about libnewt, do you know something like cause?
>
> The version of libnewt I'm using is 0.52.2 (default of Debian lenny)

I just built a kernel where 'perf lock' can work, but I'm seeing a
different problem:

[root(a)doppio ~]# perf lock trace | head -3
Warning: Error: expected 'data' but read 'overwrite'
swapper-0 [000] 619.086424: lock_acquire: 0xffff8800029d4b28 &q->lock
swapper-0 [000] 619.086429: lock_acquire: 0xffff8800029d08f8 &ctx->lock
swapper-0 [000] 619.086439: lock_acquire: 0xffffffff81a38e58 clockevents_lock
[root(a)doppio ~]# perf lock report | head -3
Warning: Error: expected 'data' but read 'overwrite'
ID Name acquired contended total wait (ns) max wait (ns) min wait (ns)

0xffff8800029d4b28 &q->lock 0 0 0 0 0
[root(a)doppio ~]#

My system details:

[root(a)doppio ~]# uname -a
Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5 13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root(a)doppio ~]# rpm -q newt
newt-0.52.10-4.fc11.x86_64
[root(a)doppio ~]#

No segfaults here :-\

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Hitoshi Mitake on
On 04/06/10 03:31, Arnaldo Carvalho de Melo wrote:
> Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
>> I suffered a mysterious floating point exception when executing perf
>> on current tip/master.
>>
>> For example,
>>
>> % sudo ./perf lock record ./perf bench sched messaging
>> # Running sched/messaging benchmark...
>> # 20 sender and receiver processes per group
>> # 10 groups == 400 processes run
>>
>> Total time: 1.375 [sec]
>> [ perf record: Woken up 0 times to write data ]
>> [ perf record: Captured and wrote 160.002 MB perf.data (~6990599
samples) ]
>> [1] 26348 floating point exception sudo ./perf lock record ./perf
>> bench sched messaging
>>
>> dmesg says that
>>
>> [10212.752490] perf[26348] trap divide error ip:7fade8564310
>> sp:7fff46d926e8 error:0 in libnewt.so.0.52.2[7fade8556000+15000]
>>
>> Example is of perf lock, but this is not a special phenomenon of
perf lock,
>> other users of perf record, like perf sched, caused the same problem.
>>
>> I found that this might be caused by the line 283 of perf.c
>>
>> if (fclose(stdout)) #<- closing stdout!
>> die("close failed on standard output: %s", strerror(errno));
>>
>> I don't know well about libnewt, do you know something like cause?
>>
>> The version of libnewt I'm using is 0.52.2 (default of Debian lenny)
>
> I just built a kernel where 'perf lock' can work, but I'm seeing a
> different problem:
>
> [root(a)doppio ~]# perf lock trace | head -3
> Warning: Error: expected 'data' but read 'overwrite'
> swapper-0 [000] 619.086424: lock_acquire:
0xffff8800029d4b28&q->lock
> swapper-0 [000] 619.086429: lock_acquire:
0xffff8800029d08f8&ctx->lock
> swapper-0 [000] 619.086439: lock_acquire:
0xffffffff81a38e58 clockevents_lock
> [root(a)doppio ~]# perf lock report | head -3
> Warning: Error: expected 'data' but read 'overwrite'
> ID Name acquired contended total
wait (ns) max wait (ns) min wait (ns)
>
> 0xffff8800029d4b28&q->lock 0 0 0
0 0
> [root(a)doppio ~]#
>
> My system details:
>
> [root(a)doppio ~]# uname -a
> Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5
13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
> [root(a)doppio ~]# rpm -q newt
> newt-0.52.10-4.fc11.x86_64
> [root(a)doppio ~]#
>
> No segfaults here :-\
>
> - Arnaldo
>

I tested perf lock trace and got a similar error (but different).
% sudo ./perf lock trace
Fatal: reading input file (size expected=3 received=0)
It seems that perf.data is broken :( I'll search the cause.

Hitoshi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arnaldo Carvalho de Melo on
Em Wed, Apr 07, 2010 at 05:32:24PM +0900, Hitoshi Mitake escreveu:
> On 04/06/10 03:31, Arnaldo Carvalho de Melo wrote:
> > Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
> > [root(a)doppio ~]# perf lock report | head -3
> > Warning: Error: expected 'data' but read 'overwrite'
> > ID Name acquired contended
> total wait (ns) max wait (ns) min wait (ns)
> >
> > 0xffff8800029d4b28&q->lock 0 0 0
> 0 0
> > [root(a)doppio ~]#
> >
> > My system details:
> >
> > [root(a)doppio ~]# uname -a
> > Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5
> 13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
> > [root(a)doppio ~]# rpm -q newt
> > newt-0.52.10-4.fc11.x86_64
> > [root(a)doppio ~]#
> >
> > No segfaults here :-\
> >
> > - Arnaldo
> >
>
> I tested perf lock trace and got a similar error (but different).
> % sudo ./perf lock trace
> Fatal: reading input file (size expected=3 received=0)
> It seems that perf.data is broken :( I'll search the cause.

Take a look at:

http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=summary

"perf" branch, there is at least one fix there, the "overwrite" one.

- Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Hitoshi Mitake on
On 04/07/10 19:38, Arnaldo Carvalho de Melo wrote:
> Em Wed, Apr 07, 2010 at 05:32:24PM +0900, Hitoshi Mitake escreveu:
>> On 04/06/10 03:31, Arnaldo Carvalho de Melo wrote:
>>> Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
>>> [root(a)doppio ~]# perf lock report | head -3
>>> Warning: Error: expected 'data' but read 'overwrite'
>>> ID Name acquired contended
>> total wait (ns) max wait (ns) min wait (ns)
>>>
>>> 0xffff8800029d4b28&q->lock 0 0 0
>> 0 0
>>> [root(a)doppio ~]#
>>>
>>> My system details:
>>>
>>> [root(a)doppio ~]# uname -a
>>> Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5
>> 13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
>>> [root(a)doppio ~]# rpm -q newt
>>> newt-0.52.10-4.fc11.x86_64
>>> [root(a)doppio ~]#
>>>
>>> No segfaults here :-\
>>>
>>> - Arnaldo
>>>
>>
>> I tested perf lock trace and got a similar error (but different).
>> % sudo ./perf lock trace
>> Fatal: reading input file (size expected=3 received=0)
>> It seems that perf.data is broken :( I'll search the cause.
>
> Take a look at:
>
> http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=summary
>
> "perf" branch, there is at least one fix there, the "overwrite" one.
>

Thanks for your notify!
I applied changes on your branch, but my problem was not solved...

I found that the member "offset" of struct perf_file_header is broken.
I made this easy change,

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 6c9aa16..dbea087 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -571,6 +571,7 @@ int perf_header__process_sections(struct perf_header
*self, int fd,
if (perf_header__getbuffer64(self, fd, feat_sec, sec_size))
goto out_free;

+ printf("feat_sec->offset:%llu\n", feat_sec->offset);
err = 0;
while (idx < nr_sections && feat < HEADER_LAST_FEATURE) {
if (perf_header__has_feat(self, feat)) {

And the result is this:

% sudo ./perf lock report
Fatal: reading input file (size expected=3 received=0)
feat_sec->offset:18014402804449281 # <- Wow! :(

It seems that header of perf.data is broken...
But it may be a cause of this problem, I'll try to fix it.

Thanks,
Hitoshi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/