From: Ben Gamari
On Sat, 27 Mar 2010 18:20:37 -0700 (PDT), Ben Gamari <bgamari.foss@gmail.com> wrote:
> Hey all,
>
> I have posted another profile[1] from an incident yesterday. As you can see,
> both swapper and init (strange?) show up prominently in the profile. Moreover,
> most processes seem to be in blk_peek_request a disturbingly large percentage
> of the time. Both of these profiles were taken with 2.6.34-rc kernels.
>

Apparently my initial email announcing my first set of profiles never made it
out. Sorry for the confusion. I've included it below.


From: Ben Gamari <bgamari.foss@gmail.com>
Subject: Re: Poor interactive performance with I/O loads with fsync()ing
To: Ingo Molnar <mingo@elte.hu>, Nick Piggin <npiggin@suse.de>
Cc: tytso@mit.edu, linux-kernel@vger.kernel.org, Olly Betts
<olly@survex.com>, martin f krafft <madduck@madduck.net>
Bcc: bgamari@gmail.com
In-Reply-To: <20100317093704.GA17146@elte.hu>
References: <4b9fa440.12135e0a.7fc8.ffffe745@mx.google.com>
<20100317045350.GA2869@laptop> <20100317093704.GA17146@elte.hu>
On Wed, 17 Mar 2010 10:37:04 +0100, Ingo Molnar <mingo@elte.hu> wrote:
> A call-graph profile will show the precise reason for IO latencies, and their
> relative likelihood.

Well, here is something for now. I'm not sure how valid the reproduction
workload is (git pull, rsync, and 'notmuch new' all running at once; see the
sketch below), but I certainly did produce a few stalls, and swapper is highest
in the profile. This was on 2.6.34-rc2. I've included part of the profile
below, although a more complete set of data is available at [1].
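
For concreteness, the load was roughly the following (the repository and
paths here are stand-ins for the real ones):

  (cd ~/src/some-repo && git pull) &
  rsync -a ~/mail/ other-host:backup/mail/ &
  notmuch new &
  wait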

Thanks,

- Ben


[1] http://mw0.mooo.com/~ben/latency-2010-03-25-a/


# Samples: 25295
#
# Overhead Command Shared Object Symbol
# ........ ............... ................. ......
#
24.50% swapper [kernel.kallsyms] [k] blk_peek_request
|
--- blk_peek_request
scsi_request_fn
__blk_run_queue
|
|--98.32%-- blk_run_queue
| scsi_run_queue
| scsi_next_command
| scsi_io_completion
| scsi_finish_command
| scsi_softirq_done
| blk_done_softirq
| __do_softirq
| call_softirq
| do_softirq
| irq_exit
| |
| |--99.56%-- do_IRQ
| | ret_from_intr
| | |
| | |--98.02%-- cpuidle_idle_call
| | | cpu_idle
| | | rest_init
| | | start_kernel
| | | x86_64_start_reservations
| | | x86_64_start_kernel
| | |
| | |--0.91%-- clockevents_notify
| | | lapic_timer_state_broadcast
| | | |
| | | |--83.64%-- acpi_idle_enter_bm
| | | | cpuidle_idle_call
| | | | cpu_idle
| | | | rest_init
| | | | start_kernel
| | | | x86_64_start_reservations
| | | | x86_64_start_kernel
| | | |
| | | --16.36%-- acpi_idle_enter_simple
| | | cpuidle_idle_call
| | | cpu_idle
| | | rest_init
| | | start_kernel
| | | x86_64_start_reservations
| | | x86_64_start_kernel
| | |
| | |--0.81%-- cpu_idle
| | | rest_init
| | | start_kernel
| | | x86_64_start_reservations
| | | x86_64_start_kernel
| | --0.26%-- [...]
| --0.44%-- [...]
|
--1.68%-- elv_completed_request
__blk_put_request
blk_finish_request
blk_end_bidi_request
blk_end_request
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
do_IRQ
ret_from_intr
|
|--96.15%-- cpuidle_idle_call
| cpu_idle
| rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
|
|--1.92%-- cpu_idle
| rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
|
|--0.96%-- schedule
| cpu_idle
| rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
|
--0.96%-- clockevents_notify
lapic_timer_state_broadcast
acpi_idle_enter_bm
cpuidle_idle_call
cpu_idle
rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

23.74% init [kernel.kallsyms] [k] blk_peek_request
|
--- blk_peek_request
scsi_request_fn
__blk_run_queue
|
|--98.77%-- blk_run_queue
| scsi_run_queue
| scsi_next_command
| scsi_io_completion
| scsi_finish_command
| scsi_softirq_done
| blk_done_softirq
| __do_softirq
| call_softirq
| do_softirq
| irq_exit
| |
| |--99.87%-- do_IRQ
| | ret_from_intr
| | |
| | |--98.38%-- cpuidle_idle_call
| | | cpu_idle
| | | start_secondary
| | |
| | |--0.81%-- schedule
| | | cpu_idle
| | | start_secondary
| | |
| | |--0.56%-- cpu_idle
| | | start_secondary
| | --0.25%-- [...]
| --0.13%-- [...]
|
--1.23%-- elv_completed_request
__blk_put_request
blk_finish_request
blk_end_bidi_request
blk_end_request
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
do_IRQ
ret_from_intr
cpuidle_idle_call
cpu_idle
start_secondary

5.85% chromium-browse [kernel.kallsyms] [k] blk_peek_request
|
--- blk_peek_request
scsi_request_fn
__blk_run_queue
blk_run_queue
scsi_run_queue
scsi_next_command
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
do_IRQ
ret_from_intr
|
|--50.00%-- check_match.8653
|
--50.00%-- unlink_anon_vmas
free_pgtables
exit_mmap
mmput
exit_mm
do_exit
do_group_exit
sys_exit_group
system_call
....
From: Arjan van de Ven
On Sat, 27 Mar 2010 18:20:37 -0700 (PDT)
Ben Gamari <bgamari.foss@gmail.com> wrote:

> Hey all,
>
> I have posted another profile[1] from an incident yesterday. As you
> can see, both swapper and init (strange?) show up prominently in the
> profile. Moreover, most processes seem to be in blk_peek_request a
> disturbingly large percentage of the time. Both of these profiles
> were taken with 2.6.34-rc kernels.
>
> Anyone have any ideas on how to proceed? Is more profile data
> necessary? Are the existing profiles at all useful? Thanks,


profiles tend to be about CPU usage... and are rather poor at dealing with
anything I/O related.

latencytop might get closer to giving useful information....

(btw, a general suggestion: make sure you're using noatime or relatime as a
mount option)
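
e.g. an /etc/fstab entry along these lines (the device, mount point and
filesystem type here are just placeholders):

  /dev/sda1  /  ext4  relatime  0  1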
--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
From: Ben Gamari
On Sat, 27 Mar 2010 20:42:33 -0700, Arjan van de Ven <arjan@infradead.org> wrote:
> On Sat, 27 Mar 2010 18:20:37 -0700 (PDT)
> Ben Gamari <bgamari.foss@gmail.com> wrote:
>
> > Hey all,
> >
> > I have posted another profile[1] from an incident yesterday. As you
> > can see, both swapper and init (strange?) show up prominently in the
> > profile. Moreover, most processes seem to be in blk_peek_request a
> > disturbingly large percentage of the time.
> >
I suppose this statement was a tad misleading. The provided profiles were taken
with:

perf record -f -g -a -e block:block_rq_issue -c 1

which I believe measures block requests issued, not CPU usage (correct me if
I'm wrong).
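
The call graphs in the linked data then come out of the recorded perf.data
with something like (from memory, so treat this as a sketch):

  perf report -g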

> profiles tend to be about CPU usage... and are rather poor at dealing with
> anything I/O related.
>
See above.

> latencytop might get closer to giving useful information....
>
Latencytop generally shows a large amount of time spent handling page faults.

> (btw, a general suggestion: make sure you're using noatime or relatime as a
> mount option)

Thanks for the suggestion. I had actually forgotten relatime in my fstab, so
we'll see if there's any improvement now. That being said, I/O loads over small
numbers of files (e.g. xapian) are just as bad as loads over large numbers of
files. To me that weakly suggests that atime updates aren't the issue (though I
could be horribly wrong).

- Ben

From: Andi Kleen
Ben Gamari <bgamari.foss@gmail.com> writes:

You don't say which file system you use, but ext3 and the file systems with
a similar journal design (like reiserfs) all have known fsync starvation
issues. The problem is that any fsync has to wait for all outstanding
transactions to commit, and this can take a long time depending on how busy
the disk is.
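
You can see the effect with something as crude as this (a rough sketch;
assumes GNU dd, whose conv=fsync does an fsync() before dd exits):

  # keep the disk and the journal busy with a big streaming write
  dd if=/dev/zero of=background.tmp bs=1M count=2048 &

  # meanwhile, time small fsync()ed writes; on ext3 each one can
  # stall until the whole outstanding transaction has committed
  for i in $(seq 1 20); do
      time dd if=/dev/zero of=probe.tmp bs=4k count=1 conv=fsync 2>/dev/null
      sleep 1
  done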

ext4/XFS/JFS/btrfs should be better in this regard.

-Andi
--
ak@linux.intel.com -- Speaking for myself only.
From: Pawel S
2010/3/23 Jens Axboe <jens.axboe@oracle.com>:

> It's also been my sneaking suspicion that swap is involved. I have lots
> of RAM in everything I use, even the laptop and workstation. I'll try and
> run some tests with lower memory and force it into swap; I've seen nasty
> hangs that way.

I am not sure that memory swapping is the cause. According to the KDE
System Monitor, swap is not even touched when copying files. I also
noticed similar responsiveness problems when extracting some rar files
(twelve parts, each about 100MB): the system becomes unresponsive for some
time within the first seconds of the operation, then behaves normally, and
then the problem comes back. I have 2GB of RAM, I am using the ext4 file
system, and I am using the noatime mount option.

P.S. It seems to be a little better when I have AHCI mode set in the BIOS
(at least when extracting archives).

P.S.2 I would be glad to provide some useful data. I created a perf chart
(see below), but if this is not enough, please tell me what I should do
next.
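
Assuming the chart came from perf's timechart tool, the recording would
have been something like:

  perf timechart record   # run during the unresponsive period, ^C to stop
  perf timechart          # renders the recording to output.svg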

Regards

Pawel