From: Daniel J Blueman
On Wed, Mar 31, 2010 at 3:21 PM, Chuck Lever <chuck.lever(a)oracle.com> wrote:
> On 03/31/2010 07:20 AM, Daniel J Blueman wrote:
>>
>> Talking of expensive, I see latencytop show >16000ms latency for
>> writing pages when I have a workload that does large buffered I/O to
>> an otherwise uncongested server. The gigabit network is saturated, and
>> reads often stall for 1000-4000ms (!). Client has the default 16 TCP
>> request slots, and server has 8 nfsds - the server is far from disk or
>> processor-saturated. I'll see if there is any useful debugging I can
>> get about this.
>
> That latency is pretty much guaranteed to be due to a long RPC backlog queue
> on the client. Bumping the size of the slot table to 128 and increasing the
> number of NFSD threads may help.

Increasing these values did help quite a bit, though I was still
seeing 5000-8000ms at nfs_wait_bit_uninterruptible() [1] and close().
Then again, I also was seeing 'Scheduler waiting for cpu' taking up to
3000ms! I suspect processor throttling, due to exceeding thermal
limits.
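
For anyone wanting to check the same two knobs, here is a rough sketch of
reading and setting them from Python. The procfs paths are my assumption of
where these tunables usually live on Linux (sunrpc module loaded on the
client, nfsd running on the server); they may be absent or differ on other
kernels. The helper names are made up for illustration, and the thread count
of 32 is an arbitrary example, not a recommendation. As far as I know the
slot table size only applies to transports created after the change, so it
would need setting before the mount.

```python
from pathlib import Path

# Assumed locations; not guaranteed to exist on every kernel or configuration.
SLOT_TABLE = Path("/proc/sys/sunrpc/tcp_slot_table_entries")  # NFS client
NFSD_THREADS = Path("/proc/fs/nfsd/threads")                  # NFS server


def read_tunable(path):
    """Return the integer stored in a procfs tunable, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def write_tunable(path, value):
    """Write an integer to a procfs tunable (normally requires root)."""
    Path(path).write_text(f"{value}\n")


if __name__ == "__main__":
    print("tcp_slot_table_entries:", read_tunable(SLOT_TABLE))
    print("nfsd threads:", read_tunable(NFSD_THREADS))
    # To apply the values discussed above, run as root and uncomment:
    # write_tunable(SLOT_TABLE, 128)    # on the client, before mounting
    # write_tunable(NFSD_THREADS, 32)   # on the server; 32 is only an example
```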

Thanks,
Daniel

--- [1]

nfs_wait_bit_uninterruptible
nfs_wait_on_request
nfs_wait_in_requests_locked
nfs_sync_mapping_wait
nfs_write_mapping
nfs_wb_nocommit
nfs_getattr
vfs_getattr
vfs_fstat
sys_newfstat
system_call_fastpath
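
The trace above shows fstat() going through nfs_getattr() and then waiting on
outstanding writes. A minimal sketch of how that wait could be measured from
userspace follows; it does a large buffered write and then times a single
fstat() on the same descriptor. The function name and the example path are
hypothetical, and whether the stall shows up depends on the kernel, mount
options and how much dirty data is still queued.

```python
import os
import time


def time_fstat_after_write(path, total_bytes, chunk=1 << 20):
    """Write total_bytes of zeros with buffered I/O, then time one fstat()."""
    buf = b"\0" * chunk
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(buf[:n])
            remaining -= n
        f.flush()  # move Python's buffer into the kernel page cache
        start = time.monotonic()
        st = os.fstat(f.fileno())  # on NFS this may wait for pending writes
        elapsed = time.monotonic() - start
    return elapsed, st.st_size


if __name__ == "__main__":
    # Hypothetical path; point it at a file on the NFS mount under test.
    target = os.environ.get("FSTAT_TEST_FILE")
    if target:
        secs, size = time_fstat_after_write(target, 256 << 20)
        print(f"fstat took {secs * 1000:.1f} ms for {size} bytes written")
```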
--
Daniel J Blueman