From: Ignoramus29207 on
On 2010-08-05, Rahul <nospam(a)nospam.invalid> wrote:
> Ignoramus16841 <ignoramus16841(a)NOSPAM.16841.invalid> wrote in
> news:5_KdnW29vrfPisbRnZ2dnUVZ_rmdnZ2d(a)giganews.com:
>
>> Could be some sort of timing bug in the NFS server code?
>>>
>>
>> It could be anything, but I am more inclined to blame the client, as
>>
>> 1) Many clients work just fine
>> 2) Even my bad desktop works fine when it connects after a reboot
>>
>
> Does it matter if it is a soft or hard mount?
>
> I'm just guessing here. I've a long mount option string on my server:
>
> eustorage:/opt /opt nfs rw,nodev,noatime,nfsvers=3,timeo=110,retrans=50,hard,intr,proto=udp,rsize=32768,wsize=32768 0 0
>
> But I'd be hard pressed to explain why I chose each option that I have in
> there. :)
>
>

I could explain some, but what is broken is something much more basic.

i
From: Rahul on
Ignoramus29207 <ignoramus29207(a)NOSPAM.29207.invalid> wrote in
news:2Oqdnb96RYTHvf3RnZ2dnUVZ_tOdnZ2d(a)giganews.com:

> [44743.592195] nfs: server myserver not responding, still trying
> [45103.592528] nfs: server myserver OK
> [45980.844190] nfs: server myserver not responding, still trying
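
(The bracketed numbers are seconds since boot, so the length of each outage
can be read straight off them. A rough Python sketch, assuming every message
looks like the three lines quoted above; the regex and the outages() helper
are my own invention, not part of any tool:)

```python
import re

# The three kernel messages quoted above, verbatim.
LOG = """\
[44743.592195] nfs: server myserver not responding, still trying
[45103.592528] nfs: server myserver OK
[45980.844190] nfs: server myserver not responding, still trying
"""

# Assumed message layout: "[seconds.micros] nfs: server NAME STATE..."
LINE = re.compile(r"\[\s*(\d+\.\d+)\] nfs: server (\S+) (not responding|OK)")


def outages(text):
    """Return (server, start, duration) tuples.

    duration is None when no matching "OK" line was seen.
    """
    down = {}      # server -> timestamp of the first "not responding"
    result = []
    for m in LINE.finditer(text):
        stamp, server, state = float(m.group(1)), m.group(2), m.group(3)
        if state == "not responding":
            down.setdefault(server, stamp)
        elif server in down:
            start = down.pop(server)
            result.append((server, start, stamp - start))
    # Anything left in `down` never recovered within this log excerpt.
    result.extend((server, start, None) for server, start in down.items())
    return result


for server, start, duration in outages(LOG):
    if duration is None:
        print(f"{server}: down since {start:.0f}s, no recovery logged")
    else:
        print(f"{server}: down at {start:.0f}s for {duration:.0f}s")
```

(So the first hang lasted about 360 seconds, i.e. six minutes.)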

Now that's interesting. I had the exact same problem on many machines a few
months ago. In that case we diagnosed it to be a problem with the Broadcom
drivers that caused them to hang under load.

http://bugs.centos.org/view.php?id=3832
http://lopsa.org/node/1836

The solution was to add this line to modprobe.conf:

options bnx2 disable_msi=1

Now it would be a stretch to imagine this is the same problem you are
facing, but I thought I'd throw it out there.

In the past I've also seen this same error when many clients mount the
same server and the number of NFS threads is too low. But if you see it
on only one client, I doubt this is your issue. Have you tried
cat /proc/net/rpc/nfsd on the server to see if the threads are all busy often?
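
In case it helps, here is a minimal Python sketch for reading the "th" line
of that file. It assumes the layout used by kernels of this vintage: thread
count, then the number of times all threads were busy at once, then a
ten-bucket usage histogram. Newer kernels report zeros there or drop the
line, so treat the field meanings as an assumption. The SAMPLE text is made
up for illustration, and parse_th() is just a name I picked.

```python
def parse_th(text):
    """Pull the 'th' line out of /proc/net/rpc/nfsd-style text.

    Returns (threads, all_busy_count, histogram) or None if there is
    no 'th' line. Field meanings are assumed, see the note above.
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "th":
            threads = int(fields[1])
            all_busy = int(fields[2])
            histogram = [float(x) for x in fields[3:]]
            return threads, all_busy, histogram
    return None


# Made-up sample, NOT output from a real server.
SAMPLE = (
    "rc 0 0 0\n"
    "th 8 594 3733.140 83.850 96.660 0.000 0.000 0.000 0.000 0.000 0.000 0.000\n"
)

parsed = parse_th(SAMPLE)
if parsed is not None:
    threads, all_busy, histogram = parsed
    print(f"{threads} threads, all busy {all_busy} times")
```

On the real server you would pass in open("/proc/net/rpc/nfsd").read()
instead of SAMPLE. If the second number keeps climbing between runs, the
server probably needs more nfsd threads.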

--
Rahul
From: J G Miller on
On Monday, August 9th, 2010 at 16:58:37h +0000, Rahul explained:
>
> I had the exact same problem on many machines a few months ago.
> In that case we diagnosed it to be a problem with the
> Broadcom drivers that caused them to hang under load.

So nothing to do with NFS per se, but the ethernet network card
kernel modules.

Presumably network services other than NFS were also affected by
intermittent outages?

Would monitoring with iftop be useful to see network problems
of this type?
From: Ignoramus27168 on
On 2010-08-09, J G Miller <miller(a)yoyo.ORG> wrote:
> On Monday, August 9th, 2010 at 16:58:37h +0000, Rahul explained:
>>
>> I had the exact same problem on many machines a few months ago.
>> In that case we diagnosed it to be a problem with the
>> Broadcom drivers that caused them to hang under load.
>
> So nothing to do with NFS per se, but the ethernet network card
> kernel modules.
>
> Presumably network services other than NFS were also affected by
> intermittent outages?
>
> Would monitoring with iftop be useful to see network problems
> of this type?

I am not sure what is happening. I know that the share itself is
having unspecified "issues", however, they only affect these two
machines in this way. I am working towards setting up linux-ha to
replace the old share.

I am also upgrading both PCs to Lucid.

i
From: Rahul on
Ignoramus27168 <ignoramus27168(a)NOSPAM.27168.invalid> wrote in
news:QOidnYig_aLm5PzRnZ2dnUVZ_hudnZ2d(a)giganews.com:

> I am not sure what is happening. I know that the share itself is
> having unspecified "issues", however, they only affect these two
> machines in this way. I am working towards setting up linux-ha to
> replace the old share.
>
> I am also upgrading both PCs to Lucid.

Did you manage to solve the problem? What was the fix?

--
Rahul