From: Rick Jones on
Linux bonding does offer the prospect of doing round-robin scheduling
of packets across the links in the bond, but that comes at a price -
packet reordering. Get "too much" of that and you start to get
spurious retransmissions and clamping of the congestion window.
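
To make the reordering concrete, here is a toy model, not the bonding
driver's actual code. It assumes two links with made-up one-way delays,
one packet sent per time unit, and no serialization time. The function
names and delay values are purely illustrative.

```python
def arrival_order(num_packets, link_delays, pick_link):
    """Return sequence numbers in the order they reach the receiver.

    link_delays: one-way delay per link (arbitrary time units).
    pick_link:   function mapping a sequence number to a link index.
    One packet is sent per time unit; serialization time is ignored.
    """
    arrivals = []
    for seq in range(num_packets):
        link = pick_link(seq)
        arrivals.append((seq + link_delays[link], seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_out_of_order(order):
    """Count packets that arrive after a packet with a higher sequence number."""
    highest_seen = -1
    late = 0
    for seq in order:
        if seq < highest_seen:
            late += 1
        else:
            highest_seen = seq
    return late


delays = [1.0, 3.5]  # hypothetical: link 1 is slower than link 0

round_robin = arrival_order(8, delays, lambda seq: seq % 2)
single_link = arrival_order(8, delays, lambda seq: 0)

print(round_robin, count_out_of_order(round_robin))
print(single_link, count_out_of_order(single_link))
```

With per-packet round-robin the receiver sees sequence numbers out of
order as soon as the links differ in delay. Keeping a flow on one link
preserves order but caps that flow at one link's bandwidth.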

Also, the packet scheduling algorithms in the bonding code are for
transmit only. A backup server is ostensibly a recv-mostly sort of
thing. The scheduling of packets for inbound to the backup server
would be determined by the algorithms in the switch to which it was
connected.

You might also sniff the wire to see what sort of window sizes are
being used. Also, check the CPU util of _each_ CPU on the server.
I'm assuming your filesystem/whatnot can take-in data >> 60 MB/s? Does
the side sending the data to the server report any TCP
retransmissions?
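
If the sending side is a Linux box, one way to answer the retransmission
question is to read the Tcp counters in /proc/net/snmp. The sketch below
assumes that file's usual layout, a "Tcp:" header line followed by a
"Tcp:" value line. The sample text is made up for illustration, not
measured data.

```python
def tcp_retransmit_ratio(snmp_text):
    """Return RetransSegs / OutSegs from the text of /proc/net/snmp."""
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) < 2:
        raise ValueError("expected a Tcp: header line and a Tcp: value line")
    names, values = tcp_lines[0][1:], tcp_lines[1][1:]
    counters = dict(zip(names, (int(v) for v in values)))
    if counters["OutSegs"] == 0:
        return 0.0
    return counters["RetransSegs"] / counters["OutSegs"]


# Made-up sample in the /proc/net/snmp layout, not measured data.
SAMPLE = """\
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 1 200 120000 -1 120 45 3 2 8 500000 1000000 2500 0 17
"""

print(f"retransmitted fraction: {tcp_retransmit_ratio(SAMPLE):.4%}")
```

On a real system, pass in open("/proc/net/snmp").read() instead of the
sample. The counters are cumulative since boot, so take one reading
before the backup and one after and compare the differences.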

rick jones
--
The glass is neither half-empty nor half-full. The glass has a leak.
The real question is "Can it be patched?"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: Peer-Joachim Koch on
Hi,

thanks for all the answers.

To give a more general overview:

We are running a GFS (StorNEXT) as our "normal" file system.
One server also works as the TSM client. 8 file systems
are defined, each able to deliver ~60-100 MB/s on average.
The file space is ~60 TB with 22 million files.

2 dedicated NICs (Intel e1000) are used in a dedicated VLAN
and configured using bonding without any further settings.

This link is connected to our TSM server (AIX, 8 cores), which is
also connected to this VLAN using a dedicated network interface.
At the moment we do not have much traffic on the file system, so
I cannot measure much.

What I am currently trying, after reading all the posts, is to
configure the client to use not just one task (and stream), but
2-4 streams. When everything is working correctly, it *might*
split 2 streams onto each NIC ....
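
As a back-of-the-envelope check on that "might": the sketch below assumes
each stream is pinned to one of the two NICs independently and with equal
probability, roughly what a per-flow hash would do. That is an assumption
for illustration, not the bonding driver's actual selection logic, and
the function name is made up.

```python
from math import comb


def two_link_split(streams):
    """Map k -> probability that exactly k of `streams` land on the first link.

    Assumes each stream independently picks one of two links with p = 1/2.
    """
    total = 2 ** streams
    return {k: comb(streams, k) / total for k in range(streams + 1)}


for n in (2, 4):
    dist = two_link_split(n)
    even = dist[n // 2]
    one_link_only = dist[0] + dist[n]
    print(f"{n} streams: even split {even:.3f}, all on one link {one_link_only:.3f}")
```

Under that assumption an even split is far from guaranteed, but more
streams make it less likely that one NIC carries everything.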

However, tuning this parameter on the client cut the
backup time by nearly a factor of 2!

But I need more traffic to see whether the usage of the NICs has improved.

Maybe we have to try LANfree backup ...

Bye, Peer

From: Dan Stromberg on
On Mon, 21 Apr 2008 10:26:54 +0200, Peer-Joachim Koch wrote:

> Hi,
>
> thanks for all the answers.
>
> To give a more general overview:
>
> We are running a GFS (StorNEXT) as our "normal" file system. One server
> also works as the TSM client. 8 file systems are defined, each able to
> deliver ~60-100 MB/s on average. The file space is ~60 TB with 22 million
> files.
>
> 2 dedicated NICs (Intel e1000) are used in a dedicated VLAN and
> configured using bonding without any further settings.
>
> This link is connected to our TSM server (AIX, 8 cores), which is also
> connected to this VLAN using a dedicated network interface. At the
> moment we do not have much traffic on the file system, so I cannot
> measure much.
>
> What I am currently trying, after reading all the posts, is to configure
> the client to use not just one task (and stream), but 2-4 streams.
> When everything is working correctly, it *might* split 2 streams onto
> each NIC ....
>
> However, tuning this parameter on the client cut the backup time by
> nearly a factor of 2!
>
> But I need more traffic to see whether the usage of the NICs has improved.
>
> Maybe we have to try LANfree backup ...
>
> Bye, Peer

1) Enable jumbo frames on your gigabit NICs and all network equipment
between them, if you haven't already:
http://stromberg.dnsalias.org/~strombrg/jumbo.html
Path MTU Discovery would probably be a good idea too.

2) Use a protocol that will allow you to use large block sizes to reduce
the CPU needs:
http://stromberg.dnsalias.org/~dstromberg/protocol-comparison.html
If you must use OpenSSH, patch it for performance. Also, Blowfish tends
to be a good encryption algorithm for performance, despite what Schneier
says about it not being sufficiently vetted yet (he's the author of the
algorithm, and I haven't read Crypto-Gram in a while, so maybe he feels
it's solid by now).

3) Use rsync to reduce the amount of data you're pushing again and again:
http://stromberg.dnsalias.org/~strombrg/Backup.remote.html It'll turn
your series of fullsaves and incrementals into what appears to be one
fullsave and many incrementals from the perspective of network
performance and disk use (except the inode use will be high).
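
For point 1, a rough calculation of what jumbo frames buy on a 1 Gbit/s
link. It assumes standard Ethernet framing overhead and IPv4 + TCP
headers without options, and it ignores VLAN tags. The constants and
function name are illustrative only.

```python
LINK_BYTES_PER_SEC = 1_000_000_000 // 8   # 1 Gbit/s
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12       # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20                  # IPv4 + TCP, no options (assumption)


def frame_stats(mtu):
    """Return (frames per second at line rate, TCP payload bytes per second)."""
    wire_bytes = mtu + ETHERNET_OVERHEAD
    frames_per_sec = LINK_BYTES_PER_SEC / wire_bytes
    payload_per_sec = frames_per_sec * (mtu - IP_TCP_HEADERS)
    return frames_per_sec, payload_per_sec


for mtu in (1500, 9000):
    fps, payload = frame_stats(mtu)
    print(f"MTU {mtu}: {fps:,.0f} frames/s, {payload / 1e6:.1f} MB/s of TCP payload")
```

Under these assumptions the payload ceiling only rises by a few percent.
The bigger effect is the frame rate, which drops from roughly 81,000 to
roughly 14,000 frames per second at line rate, so the hosts have far
fewer packets to process.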

From: Rick Jones on
Dan Stromberg <dstromberglists(a)gmail.com> wrote:

> 1) Enable jumbo frames on your gigabit NIC's and all network
> equipment between them, if you haven't already.
> http://stromberg.dnsalias.org/~strombrg/jumbo.html

Keeping in mind there are still switches out there (older ones at
least) which do not support JF, and since JF is not a de jure
standard, there can be NICs (and switches) for which the definition of
"JumboFrame" is something other than the 9000 byte MTU "de facto"
standard initiated (IIRC) by Alteon.

If the switch does not support JF, enabling JF on the NICs will result
in odd losses of connectivity, with stuff like telnet/ssh _mostly_
working and stuff like FTP and perhaps HTTP not working well at all.

Even if the switch supports JF, unless one enables it across the
_entire_ broadcast domain, UDP traffic for stuff like NFS can be
fubared - the NIC with JF support will fragment to JF sizes, which
will arrive at the NIC without JF and be dropped. Don't assume that
just because something like an FTP or netperf TCP_STREAM works that
all is OK - the TCP MSS exchange at the beginning of a TCP connection
will result in the smaller MSS being used, masking the MTU mismatch.
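
A small sketch of that masking effect, assuming IPv4, no TCP options,
and no Path MTU Discovery for the UDP case. The helper names are made up
for illustration. The 32796-byte datagram stands for a 32 KB payload
plus 8 bytes of UDP header and 20 bytes of IP header.

```python
IPV4_HEADER = 20
TCP_HEADER = 20  # no options (assumption)


def negotiated_mss(mtu_a, mtu_b):
    """Each side advertises MTU minus headers; the smaller value limits both."""
    return min(mtu_a, mtu_b) - IPV4_HEADER - TCP_HEADER


def largest_fragment(ip_datagram_size, sender_mtu):
    """Size of the biggest IP packet the sender puts on the wire."""
    if ip_datagram_size <= sender_mtu:
        return ip_datagram_size
    # Fragment payloads (except the last) must be a multiple of 8 bytes.
    payload = (sender_mtu - IPV4_HEADER) // 8 * 8
    return IPV4_HEADER + payload


def udp_datagram_delivered(ip_datagram_size, sender_mtu, receiver_mtu):
    """True if every fragment fits within what the receiver accepts."""
    return largest_fragment(ip_datagram_size, sender_mtu) <= receiver_mtu


print(negotiated_mss(9000, 1500))                  # TCP quietly uses the small MSS
print(udp_datagram_delivered(32796, 9000, 1500))   # JF sender, non-JF receiver
print(udp_datagram_delivered(32796, 1500, 9000))   # non-JF sender, JF receiver
```

The TCP connection settles on segments that fit the smaller MTU, so a
stream test looks healthy, while the large UDP datagram from the JF side
is cut into fragments the other NIC cannot accept.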

I _like_ JF, but it isn't a panacea. Also, as more and more NICs
support LRO (Large Receive Offload) in addition to TSO (TCP
Segmentation Offload), the benefit of JF is reduced.

rick jones
--
web2.0 n, the dot.com reunion tour...