From: pk on
Rick Jones wrote:

>> $ netperf -H 1x.x.x.x -p 5002 -t UDP_STREAM -- -P 5003 -m 1460 -s 1M -S 1M
>> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 5003 AF_INET
>> to 1x.x.x.x (1x.x.x.x) port 5003 AF_INET : demo
>> Socket  Message  Elapsed      Messages
>> Size    Size     Time         Okay Errors   Throughput
>> bytes   bytes    secs            #      #   10^6bits/sec
>
>> 2097152    1460   10.00     1352768      0    1580.01
>> 2097152            10.00       65165             76.11
>
>> Ok, now I'm puzzled. Surely 76 Mbit/sec looks like quite a lot to me.
>> The bandwidth purchased from the colo at the slowest of the two sites
>> (the one running the above test) should be around 10 Mbit/sec... I'm
>> not sure how to interpret those results.
>
> The top line of numbers is the number of perceived-successful
> sendto() calls, multiplied by the bytes for each sendto(), divided by
> the test time (along with unit conversion to 10^6 bits per second -
> aka megabits). The second line is what the receiver reported
> receiving over the same interval.
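
(Working that out for the first run above: 1352768 messages x 1460
bytes x 8 bits / 10.00 s comes to roughly 1580 x 10^6 bits/s on the
sending side, and 65165 x 1460 x 8 / 10.00 s to roughly 76.1 x 10^6
bits/s on the receiving side, matching the reported figures.)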

Well, then...what can I say? Cool! There is more bandwidth than we'd ever
dream of. On the other hand, this makes the TCP performance I measured
earlier look even more ridiculous.

>> In a single isolated instance, I even got
>
>> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 5003 AF_INET to
>> 1x.x.x.x (1x.x.x.x) port 5003 AF_INET : demo
>> Socket  Message  Elapsed      Messages
>> Size    Size     Time         Okay Errors   Throughput
>> bytes   bytes    secs            #      #   10^6bits/sec
>
>> 2097152    1460   10.00     1399679      0    1634.82
>> 2097152            10.00      384976            449.65
>
>> which doesn't make any sense to me. Except for this last one, tests
>> in both directions show figures around 75/79 Mbit/sec, like the first
>> one above.
>
> There may be some "anomalies" in the way the colo does bandwidth
> throttling. And there being bandwidth throttling at/by the colo
> tosses in a whole new set of potential issues with the single-stream
> TCP performance...

I'm starting to think that they may be doing QoS on TCP only (e.g. RED
and the like). I haven't got an answer from the colos to this specific
question (yet). I've also run burst ping tests at a 0.01 s interval
(100 packets/sec) with almost no packet loss (6 or 7 in 50000 packets,
which is within the norm at those speeds in my experience).
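
(For reference, a burst ping along those lines can be run with
something like the following; the target address is a placeholder, and
intervals below 0.2 s generally require root on Linux:)

# 50000 echo requests at 10 ms spacing, quiet output with a summary
$ sudo ping -i 0.01 -c 50000 -q 192.0.2.1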

I know that scp isn't a good test, but it's useful because it shows the
speed while it's copying. The oscillating speed I see would be
consistent with my hypothesis, as would the fact that multiple TCP
connections each get the same bandwidth, so it's possible to get a
bigger cumulative throughput.
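
(One way to check the cumulative-throughput idea directly is to run
several netperf TCP_STREAM instances in parallel against the same
netserver and add up the results; a rough sketch, with the host and
instance count as placeholders:)

# four concurrent 30-second TCP_STREAM tests against the same netserver
for i in 1 2 3 4; do
    netperf -H 192.0.2.1 -t TCP_STREAM -l 30 &
done
wait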

> One further thing you could do is add a global -F <filename> where
> <filename> is a file with uncompressible data. I suppose there is a
> small possibility there is something doing data compression, and that
> would be a way to get around it.
>
> If your colo agreement has total bytes transferred
> limits/levels/charges, do be careful running netperf tests. It
> wouldn't do to have a "free" benchmark cause a big colo bill...

Fortunately that is not the case!
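
(As an aside on the -F suggestion above: an uncompressible fill file
can be generated from /dev/urandom and handed to netperf roughly like
this; the file name and size are only placeholders:)

$ dd if=/dev/urandom of=/tmp/fill.bin bs=1M count=4
$ netperf -H 192.0.2.1 -t UDP_STREAM -F /tmp/fill.bin -- -m 1460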

Thanks for your help.
From: Rick Jones on
pk <pk(a)pk.invalid> wrote:
> Well, then...what can I say? Cool! There is more bandwidth than we'd
> ever dream of. On the other hand, this makes the TCP performance I
> measured earlier look even more ridiculous.

Never underestimate the power of lost traffic. UDP - or more
accurately the netperf UDP_STREAM test - doesn't give a rodent's
backside about dropped packets. TCP, on the other hand, cares quite a
lot about dropped traffic.
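
(Comparing the two lines of the first UDP run quoted earlier: only
65165 of the 1352768 datagrams sent were reported received, so roughly
95% of the traffic was dropped somewhere along the path.)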

> I'm starting to think that they may be doing QoS on TCP only
> (e.g. RED and the like). I haven't got an answer from the colos to
> this specific question (yet). I've also run burst ping tests at a
> 0.01 s interval (100 packets/sec) with almost no packet loss (6 or 7
> in 50000 packets, which is within the norm at those speeds in my
> experience).

If they do the traffic shaping by dropping TCP segments rather than by
introducing delays, that may indeed have a plusungood effect on
senders.

> I know that scp isn't a good test,

When one's workload is file transfer, it is an excellent test. It
just adds more complexity when it comes to trying to figure out what
is going on :)

> but it's useful because it shows the speed while it's copying. The
> oscillating speed I see would be consistent with my hypothesis, as
> would the fact that multiple TCP connections each get the same
> bandwidth, so it's possible to get a bigger cumulative throughput.

In the context of netperf, configure with ./configure --enable-demo

and then add a global -D <time> option; netperf will emit "interim"
results while the test is running:

raj(a)raj-8510w:~/netperf2_trunk$ src/netperf -H lart.fc.hp.com -D 1
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lart.fc.hp.com (15.11.146.31) port 0 AF_INET : demo
Interim result: 143.37 10^6bits/s over 1.04 seconds
Interim result: 207.10 10^6bits/s over 1.06 seconds
Interim result: 214.06 10^6bits/s over 1.01 seconds
Interim result: 200.16 10^6bits/s over 1.07 seconds
Interim result: 202.81 10^6bits/s over 1.01 seconds
Interim result: 214.76 10^6bits/s over 1.02 seconds
Interim result: 210.60 10^6bits/s over 1.04 seconds
Interim result: 214.46 10^6bits/s over 1.03 seconds
Interim result: 220.77 10^6bits/s over 1.01 seconds
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.12     200.71

The interval will be approximate. You can use fractions of a second -
e.g. -D 0.250 and so on.

If you want to find the limit of queueing, in the context of netperf,
configure with ./configure --enable-intervals

and then add the global -w <time> -b <burstsize> options; netperf will
make <burstsize> send calls every <time> units.

raj(a)raj-8510w:~/netperf2_trunk$ src/netperf -H lart.fc.hp.com -w 1s -b 10 -t U>
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lart.fc.hp.com (15.11.146.31) port 0 AF_INET : spin interval
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

112640    1024   10.00         103      0       0.08
111616            10.00         103              0.08

If you are not concerned about CPU consumption on the sender, add
--enable-spin to the configure options and netperf will sit and spin
to achieve the <time> interval rather than depending on the
granularity of the interval timer.
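
(A minimal sketch of that build, assuming a netperf source tree is
already checked out:)

$ ./configure --enable-intervals --enable-spin
$ make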

happy benchmarking,

rick jones
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...