From: Bill Fink on
On Thu, 15 Jul 2010, Hagen Paul Pfeifer wrote:

> * David Miller | 2010-07-14 14:55:47 [-0700]:
>
> >Although section 3 of RFC 5681 is a great text, it does not say at all
> >that increasing the initial CWND would lead to fairness issues.
>
> Because it is only one side of the coin; conservatively probing the available
> link capacity in conjunction with n simultaneous probing TCP/SCTP/DCCP
> instances is the other.
>
> >To be honest, I think google's proposal holds a lot of weight. If
> >over time link sizes and speeds are increasing (they are) then nudging
> >the initial CWND every so often is a legitimate proposal. Were
> >someone to claim that utilization is lower than it could be because of
> >the currently specified initial CWND, I would have no problem
> >believing them.
> >
> >And I'm happy to make Linux use an increased value once it has
> >traction in the standardization community.
>
> Currently I know of no working approach, without active network feedback,
> for conservatively probing the available link capacity with a high CWND.
> I am curious about any future trends.

A long, long time ago, I suggested a Path BW Discovery mechanism
to the IETF, analogous to the Path MTU Discovery mechanism, but
it didn't get any traction. Such information could be extremely
useful to TCP endpoints, to determine a maximum window size to
use, effectively rate limiting a much stronger sender so that it
does not overpower a much weaker receiver (for example
10-GigE -> GigE), which otherwise results in abominable performance
across large RTT paths (as low as 12 Mbps), even in the absence of
any real network contention.
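
As a rough illustration of the "maximum window size" idea above, here is a
minimal Python sketch of the usual bandwidth-delay arithmetic. The function
names and the 80 ms RTT are assumptions made for the example, not part of
any proposal, and the sketch does not try to reproduce the 12 Mbps figure.

```python
def bdp_window_bytes(bottleneck_bps, rtt_s):
    """Window (bytes) that keeps a path of the given rate and RTT just full."""
    return bottleneck_bps / 8.0 * rtt_s


def window_limited_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8.0 / rtt_s


if __name__ == "__main__":
    rtt = 0.080   # assumed 80 ms round trip (hypothetical value)
    gige = 1e9    # receiver-side bottleneck in bits per second
    cap = bdp_window_bytes(gige, rtt)
    print(f"window cap for GigE at 80 ms: {cap / 1e6:.1f} MB")
    print(f"rate with that cap: {window_limited_bps(cap, rtt) / 1e6:.0f} Mbps")
```

A sender that knew the bottleneck rate could cap its window near this value
instead of bursting at 10-GigE line rate into a GigE receiver.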

-Bill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tom Herbert on
On Wed, Jul 14, 2010 at 1:39 PM, Hagen Paul Pfeifer <hagen(a)jauu.net> wrote:
> * Rick Jones | 2010-07-14 13:17:24 [-0700]:
>
>>There is an effort under way, led by some folks at Google and
>>including some others, to get the RFCs enhanced in support of the
>>concept of larger initial congestion windows. Some of the discussion
>>may be in the "tcpm" mailing list (assuming I've not gotten my
>>mailing lists confused). There may be some previous discussion of
>>that work in the netdev archives as well.
>
> tcpm is the right mailing list but there is currently no effort to develop
> this topic. Why? Because it is not a standardization issue, rather it is a
> technical issue. You cannot raise the initial CWND and expect fair behavior.
> This was discussed several times and is documented in several documents and
> RFCs.
>
> RFC 5681 Section 3.1. Google employees should start with Section 3. This topic
> pops up every two months in netdev and until now I have _never_ read a
> consolidated contribution.
>

There is an Internet draft
(http://datatracker.ietf.org/doc/draft-hkchu-tcpm-initcwnd/) on
raising the default Initial Congestion window to 10 segments, as well
as a SIGCOMM paper (http://ccr.sigcomm.org/online/?q=node/621). We
presented this proposal and data supporting it at the Anaheim IETF, and
will be following up in the Netherlands with more data, some of which
should further address fairness questions.

In terms of Linux implementation, setting the ICW (initial congestion
window) via ip route is sufficient support on the server side. There is
also a proposed patch which could allow applications to set the ICW
themselves (in hopes that applications can reduce the number of
simultaneous connections). On the client side we can now adjust the
receive window to advertise larger initial windows. Among current
implementations, Linux advertises the smallest default receive window of
the major OSes, so it turns out Linux clients won't currently get the
lower latency benefits (so we'll probably ask to raise the default some
day :-)).
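
To make the client-side point concrete, here is a minimal Python sketch of
why a small advertised receive window cancels out a larger server ICW. The
function name and all byte values are hypothetical, chosen only for
illustration; they are not measurements of any particular OS.

```python
def first_flight_bytes(initcwnd_segments, mss, advertised_rwnd_bytes):
    """Bytes the server may send in the first RTT.

    The sender is bound by both its own congestion window and the
    window the client advertised, so the smaller of the two wins.
    """
    return min(initcwnd_segments * mss, advertised_rwnd_bytes)


if __name__ == "__main__":
    mss = 1460  # assumed Ethernet-sized MSS
    # Hypothetical small client receive window: ICW=10 gives no benefit.
    print(first_flight_bytes(10, mss, 5840))
    # Hypothetical larger receive window: the full ICW is usable.
    print(first_flight_bytes(10, mss, 65535))
```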

Tom

> Partial local issues can already be "fixed" via route specific ip options -
> see initcwnd.
>
> HGN
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
From: H.K. Jerry Chu on
On Wed, Jul 14, 2010 at 11:15 AM, David Miller <davem(a)davemloft.net> wrote:
> From: Bill Davidsen <davidsen(a)tmr.com>
> Date: Wed, 14 Jul 2010 11:21:15 -0400
>
>> You may have to go into /proc/sys/net/core and crank up the
>> rmem_* settings, depending on your distribution.
>
> You should never, ever, have to touch the various networking sysctl
> values to get good performance in any normal setup. If you do, it's a
> bug, report it so we can fix it.

Agreed, except there are indeed bugs in the code today, in that the
code in various places assumes an initcwnd as per RFC 3390. So when
initcwnd is raised, the actual value may be limited unnecessarily by
the initial wmem/sk_sndbuf.
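
For reference, a minimal Python sketch of the RFC 3390 initial window
formula, plus a deliberately simplified model of a send buffer capping a
raised initcwnd. The second function and its buffer size are assumptions
for illustration only; real sk_sndbuf accounting includes per-packet
overhead and is not modeled here.

```python
def rfc3390_initial_window(mss):
    """Initial window in bytes per RFC 3390: min(4*MSS, max(2*MSS, 4380))."""
    return min(4 * mss, max(2 * mss, 4380))


def usable_initial_burst(initcwnd_segments, mss, sndbuf_bytes):
    """Simplified model: the burst cannot exceed what the send buffer holds."""
    return min(initcwnd_segments * mss, sndbuf_bytes)


if __name__ == "__main__":
    print(rfc3390_initial_window(1460))          # standard Ethernet MSS
    # Hypothetical 8 KB initial send buffer against initcwnd=10.
    print(usable_initial_burst(10, 1460, 8192))
```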

Will try to find time to submit a patch.

Jerry

>
> I cringe every time someone says to do this, so please do me a favor
> and don't spread this further. :-)
>
> For one thing, TCP dynamically adjusts the socket buffer sizes based
> upon the behavior of traffic on the connection.
>
> And the TCP memory limit sysctls (not the core socket ones) are sized
> based upon available memory. They are there to protect you from
> situations such as having so much memory dedicated to socket buffers
> that there is none left to do other things effectively. It's a
> protective limit, rather than a setting meant to increase or improve
> performance. So like the others, leave these alone too.
From: H.K. Jerry Chu on
On Wed, Jul 14, 2010 at 1:39 PM, Hagen Paul Pfeifer <hagen(a)jauu.net> wrote:
> * Rick Jones | 2010-07-14 13:17:24 [-0700]:
>
>>There is an effort under way, led by some folks at Google and
>>including some others, to get the RFCs enhanced in support of the
>>concept of larger initial congestion windows. Some of the discussion
>>may be in the "tcpm" mailing list (assuming I've not gotten my
>>mailing lists confused). There may be some previous discussion of
>>that work in the netdev archives as well.
>
> tcpm is the right mailing list but there is currently no effort to develop
> this topic. Why? Because it is not a standardization issue, rather it is a

Please don't mislead. Raising the initcwnd is actively being pursued at
the IETF right now. If not here, where else? It is following the same
path by which initcwnd was first raised in the late '90s through
RFC 2414/RFC 3390.

The IETF is not a standards organization just for protocol lawyers to
play word games. It is responsible for solving real technical issues as
well.

Jerry

> technical issue. You cannot raise the initial CWND and expect fair behavior.
> This was discussed several times and is documented in several documents and
> RFCs.
>
> RFC 5681 Section 3.1. Google employees should start with Section 3. This topic
> pops up every two months in netdev and until now I have _never_ read a
> consolidated contribution.
>
> Partial local issues can already be "fixed" via route specific ip options -
> see initcwnd.
>
> HGN
From: H.K. Jerry Chu on
On Wed, Jul 14, 2010 at 8:49 PM, Bill Fink <billfink(a)mindspring.com> wrote:
> On Thu, 15 Jul 2010, Hagen Paul Pfeifer wrote:
>
>> * David Miller | 2010-07-14 14:55:47 [-0700]:
>>
>> >Although section 3 of RFC 5681 is a great text, it does not say at all
>> >that increasing the initial CWND would lead to fairness issues.
>>
>> Because it is only one side of the coin; conservatively probing the available
>> link capacity in conjunction with n simultaneous probing TCP/SCTP/DCCP
>> instances is the other.
>>
>> >To be honest, I think google's proposal holds a lot of weight. If
>> >over time link sizes and speeds are increasing (they are) then nudging
>> >the initial CWND every so often is a legitimate proposal. Were
>> >someone to claim that utilization is lower than it could be because of
>> >the currently specified initial CWND, I would have no problem
>> >believing them.
>> >
>> >And I'm happy to make Linux use an increased value once it has
>> >traction in the standardization community.
>>
>> Currently I know of no working approach, without active network feedback,
>> for conservatively probing the available link capacity with a high CWND.
>> I am curious about any future trends.
>
> A long, long time ago, I suggested a Path BW Discovery mechanism
> to the IETF, analogous to the Path MTU Discovery mechanism, but
> it didn't get any traction. Such information could be extremely
> useful to TCP endpoints, to determine a maximum window size to
> use, effectively rate limiting a much stronger sender so that it
> does not overpower a much weaker receiver (for example
> 10-GigE -> GigE), which otherwise results in abominable performance
> across large RTT paths (as low as 12 Mbps), even in the absence of
> any real network contention.

Unfortunately that is not going to help initcwnd (unless one can invent a
PBWD protocol from just 3WHS), and the web is dominated by short-lived
connections so the small initcwnd becomes a choke point.
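
The choke-point effect on short connections can be sketched with an
idealized slow-start model in Python. It assumes no loss, a congestion
window that doubles every round trip, no receive-window limit, and it
ignores the handshake; the function name and the 14600-byte response size
are hypothetical.

```python
import math


def rtts_to_send(total_bytes, initcwnd_segments, mss=1460):
    """Round trips needed to deliver total_bytes under idealized slow start."""
    segments = math.ceil(total_bytes / mss)
    cwnd = initcwnd_segments
    sent = 0
    rounds = 0
    while sent < segments:
        sent += cwnd   # one full window goes out per round trip
        cwnd *= 2      # slow start: window doubles each RTT
        rounds += 1
    return rounds


if __name__ == "__main__":
    response = 14600  # hypothetical short web response, 10 full segments
    for icw in (3, 10):
        print(f"initcwnd={icw}: {rtts_to_send(response, icw)} RTT(s)")
```

Under these assumptions the same short response needs several round trips
with a small initcwnd but fits into a single one with initcwnd=10.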

Jerry

>
> -Bill