Raise initial congestion window size / speedup slow start? [Kernel]


From: Ed W on 15 Jul 2010 03:50

On 15/07/2010 05:12, Tom Herbert wrote:
> There is an Internet draft
> (http://datatracker.ietf.org/doc/draft-hkchu-tcpm-initcwnd/) on
> raising the default Initial Congestion window to 10 segments, as well
> as a SIGCOMM paper (http://ccr.sigcomm.org/online/?q=node/621).
>

You guys have obviously done a lot of work on this; however, it seems
that there is a case for introducing some heuristics into the choice of
init cwnd as well as offering the option to go larger? An initial size
of 10 packets is just another magic number that obviously works with the
median bandwidth-delay product on today's networks - can we not do
better still?

Seems like a bunch of clever folks have already suggested tweaks to the
steady state congestion avoidance, but so far everyone is afraid to
touch the early stage heuristics?

Also would you guys not benefit from wider deployment of ECN? Can you
not help find some ways that deployment could be increased? At present
there are big warnings all over the option that it causes some problems,
but there is no quantification of how much and really whether this
warning is still appropriate?

Ed W

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Alan Cox on 15 Jul 2010 06:30

On Thu, 15 Jul 2010 00:13:01 +0200
Hagen Paul Pfeifer <hagen(a)jauu.net> wrote:

> * David Miller | 2010-07-14 14:55:47 [-0700]:
>
> >Although section 3 of RFC 5681 is a great text, it does not say at all
> >that increasing the initial CWND would lead to fairness issues.
>
> Because that is only one side of the coin; conservatively probing the
> available link capacity in conjunction with n simultaneously probing
> TCP/SCTP/DCCP instances is the other.
>
> >To be honest, I think google's proposal holds a lot of weight. If
> >over time link sizes and speeds are increasing (they are) then nudging
> >the initial CWND every so often is a legitimate proposal. Were
> >someone to claim that utilization is lower than it could be because of
> >the currently specified initial CWND, I would have no problem
> >believing them.
> >
> >And I'm happy to make Linux use an increased value once it has
> >traction in the standardization community.
>
> Currently I know of no working approach, without active network
> feedback, for conservatively probing the available link capacity with a
> high CWND. I am curious about any future trends.

Given perfect information from the network nodes, you still need to
traverse the network in each direction and then return an answer. With a
0.5sec end to end time, as in the original posting, causality itself
demands 1.5 seconds to get an answer which is itself incomplete and
obsolete.

Causality isn't showing any signs of going away soon.


From: Bill Davidsen on 15 Jul 2010 11:20

Ed W wrote:
>
>>> Does someone have some pointers on where to look to modify initial
>>> congestion window please?
>>>
>> Are you sure that's the issue? The backlog is in incoming, is it not?
>
> Well, I was simplifying a little bit, actually I have a bunch of
> protocols in use, http is one of them
>
>
>> Having dealt with moderately long delays pushing TB between timezones:
>> have you set your window size up? Set
>> /proc/sys/net/ipv4/tcp_adv_win_scale to 5 or 6 and see if that helps.
>> You may have to go into /proc/sys/net/core and crank up the rmem_*
>> settings, depending on your distribution.
>>
>> This allows the server to push a lot of data without an ack, which is
>> what you want, the ack will be delayed by the long latency, so this
>> helps.
>
> I think I'm misunderstanding something fundamental here:
>
> - Surely the limited congestion window is what throttles me at
> connection initialisation time and this will not be affected by
> changing the params you mention above? For sure the sliding window
> will be relevant vs my bandwidth delay product once the tcp connection
> reaches steady state, but I'm mostly worried here about performance
> right at the creation of the connection?
>
> - Both you and Alan mention that the bulk of the traffic is "incoming"
> - this implies you think it's relevant? Obviously I'm missing
> something fundamental here because my understanding is that the
> congestion window shuts us down in both directions (at the start of
> the connection?)
>
> Thanks for the replies - I will take it over to netdev
>
Perhaps they will give you an answer you like better.

--
Bill Davidsen <davidsen(a)tmr.com>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein


From: Jerry Chu on 15 Jul 2010 13:40

On Thu, Jul 15, 2010 at 12:48 AM, Ed W <lists(a)wildgooses.com> wrote:
>
> On 15/07/2010 05:12, Tom Herbert wrote:
>>
>> There is an Internet draft
>> (http://datatracker.ietf.org/doc/draft-hkchu-tcpm-initcwnd/) on
>> raising the default Initial Congestion window to 10 segments, as well
>> as a SIGCOMM paper (http://ccr.sigcomm.org/online/?q=node/621).
>>
>
> You guys have obviously done a lot of work on this; however, it seems
> that there is a case for introducing some heuristics into the choice of
> init cwnd as well as offering the option to go larger? An initial size
> of 10 packets is just another magic number that obviously works with the
> median bandwidth-delay product on today's networks - can we not do
> better still?
>
> Seems like a bunch of clever folks have already suggested tweaks to the
> steady state congestion avoidance, but so far everyone is afraid to
> touch the early stage heuristics?

This is because there is not enough info for deriving any heuristic. For
initcwnd one is constrained to only the info from the 3WHS. This includes
a rough estimate of RTT plus all the bits in the SYN/SYN-ACK headers. I'm
assuming a stateless approach. We've tried a stateful solution (i.e.,
seeding initcwnd from past history) but found that its complexity
outweighed the gain.
(See http://www.ietf.org/proceedings/77/slides/tcpm-4.pdf)

>
> Also would you guys not benefit from wider deployment of ECN? Can you
> not help find some ways that deployment could be increased? At present
> there are big warnings all over the option that it causes some problems,
> but there is no quantification of how much and really whether this
> warning is still appropriate?

That will add yet another hoop for us to jump through. Also I'm not sure
a couple of bits are sufficient for a guesstimate of what initcwnd ought
to be.

Our reasoning is simple - there has been tremendous b/w growth since
rfc2414 was published. Even the lowest common denominator (i.e., dialup
links) has moved from 9.6Kbps to 56Kbps. That's roughly a sixfold
increase. If you believe initcwnd should grow proportionally to the
buffer sizes in access links, and the buffer sizes grow proportionally
to b/w, then the initcwnd ought to be 3*6 = 18 today.
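
The proportional-scaling argument above can be written out as a small
sketch. The function name is made up for illustration, and the linear
scaling rule is only the assumption stated in the paragraph (initcwnd
tracks buffer size, buffer size tracks b/w), not a measured result:

```python
def scaled_initcwnd(base_segments: float, old_kbps: float, new_kbps: float) -> float:
    """Scale an initial window linearly with link bandwidth.

    Assumes, as in the text, that the appropriate initcwnd is
    proportional to access-link buffer size and that buffer size is
    proportional to bandwidth.
    """
    return base_segments * (new_kbps / old_kbps)


# 3 segments at 9.6Kbps dialup, scaled to 56Kbps dialup.
estimate = scaled_initcwnd(3, 9.6, 56)
print(f"scaled initcwnd: {estimate:.1f} segments")
```

The exact ratio 56/9.6 is a little under six, so the unrounded figure
comes out near 17.5 segments; rounding the ratio up to six gives the 18
quoted above.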

We chose a modest increase (10) in the hope of expediting the
standardization process (and would certainly appreciate help from folks
on this list). 10 is very conservative considering that many deployments
have gone beyond 3, including the Linux stack, which allows one
additional pkt if it's the last data pkt.
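
To give a feel for what moving from 3 to 10 buys at connection start,
here is a rough sketch that counts round trips under an idealized slow
start. It assumes no loss, no delayed ACKs, a receive window that never
limits the sender, and a cwnd that doubles every RTT; the function name
is hypothetical and this is not how any real stack computes things:

```python
def slow_start_rounds(segments: int, initcwnd: int) -> int:
    """Round trips needed to send `segments` under idealized slow start.

    Assumptions: no packet loss, cwnd doubles once per RTT, and the
    receiver's advertised window is never the bottleneck.
    """
    if initcwnd <= 0:
        raise ValueError("initcwnd must be positive")
    cwnd, sent, rounds = initcwnd, 0, 0
    while sent < segments:
        sent += cwnd   # one window's worth goes out this round trip
        cwnd *= 2      # exponential growth during slow start
        rounds += 1
    return rounds


# Compare initcwnd 3 and 10 for a few transfer sizes (in segments).
for size in (10, 30, 100):
    print(size, slow_start_rounds(size, 3), slow_start_rounds(size, 10))
```

Under these assumptions a 30-segment response needs four round trips
with an initial window of 3 and two with an initial window of 10, which
is where the gain on short transfers over long-RTT paths comes from.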

Longer term it will be nice to find a way to get rid of this fixed,
somewhat arbitrary initcwnd. Mark Allman's JumpStart is one idea, but
it'd be a much longer route.

Jerry

>
> Ed W
>

From: Rick Jones on 15 Jul 2010 16:00

I have to wonder if the only heuristic one could employ for divining the initial
congestion window is to be either pessimistic/conservative or
optimistic/liberal. Or, for that matter, is that the only one we really need here?

That's what it comes down to doesn't it? At any one point in time, we don't
*really* know the state of the network and whether it can handle the load we
might wish to put upon it. We are always reacting to it. Up until now, it has
been felt necessary to be pessimistic/conservative at time of connection
establishment and not rely as much on the robustness of the "control" part of
avoidance and control.

Now, the folks at Google have lots of data to suggest we don't need to be so
pessimistic/conservative and so we have to decide if we are willing to be more
optimistic/liberal. Broadly handwaving, the "netdev we" seems to be willing to
be more optimistic/liberal in at least a few cases, and the question comes down
to whether or not the "IETF we" will be similarly willing.

rick jones
