From: Stephen Hemminger on
On Wed, 14 Jul 2010 19:48:36 +0100
Ed W <lists(a)wildgooses.com> wrote:

> On 14/07/2010 19:15, David Miller wrote:
> > From: Bill Davidsen<davidsen(a)tmr.com>
> > Date: Wed, 14 Jul 2010 11:21:15 -0400
> >
> >
> >> You may have to go into /proc/sys/net/core and crank up the
> >> rmem_* settings, depending on your distribution.
> >>
> > You should never, ever, have to touch the various networking sysctl
> > values to get good performance in any normal setup. If you do, it's a
> > bug, report it so we can fix it.
> >
>
> Just checking the basics here because I don't think this is a bug so
> much as a less common installation that differs from the "normal" case.
>
> - When we create a tcp connection we always start with tcp slow start
> - This sets the congestion window to effectively 4 packets?
> - This applies in both directions?
> - Remote sender responds to my hypothetical http request with the first
> 4 packets of data
> - We need to wait one RTT for the ack to come back and now we can send
> the next 8 packets,
> - Wait for the next ack and at 16 packets we are now moving at a
> sensible fraction of the bandwidth delay product?
>
> So just to be clear:
> - We don't seem to have any user-space tuning knobs to influence this
> right now?
> - In this age of short attention spans, a couple of extra seconds
> between clicking something and it responding is worth optimising (IMHO)
> - I think I need to take this to netdev, but anyone else with any ideas
> happy to hear them?
>
> Thanks
>
> Ed W

TCP slow start is required by the RFC. It is there to prevent a TCP congestion
collapse. The HTTP problem is exacerbated by things beyond the user's control:
1. stupid server software that dribbles out data and doesn't use the full
payload of the packets
2. web pages with data from multiple sources (ads especially), each of which
requires a new connection
3. pages with huge graphics.

Most of this is because of sites that haven't figured out that somebody on a phone
across the globe might not have the same RTT and bandwidth as the developer on a
local network who created them. Changing the initial cwnd isn't going to fix it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rick Jones on
Ed W wrote:

>
> Just checking the basics here because I don't think this is a bug so
> much as a less common installation that differs from the "normal" case.
>
> - When we create a tcp connection we always start with tcp slow start
> - This sets the congestion window to effectively 4 packets?
> - This applies in both directions?

Any TCP sender in some degree of compliance with the RFCs on the topic will
employ slow-start.

Linux adds the auto-tuning of the receiver's advertised window. It will start
at a small size, and then grow it as it sees fit.

> - Remote sender responds to my hypothetical http request with the first
> 4 packets of data
> - We need to wait one RTT for the ack to come back and now we can send
> the next 8 packets,
> - Wait for the next ack and at 16 packets we are now moving at a
> sensible fraction of the bandwidth delay product?

There may be some wrinkles depending on how many ACKs the receiver generates
(LRO being enabled and such) and how the ACKs get counted.
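
As a rough illustration of the doubling described in the quoted steps, here is
a minimal Python sketch. It assumes an idealized sender: the window doubles
once per RTT, nothing is lost, ACKs are not delayed or coalesced, and the
receiver's advertised window is never the limit. The function name and the
default of 4 segments are only illustrative (the 4 comes from the question
above).

```python
def slow_start_rounds(total_segments, initial_cwnd=4):
    """Count the RTT rounds needed to send total_segments.

    Idealized model (an assumption, not real stack behaviour): the sender
    emits a full congestion window each round, and the window doubles
    after every round.
    """
    if total_segments < 0 or initial_cwnd < 1:
        raise ValueError("need total_segments >= 0 and initial_cwnd >= 1")
    cwnd = initial_cwnd
    sent = 0
    rounds = 0
    while sent < total_segments:
        sent += cwnd      # one window's worth goes out this round
        cwnd *= 2         # slow start: double the window per RTT
        rounds += 1
    return rounds


if __name__ == "__main__":
    # 4 + 8 + 16 = 28, so 28 segments fit in exactly three rounds.
    for segments in (4, 12, 28, 29):
        print(segments, "segments:", slow_start_rounds(segments), "round(s)")
```

With a long RTT each extra round costs a full round trip, which is why the
starting value matters most for short transfers.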

> So just to be clear:
> - We don't seem to have any user-space tuning knobs to influence this
> right now?
> - In this age of short attention spans, a couple of extra seconds
> between clicking something and it responding is worth optimising (IMHO)

There is an effort under way, led by some folks at Google and including some
others, to get the RFCs enhanced in support of the concept of larger initial
congestion windows. Some of the discussion may be in the "tcpm" mailing list
(assuming I've not gotten my mailing lists confused). There may be some
previous discussion of that work in the netdev archives as well.

rick jones

> - I think I need to take this to netdev, but anyone else with any ideas
> happy to hear them?
>
> Thanks
>
> Ed W

From: Hagen Paul Pfeifer on
* Rick Jones | 2010-07-14 13:17:24 [-0700]:

>There is an effort under way, led by some folks at Google and
>including some others, to get the RFCs enhanced in support of the
>concept of larger initial congestion windows. Some of the discussion
>may be in the "tcpm" mailing list (assuming I've not gotten my
>mailing lists confused). There may be some previous discussion of
>that work in the netdev archives as well.

tcpm is the right mailing list, but there is currently no effort to develop
this topic. Why? Because it is not a standardization issue; rather, it is a
technical issue. You cannot raise the initial CWND and expect fair behavior.
This has been discussed several times and is documented in several documents
and RFCs.

RFC 5681 Section 3.1. Google employees should start with Section 3. This topic
pops up every two months in netdev and until now I have _never_ read a
consolidated contribution.

Partial local issues can already be "fixed" via route specific ip options -
see initcwnd.
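
For reference, a hedged sketch of how the per-route option might be applied.
The iproute2 "ip route" command accepts an initcwnd parameter on a route; the
Python helper below only builds the command line and does not run it. The
helper name, the gateway address, the device name and the value 10 are all
placeholders for illustration, not recommendations. Note that the setting
affects only what this host sends, so it would need to be set on the sending
side of the slow transfer.

```python
def build_initcwnd_command(route_spec, initcwnd):
    """Return an argv list that sets initcwnd on an existing route.

    route_spec is the route as "ip route" would describe it, e.g.
    ["default", "via", "192.0.2.1", "dev", "eth0"] (hypothetical values).
    """
    if initcwnd < 1:
        raise ValueError("initcwnd must be a positive number of segments")
    return ["ip", "route", "change", *route_spec, "initcwnd", str(initcwnd)]


if __name__ == "__main__":
    # Placeholder route; substitute the output of "ip route show".
    argv = build_initcwnd_command(
        ["default", "via", "192.0.2.1", "dev", "eth0"], 10
    )
    # Printed rather than executed: changing routes requires root.
    print(" ".join(argv))
```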

HGN

From: David Miller on
From: Hagen Paul Pfeifer <hagen(a)jauu.net>
Date: Wed, 14 Jul 2010 22:39:19 +0200

> * Rick Jones | 2010-07-14 13:17:24 [-0700]:
>
>>There is an effort under way, led by some folks at Google and
>>including some others, to get the RFCs enhanced in support of the
>>concept of larger initial congestion windows. Some of the discussion
>>may be in the "tcpm" mailing list (assuming I've not gotten my
>>mailing lists confused). There may be some previous discussion of
>>that work in the netdev archives as well.
>
> tcpm is the right mailing list but there is currently no effort to develop
> this topic. Why? Because it is not a standardization issue, rather it is a
> technical issue. You cannot raise the initial CWND and expect a fair behavior.
> This was discussed several times and is documented in several documents and
> RFCs.
>
> RFC 5681 Section 3.1. Google employees should start with Section 3. This topic
> pops up every two months in netdev and until now I have _never_ read a
> consolidated contribution.
>
> Partial local issues can already be "fixed" via route specific ip options -
> see initcwnd.

Although section 3 of RFC 5681 is a great text, it does not say at all
that increasing the initial CWND would lead to fairness issues.

To be honest, I think Google's proposal holds a lot of weight. If
over time link sizes and speeds are increasing (they are) then nudging
the initial CWND every so often is a legitimate proposal. Were
someone to claim that utilization is lower than it could be because of
the currently specified initial CWND, I would have no problem
believing them.

And I'm happy to make Linux use an increased value once it has
traction in the standardization community.

But for all we know this side discussion about initial CWND settings
could have nothing to do with the issue being reported at the start of
this thread. :-)

From: Ed W on
On 14/07/2010 21:39, Hagen Paul Pfeifer wrote:
> * Rick Jones | 2010-07-14 13:17:24 [-0700]:
>
>
>> There is an effort under way, led by some folks at Google and
>> including some others, to get the RFCs enhanced in support of the
>> concept of larger initial congestion windows. Some of the discussion
>> may be in the "tcpm" mailing list (assuming I've not gotten my
>> mailing lists confused). There may be some previous discussion of
>> that work in the netdev archives as well.
>>
> tcpm is the right mailing list but there is currently no effort to develop
> this topic. Why? Because it is not a standardization issue, rather it is a
> technical issue. You cannot raise the initial CWND and expect a fair behavior.
> This was discussed several times and is documented in several documents and
> RFCs.
>

I'm sure you have covered this to the point you are fed up, but my
searches turn up only a smattering of posts covering this - could you
summarise why "you cannot raise the initial cwnd and expect a fair
behaviour"?

Initial cwnd was changed (increased) in the past (rfc3390) and the RFC
claims that studies then suggested that the benefits were all positive.
Some reasonably smart people have suggested that it might be time to
review the status quo again so it doesn't seem completely obvious that
the current number is optimal?
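
For concreteness, RFC 3390 gives the upper bound on the initial window as
min(4*MSS, max(2*MSS, 4380 bytes)). A small Python sketch of that arithmetic
follows; the function names are made up for illustration and the MSS values
are just common examples.

```python
def rfc3390_initial_window_bytes(mss):
    """Upper bound on the initial window from RFC 3390, in bytes."""
    if mss < 1:
        raise ValueError("MSS must be positive")
    return min(4 * mss, max(2 * mss, 4380))


def rfc3390_initial_window_segments(mss):
    """The same bound expressed in whole MSS-sized segments."""
    return rfc3390_initial_window_bytes(mss) // mss


if __name__ == "__main__":
    for mss in (536, 1460, 4000):
        print(
            "MSS", mss,
            "->", rfc3390_initial_window_bytes(mss), "bytes,",
            rfc3390_initial_window_segments(mss), "segments",
        )
```

So with a typical Ethernet MSS of 1460 bytes the bound works out to 3
segments, and the "4 packets" figure only applies to smaller segment sizes.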

> RFC 5681 Section 3.1. Google employees should start with Section 3. This topic
> pops up every two months in netdev and until now I have _never_ read a
> consolidated contribution.
>

Sorry, what do you mean by a "consolidated contribution"?

That RFC is a subtle read - it appears to give more specific guidance on
what to do in certain situations, but I'm not sure I see that it
improves slow start convergence speed for my situation (large RTT)?
Would you mind highlighting the new bits for those of us a bit newer to
the subject?

> Partial local issues can already be "fixed" via route specific ip options -
> see initcwnd.
>

Oh, excellent. This seems like exactly what I'm after. (Thanks Stephen
Hemminger!)

Many thanks

Ed W