From: Florin Andrei
Emails are sent from a machine running Postfix 2.5.0. They are generated
in batches by software (triggered by external events) and injected very
quickly into the local Postfix instance, which never sends email
directly to the Internet, only through Postfix gateways on other
machines.

Destinations are very diverse, by domain and by username, but there's
only one destination per message (no mass distribution of the same message).

I want to "load balance" the outbound email between two Postfix
gateways, each one running 2.7.0. Each gateway should receive an
approximately equal amount of outbound messages.

I created a fake domain with the two gateways as MX records:

foobar.local. 604800 IN MX 0 thingone.local.
foobar.local. 604800 IN MX 0 thingtwo.local.
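
A quick sanity check (assuming dig is available and the internal
resolver serves this zone) should list both gateways:

dig +short MX foobar.local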

Then I created a sender_dependent_relayhost_maps table with the fake
domain as the nexthop:

somesender@domain.com foobar.local
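
The main.cf glue that points Postfix at this map looks something like
the following (the map path is just an example, and the file needs the
usual "postmap /etc/postfix/sender_relay" after editing):

sender_dependent_relayhost_maps = hash:/etc/postfix/sender_relay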

Tested it - seems to work.

But in practice the distribution between the two nexthops is not even.
The local nexthop consistently receives 3x to 4x more messages than the
nexthop across the VPN in the other datacenter. The local nexthop also
has slightly faster hardware - not sure if that matters.

One way to do equal-volume load balancing would be to tell the initial
Postfix instance to send only, say, 10 or 100 messages through any
given SMTP connection to the nexthops, then hang up and reconnect.
Due to the way DNS works, this would ensure a statistically fair
distribution, by volume, between the nexthops.

Correct me if I'm wrong, but that doesn't seem possible with Postfix. I
couldn't find any setting that says "cut off delivery after N messages".

Is there another way?

Also, can someone clarify how and why I end up with the 3:1 or 4:1
distribution? What makes one system receive more emails? Is it because
it's more responsive (topologically closer, and on faster hardware)?
What's the algorithm?

--
Florin Andrei
http://florin.myip.org/

From: Wietse Venema
When sending mail via SMTP, Postfix randomizes the order of
equal-preference server IP addresses.
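
Schematically, the selection amounts to something like this (a Python
sketch of the idea only, not the actual Postfix implementation):

import random

# addresses of the equal-preference MX hosts for the nexthop
addresses = ["10.0.0.1", "10.0.0.2"]

# each new connection tries them in a freshly randomized order
random.shuffle(addresses)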

However, with SMTP connection caching enabled, the faster SMTP
server will get more mail than the slower SMTP server.

So, you need to be a little more careful with your claims.

Wietse

From: Wietse Venema
Florin Andrei:
> Correct me if I'm wrong, but that doesn't seem possible with Postfix. I
> couldn't find any setting that says "cut off delivery after N messages".

That would actually make your problem worse. When one host is slower
than the other, and connections are closed after a fixed number of
deliveries, most of the open connections would end up on the slower host.

We ran into that problem with Postfix 2.2, and that is why there
is a reuse TIME limit instead of a reuse COUNT limit.

Below is a quote from Postfix RELEASE_NOTES-2.3 with the gory details.

Wietse

[Feature 20051026] This snapshot addresses a performance stability
problem with remote SMTP servers. The problem is not specific to
Postfix: it can happen when any MTA sends large amounts of SMTP
email to a site that has multiple MX hosts. The insight that led
to the solution, as well as an initial implementation, are due to
Victor Duchovni.

The problem starts when one of a set of MX hosts becomes slower
than the rest. Even though SMTP clients connect to fast and slow
MX hosts with equal probability, the slow MX host ends up with more
simultaneous inbound connections than the faster MX hosts, because
the slow MX host needs more time to serve each client request.

The slow MX host becomes a connection attractor. If one MX host
becomes N times slower than the rest, it dominates mail delivery
latency unless there are more than N fast MX hosts to counter the
effect. And if the number of MX hosts is smaller than N, the mail
delivery latency becomes effectively that of the slowest MX host
divided by the total number of MX hosts.

The solution uses connection caching in a way that differs from
Postfix 2.2. By limiting the amount of time during which a connection
can be used repeatedly (instead of limiting the number of deliveries
over that connection), Postfix not only restores fairness in the
distribution of simultaneous connections across a set of MX hosts,
it also favors deliveries over connections that perform well, which
is exactly what we want.

The smtp_connection_reuse_time_limit feature implements the connection
reuse time limit as discussed above. It limits the amount of time
after which an SMTP connection is no longer stored into the connection
cache. The default limit, 300s, can result in a huge number of
deliveries over a single connection.
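
In main.cf this looks like the following (the value shown is simply
the default):

smtp_connection_reuse_time_limit = 300s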

From: Florin Andrei
On 06/30/2010 11:17 AM, Wietse Venema wrote:
> When sending mail via SMTP, Postfix randomizes the order of
> equal-preference server IP addresses.
>
> However, with SMTP connection caching enabled, the faster SMTP
> server will get more mail than the slower SMTP server.

It seems you imply that disabling the connection cache will equalize the
distribution. Or is it not that simple?

Note: The systems are pretty fast and the connections are not slow
either - one is local, the other is over a reasonably fast data link.

--
Florin Andrei
http://florin.myip.org/

From: Victor Duchovni
On Tue, Jul 06, 2010 at 11:21:19AM -0700, Florin Andrei wrote:

> On 06/30/2010 11:17 AM, Wietse Venema wrote:
>> When sending mail via SMTP, Postfix randomizes the order of
>> equal-preference server IP addresses.
>>
>> However, with SMTP connection caching enabled, the faster SMTP
>> server will get more mail than the slower SMTP server.
>
> It seems you imply that disabling the connection cache will equalize the
> distribution. Or is it not that simple?

No, disabling the cache will still leave a skewed distribution. Connection
creation is uniform across the servers, but connection lifetime is much
longer on the slow server, so its connection concurrency is much higher
(potentially equal to the destination concurrency limit under suitable
conditions, thus keeping the fast servers essentially idle).
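
A toy simulation makes the effect concrete (an illustrative Python
sketch with made-up service times and limits, not Postfix code):

import heapq
import random

# Toy model of the no-cache case: every delivery opens a fresh
# connection to a uniformly chosen server, the client keeps at most
# CONCURRENCY deliveries in flight, and the "slow" server takes 4x
# longer per delivery.

random.seed(42)
SERVICE_TIME = {"fast": 1.0, "slow": 4.0}   # per-delivery times
CONCURRENCY = 20                            # client concurrency limit
MESSAGES = 20000

in_flight = []                              # heap of (done_time, server)
delivered = {"fast": 0, "slow": 0}
busy_time = {"fast": 0.0, "slow": 0.0}      # time-weighted concurrency
now = 0.0
started = 0

def start_delivery(t):
    server = random.choice(["fast", "slow"])  # uniform connection creation
    heapq.heappush(in_flight, (t + SERVICE_TIME[server], server))

while started < CONCURRENCY:                # fill the pipeline
    start_delivery(now)
    started += 1

while in_flight:
    prev = now
    now, server = heapq.heappop(in_flight)
    # concurrency during (prev, now]: everything still in flight,
    # plus the delivery that just completed
    for _, s in in_flight:
        busy_time[s] += now - prev
    busy_time[server] += now - prev
    delivered[server] += 1
    if started < MESSAGES:
        start_delivery(now)
        started += 1

for s in ("fast", "slow"):
    print("%s: %d deliveries, average concurrency %.1f"
          % (s, delivered[s], busy_time[s] / now))

Both servers complete roughly the same number of deliveries, but the
slow one holds about 16 of the 20 connection slots at any given time,
which is exactly the concurrency skew described above.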

A time-based cache is the fairness mechanism that keeps connection
lifetimes uniform across the servers, which ensures non-starvation
of fast servers and avoids further overload of (congested) slow servers.

> Note: The systems are pretty fast and the connections are not slow either -
> one is local, the other is over a reasonably fast data link.

The <usual stuff> is not always hitting the fan, otherwise the fan would
be off. :-)

--
Viktor.