From: Florin Andrei on 30 Jun 2010 13:53

Emails are sent from a machine running Postfix 2.5.0. They are generated by software as a batch (triggered by certain external events) and injected very quickly into the local Postfix instance, which never sends email directly to the Internet, only through Postfix gateways on other machines. Destinations are very diverse, by domain and by username, but there is only one destination per message (no mass distribution of the same message).

I want to "load balance" the outbound email between two Postfix gateways, each running 2.7.0. Each gateway should receive an approximately equal share of the outbound messages. I created a fake domain with the two gateways as equal-preference MX records:

foobar.local. 604800 IN MX 0 thingone.local.
foobar.local. 604800 IN MX 0 thingtwo.local.

Then I created a sender_dependent_relayhost_maps table with the fake domain as the nexthop:

somesender@domain.com foobar.local

Tested it - it seems to work. But in practice there is no even distribution between the two nexthops: the local nexthop always receives 3x to 4x more messages than the nexthop across the VPN in the other datacenter. The local nexthop is also slightly faster hardware - not sure if that matters.

One way to do equal-volume load balancing would be to tell the initial Postfix instance to send only, say, 10 or 100 messages through any given SMTP connection to the nexthops, then hang up and connect again. Due to the way DNS round-robin works, this would ensure a statistically fair distribution, by volume, between the nexthops. Correct me if I'm wrong, but that doesn't seem possible with Postfix - I couldn't find any setting that says "cut off delivery after N messages". Is there another way?

Also, can someone clarify how and why I end up with the 3:1 or 4:1 distribution? What makes one system receive more email? Is it because it's more responsive (closer topologically, and faster hardware)? What's the algorithm?

-- Florin Andrei http://florin.myip.org/
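For reference, the setup described above boils down to something like the following minimal sketch. The hash map type, the file paths, and the postmap step are my assumptions for illustration, not details taken from the original post:

```
# /etc/postfix/main.cf on the injecting instance
sender_dependent_relayhost_maps = hash:/etc/postfix/sender_relay

# /etc/postfix/sender_relay -- nexthop is the fake "balanced" domain
somesender@domain.com    foobar.local

# zone data for foobar.local -- two equal-preference MX records
foobar.local.  604800  IN  MX  0  thingone.local.
foobar.local.  604800  IN  MX  0  thingtwo.local.
```

After editing the map, run postmap /etc/postfix/sender_relay and reload Postfix so the new table takes effect.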
From: Wietse Venema on 30 Jun 2010 14:17

When sending mail via SMTP, Postfix randomizes the order of equal-preference server IP addresses.

However, with SMTP connection caching enabled, the faster SMTP server will get more mail than the slower SMTP server. So you need to be a little more careful with your claims.

Wietse
From: Wietse Venema on 30 Jun 2010 14:29

Florin Andrei:
> Correct me if I'm wrong, but that doesn't seem possible with Postfix. I
> couldn't find any setting that says "cut off delivery after N messages".

That would actually make your problem worse. When one host is slower than the other, and connections are closed after a fixed number of deliveries, most connections would end up going to the slower host. We ran into that problem with Postfix 2.2, and that is why there is a reuse TIME limit instead of a reuse COUNT limit. Below is a quote from the Postfix RELEASE_NOTES-2.3 with the gory details.

Wietse

[Feature 20051026] This snapshot addresses a performance stability problem with remote SMTP servers. The problem is not specific to Postfix: it can happen when any MTA sends large amounts of SMTP email to a site that has multiple MX hosts. The insight that led to the solution, as well as an initial implementation, are due to Victor Duchovni.

The problem starts when one of a set of MX hosts becomes slower than the rest. Even though SMTP clients connect to fast and slow MX hosts with equal probability, the slow MX host ends up with more simultaneous inbound connections than the faster MX hosts, because the slow MX host needs more time to serve each client request. The slow MX host becomes a connection attractor. If one MX host becomes N times slower than the rest, it dominates mail delivery latency unless there are more than N fast MX hosts to counter the effect. And if the number of MX hosts is smaller than N, the mail delivery latency effectively becomes that of the slowest MX host divided by the total number of MX hosts.

The solution uses connection caching in a way that differs from Postfix 2.2. By limiting the amount of time during which a connection can be used repeatedly (instead of limiting the number of deliveries over that connection), Postfix not only restores fairness in the distribution of simultaneous connections across a set of MX hosts, it also favors deliveries over connections that perform well, which is exactly what we want.

The smtp_connection_reuse_time_limit feature implements the connection reuse time limit discussed above. It limits the amount of time after which an SMTP connection is no longer stored in the connection cache. The default limit, 300s, can result in a huge number of deliveries over a single connection.
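The count-limit-versus-time-limit effect described in the release notes can be illustrated with a toy simulation. This is my own sketch, not Postfix code; the service times, agent count, and limit values are arbitrary assumptions chosen to make the skew visible:

```python
import random

def simulate(policy, limit, total_time=10_000, agents=10, seed=42):
    """Toy model of the scenario above: two equal-preference MX hosts,
    host 0 takes 1 time unit per delivery, host 1 takes 4 (the slow one).
    Each delivery agent picks a host uniformly at random per connection,
    like Postfix's randomized equal-preference MX selection.
    policy 'count': close the connection after `limit` deliveries.
    policy 'time':  close it once it has been open `limit` time units.
    Returns (messages delivered per host, agent-time spent per host)."""
    rng = random.Random(seed)
    service = [1.0, 4.0]
    msgs, busy = [0, 0], [0.0, 0.0]
    for _ in range(agents):
        t = 0.0
        while t < total_time:
            host = rng.randrange(2)      # uniform choice of MX host
            born = t
            delivered = 0
            while t < total_time:
                t += service[host]
                busy[host] += service[host]
                msgs[host] += 1
                delivered += 1
                if policy == 'count' and delivered >= limit:
                    break
                if policy == 'time' and t - born >= limit:
                    break
    return msgs, busy
```

Under the 'count' policy the per-host message counts stay roughly equal, but the slow host soaks up about 80% of the agents' time, so it dominates latency; under the 'time' policy the time split is fair and the fast host carries about four times the mail. Setting limit=1 under 'count' models no connection caching at all, which still leaves the time split skewed.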
From: Florin Andrei on 6 Jul 2010 14:21

On 06/30/2010 11:17 AM, Wietse Venema wrote:
> When sending mail via SMTP, Postfix randomizes the order of
> equal-preference server IP addresses.
>
> However, with SMTP connection caching enabled, the faster SMTP
> server will get more mail than the slower SMTP server.

It seems you imply that disabling the connection cache would equalize the distribution. Or is it not that simple?

Note: the systems are pretty fast and the connections are not slow either - one is local, the other is over a reasonably fast data link.

-- Florin Andrei http://florin.myip.org/
From: Victor Duchovni on 6 Jul 2010 14:30

On Tue, Jul 06, 2010 at 11:21:19AM -0700, Florin Andrei wrote:
> On 06/30/2010 11:17 AM, Wietse Venema wrote:
>> When sending mail via SMTP, Postfix randomizes the order of
>> equal-preference server IP addresses.
>>
>> However, with SMTP connection caching enabled, the faster SMTP
>> server will get more mail than the slower SMTP server.
>
> It seems you imply that disabling the connection cache will equalize the
> distribution. Or is it not that simple?

No, disabling the cache will still leave a skewed distribution. Connection creation is uniform across the servers, but connection lifetime is much longer on the slow server, so its connection concurrency is much higher (potentially equal to the destination concurrency limit under suitable conditions, thus keeping the fast servers essentially idle).

A time-based cache is the fairness mechanism that keeps connection lifetimes uniform across the servers, which ensures non-starvation of fast servers and avoids further overload of (congested) slow servers.

> Note: The systems are pretty fast and the connections are not slow either -
> one is local, the other is over a reasonably fast data link.

The <usual stuff> is not always hitting the fan, otherwise the fan would be off. :-)

-- Viktor.
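Victor's concurrency argument can be put in back-of-the-envelope terms. This is my sketch, not something from the thread: by Little's law (L = lambda x W), steady-state concurrency is arrival rate times lifetime, and here arrivals are uniform across hosts while connection lifetimes scale with per-delivery service time:

```python
def concurrency_share(slowdown, fast_hosts=1, slow_hosts=1):
    """Steady-state share of simultaneous connections held by the fast
    and slow host groups, assuming uniform connection creation and
    connection lifetimes proportional to per-delivery service time."""
    w_fast = fast_hosts * 1.0
    w_slow = slow_hosts * float(slowdown)
    total = w_fast + w_slow
    return w_fast / total, w_slow / total
```

For example, a host four times slower than its single peer ends up holding 4/5 = 80% of the concurrent connections, which is how a slow host "attracts" connections and can pin the client at its destination concurrency limit.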