From: devel anaconda on
Hello everybody!

I have a high-performance server (dual quad-core Xeon 2.8 GHz, 16 GB RAM, two 140 GB SCSI disks) running Red Hat Enterprise Linux 5.4 with software RAID1 and Postfix 2.5.9.
This server handles only SMTP traffic. The only thing Postfix has to do is receive mail for one user and hand it to a local script via a pipe, like this:

user: |/usr/local/bin/script

The flow is about 250-300 messages per second. Everything works fine as long as there is no queue. But when the queue grows to 10,000+ messages (for whatever reason), Postfix loses control of it: the deferred queue is empty, the active queue is almost empty, but the incoming queue keeps growing. Once the incoming queue reaches 100k+ messages, Postfix just can't work through it until I stop incoming traffic. Why can't a server like this handle the workload? Even if I do the following:

user: /dev/null

Postfix can't drain the incoming queue until I reject incoming SMTP traffic. When I tried to find the bottleneck, I saw about 100-150 smtpd processes and 100-150 cleanup processes, but only 5-8 "local" processes. How do I tell qmgr to hand as many messages as possible from the incoming queue to the "local" processes? How do I increase parallel delivery? I did this:

default_destination_concurrency_limit = 200
initial_destination_concurrency = 200
local_destination_concurrency_limit = 200

but that didn't help :(

Thank you very much and sorry for my bad English.

From: Kenneth Marshall on
On Fri, Nov 06, 2009 at 02:19:34AM +0300, devel anaconda wrote:
> Hello everybody!
>
> I have a high-performance server (dual quad-core Xeon 2.8 GHz, 16 GB RAM, two 140 GB SCSI disks) running Red Hat Enterprise Linux 5.4 with software RAID1 and Postfix 2.5.9.
> This server handles only SMTP traffic. The only thing Postfix has to do is receive mail for one user and hand it to a local script via a pipe, like this:
>
> user: |/usr/local/bin/script
>
> The flow is about 250-300 messages per second. Everything works fine as long as there is no queue. But when the queue grows to 10,000+ messages (for whatever reason), Postfix loses control of it: the deferred queue is empty, the active queue is almost empty, but the incoming queue keeps growing. Once the incoming queue reaches 100k+ messages, Postfix just can't work through it until I stop incoming traffic. Why can't a server like this handle the workload? Even if I do the following:
>
> user: /dev/null
>
Your system is not a high-performance server I/O-wise. Your two disks can only
handle 200-300 fsyncs per second, and Postfix always syncs your mail to disk
before passing it on for local processing. You will need a battery-backed
caching RAID controller or a fast SSD for the /var/spool/postfix directories
to go faster. You could also move the spool directory to a RAM disk or tmpfs
file system, but then you lose the safety net for messages if you have a
power or other hardware problem.
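
A rough way to check the fsync figure on a particular box is sketched below.
It is only an approximation of what Postfix does: it appends fixed-size blocks
to one temporary file and calls fsync after each write, whereas Postfix
creates a separate queue file per message. The function name, the 4096-byte
message size, and the scratch directory are all assumptions for illustration;
point the directory at the file system that holds the spool to get a
meaningful number.

```python
import os
import tempfile
import time


def measure_fsync_rate(directory, duration=2.0, message_size=4096):
    """Return the number of write+fsync cycles per second in `directory`."""
    payload = b"x" * message_size
    fd, path = tempfile.mkstemp(dir=directory)
    count = 0
    try:
        start = time.monotonic()
        while time.monotonic() - start < duration:
            os.write(fd, payload)
            os.fsync(fd)  # ask the kernel to flush this file to the device
            count += 1
        elapsed = time.monotonic() - start
    finally:
        # Always clean up the scratch file, even if a write fails.
        os.close(fd)
        os.unlink(path)
    return count / elapsed


if __name__ == "__main__":
    # Hypothetical target: replace with a directory on the spool's file system.
    target = tempfile.gettempdir()
    print(f"{measure_fsync_rate(target):.0f} fsyncs/second in {target}")
```

If the number printed for the spool's disks is close to the incoming message
rate, the disks are the likely limit rather than the CPUs or the delivery
concurrency settings.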

> Postfix can't drain the incoming queue until I reject incoming SMTP traffic. When I tried to find the bottleneck, I saw about 100-150 smtpd processes and 100-150 cleanup processes, but only 5-8 "local" processes. How do I tell qmgr to hand as many messages as possible from the incoming queue to the "local" processes? How do I increase parallel delivery? I did this:
>
> default_destination_concurrency_limit = 200
> initial_destination_concurrency = 200
> local_destination_concurrency_limit = 200
>
> but that didn't help :(
>

It probably never needed more than 5-8 local processes to sink only 200-300
messages per second. That is why you do not see more. Add a "sleep 1" to
your script and see what happens then. :)
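
The arithmetic behind that can be sketched with Little's law: the average
number of busy delivery processes is the arrival rate multiplied by the time
each delivery takes. The 20 ms per-delivery time below is an assumed figure,
not something measured in this thread, and the function name is made up for
the example.

```python
def busy_processes(arrival_rate, service_time):
    """Little's law: average number in service = arrival rate * service time."""
    return arrival_rate * service_time


# Assumption: a fast script finishes one delivery in about 20 ms.
fast = busy_processes(arrival_rate=300, service_time=0.020)

# Same arrival rate with "sleep 1" added to the script (about 1.02 s each).
slow = busy_processes(arrival_rate=300, service_time=1.020)

print(f"fast script: about {fast:.0f} local processes busy")
print(f"slow script: about {slow:.0f} local processes wanted")
```

With a fast script only a handful of processes are ever busy at once, which
fits the 5-8 that were observed. With the sleep added, the demand would rise
until it hits whatever concurrency and process limits are configured.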

Good luck,
Ken

From: Corey Chandler on
Kenneth Marshall wrote:
>
> Your system is not a high-performance server I/O-wise. Your two disks can only
> handle 200-300 fsyncs per second, and Postfix always syncs your mail to disk
> before passing it on for local processing. You will need a battery-backed
> caching RAID controller or a fast SSD for the /var/spool/postfix directories
> to go faster. You could also move the spool directory to a RAM disk or tmpfs
> file system, but then you lose the safety net for messages if you have a
> power or other hardware problem.
>
For this kind of issue I generally split the load across multiple
boxes. It's a lot easier to scale out than it is to keep throwing
hardware at the problem.

-- Corey / KB1JWQ

From: Kenneth Marshall on
On Thu, Nov 05, 2009 at 04:03:13PM -0800, Corey Chandler wrote:
> Kenneth Marshall wrote:
>>
>> Your system is not a high-performance server I/O-wise. Your two disks can
>> only handle 200-300 fsyncs per second, and Postfix always syncs your mail
>> to disk before passing it on for local processing. You will need a
>> battery-backed caching RAID controller or a fast SSD for the
>> /var/spool/postfix directories to go faster. You could also move the spool
>> directory to a RAM disk or tmpfs file system, but then you lose the safety
>> net for messages if you have a power or other hardware problem.
>>
> For this kind of issue I generally split the load across multiple boxes.
> It's a lot easier to scale out than it is to keep throwing hardware at the
> problem.
>
> -- Corey / KB1JWQ
>
I just thought of another distasteful option: use a pre-queue content
filter to submit the e-mail and never pass it on to the Postfix queue. :)

Ken

From: Wietse Venema on
devel anaconda:
> Hello everybody!
>
> I have a high-performance server (dual quad-core Xeon 2.8 GHz, 16 GB RAM, two 140 GB SCSI disks) running Red Hat Enterprise Linux 5.4 with software RAID1 and Postfix 2.5.9.
> This server handles only SMTP traffic. The only thing Postfix has to do is receive mail for one user and hand it to a local script via a pipe, like this:
>
> user: |/usr/local/bin/script
>
> The flow is about 250-300 messages per second. Everything works fine as long as there is no queue. But when the queue grows to 10,000+ messages (for whatever reason), Postfix loses control of it: the deferred queue is empty, the active queue is almost empty, but the incoming queue keeps growing. Once the incoming queue reaches 100k+ messages, Postfix just can't work through it until I stop incoming traffic. Why can't a server like this handle the workload? Even if I do the following:

Because you are loading the DISK to 100% with incoming mail.

Wietse