From: adrian ilarion ciobanu on

I have been given the task of implementing ODMR in Postfix. I tried playing the "why-not-instead" game, but:
1. No one wants to hear about UUCP, probably because they think we would also
switch the fiber links to 9600 baud modems.
2. ETRN was a good choice until the ISP decided that ETRN is not a good choice anymore.
3. The existing ODMR implementations for Postfix are "don't"s.

Before proceeding, I'd like to ask a few questions, since I seem to have multiple implementation choices:

No matter how hard I try not to, I keep seeing similarities between ETRN and ATRN. That is, dead people:
1. ATRN requires AUTH by specification, but no sane sysadmin would offer ETRN without AUTH in practice.
2. Both ETRN and ATRN can share the same service_policy_check (matching the requested domain(s) against the
authenticated user) and, if it doesn't sound forced, the same configuration parameter, smtpd_etrn_restrictions. This
should (not must) always be the case when supporting both at the same time.
3. Both ETRN and ATRN can be flush(8)-based services without increased code complexity; I'll get back to this soon.
4. At this point I see ATRN as a "silent" ETRN upgrade, so Postfix would transparently provide both services. The only
weird thing that I can't get rid of, because others did it, is ETRN being available on the SMTP port while ATRN requires
port 366. Why? Beats me. I'll call it "incomplete security measures". The RFC is missing a chapter that would make ETRN either
obsolete or conforming to the new security considerations, if one cares about security more than about keeping protocol
upgrades in line. I can hardly wait to read about YTRN and XTRN.
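
The shared policy check from point 2 (matching the requested domains against the authenticated user) could look roughly like the sketch below. It is an illustration in Python, not Postfix code: the `ALLOWED_DOMAINS` table and the `check_turn_request` function are hypothetical names, and a real setup would consult a Postfix lookup table rather than a dict.

```python
# Hypothetical mapping from SASL login name to the domains that user may
# request with ETRN or ATRN. Names and data are made up for illustration.
ALLOWED_DOMAINS = {
    "customer1": {"example.com", "example.net"},
    "customer2": {"example.org"},
}


def check_turn_request(sasl_user, requested_domains, table=ALLOWED_DOMAINS):
    """Return True only if an authenticated user asked for domains it owns.

    The same check serves ETRN (one domain) and ATRN (one or more domains).
    """
    if not sasl_user:
        # No AUTH, no service.
        return False
    allowed = table.get(sasl_user, set())
    wanted = {d.strip().lower() for d in requested_domains if d.strip()}
    # An empty request is rejected here; how to treat it is a policy choice.
    return bool(wanted) and wanted <= allowed
```

With one function serving both commands, a single restriction setting can cover ETRN and ATRN alike.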


Revisiting (3), I picture this Postfix setup that may be a classic for ETRN/ATRN: ETRN-enabled domains (if any) would
have a normal transport definition, and ATRN-enabled domains would have a transport of atrn:[localhost]:DOMAIN_PORT. When
the ODMR service accepts the ATRN request, it will hook up a listener on localhost:DOMAIN_PORT and call flush for the requested domain.
The listener will simply proxy the data to the client's connection (without QUITs until the last requested domain has been fully
processed). Transactions in progress may be spotted by a failed bind(), assuming no other bugs. Disabling resends until explicitly
requested may be done with defer_transports, the same as with ETRN.
For now this is the cleanest ATRN implementation I can see that reuses as much of the Postfix code as possible.
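
The "failed bind()" check can be illustrated with a short Python sketch. It assumes one reserved loopback port per domain; `try_claim_port` is a hypothetical helper, and port 0 appears in the usage part only so that the operating system picks a free port.

```python
import errno
import socket


def try_claim_port(port, host="127.0.0.1"):
    """Try to listen on host:port for one ATRN transaction.

    Returns the listening socket, or None if the address is already in use,
    which is taken to mean another transaction for that domain is in progress.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise  # any other error is a real problem, not a busy domain
    sock.listen(1)
    return sock


if __name__ == "__main__":
    first = try_claim_port(0)        # port 0: let the OS choose
    port = first.getsockname()[1]
    second = try_claim_port(port)    # same port while the first is open
    print("second claim refused:", second is None)
    first.close()
```

The weak spot is that any unrelated program holding the port looks exactly like a transaction in progress.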


Because of (4), and also because of the way we can reuse flush, a patch to the smtpd service is undesirable.
But I can still reuse great pieces of code from the smtpd service to implement an "odmrd" service. I can think of AUTH and EHLO, and that
would be enough. So I will go with defining a trimmed-down version of the smtpd automaton and reuse some state handlers.
But some of the functions registered with the smtpd automaton that I would reuse in "odmrd" are static. The same goes for an ODMR client if one
wants to reuse smtp functions.

My question is: is this just an isolated case where one can successfully reuse Postfix code, or could it be taken into consideration
for a wider audience, and eventually make it in, at least as a way for anyone to mark some family of functions as
exportable at compile time (preprocessed (non-)static declarations)?

And second, is the idea of reusing flush(8) in ATRN (as described above) having some flaws that I failed to see?


Thank you,




--
adrian ilarion ciobanu
adrian.i(a)ciobanu.name
http://pub.mud.ro/~cia
+40 788 319 497

From: Wietse Venema on
adrian ilarion ciobanu:
> No matter how hard I try not to, I keep seeing similarities between
> ETRN and ATRN.

In both cases the client connects to the standard SMTP port. The
biggest difference is that ETRN creates new SMTP connections for
delivery, whereas ATRN delivers over the existing connection.

> Revisiting (3), I picture this Postfix setup that may be a classic
> for ETRN/ATRN: ETRN-enabled domains (if any) would have a normal
> transport definition, and ATRN-enabled domains would have a transport
> of atrn:[localhost]:DOMAIN_PORT. When the ODMR service accepts the
> ATRN request, it will hook up a listener on localhost:DOMAIN_PORT
> and call flush for the requested domain. The listener will simply
> proxy the data to the client's connection (without QUITs until
> the last requested domain has been fully processed).

This requires reserving one DOMAIN_PORT for each customer, and a
Postfix SMTP client concurrency of exactly 1. Since you're using
TCP, that would limit the setup to ports 1024..65535, or 64512
different customers.

But there are plenty other applications that want to bind to the
localhost interface, such as SMTP-based content filters, HTTP-based
servers such as CUPS printer management, and so on. UNIX-domain
sockets would provide a more private name space.

It is to be expected that the customer will log in over a TLS
session, so the SMTP server needs to open a socket to an ATRN server
and tell it what customer to receive mail for. The ATRN server then
does all the SMTP-specific stuff, like sending "220 servername" to
the Postfix SMTP client, and dropping QUIT commands from the Postfix
SMTP client so that the connection is not closed too early. The
Postfix SMTP server just reads/writes bytes between ATRN daemon
and customer until both parties close a connection, or until some
error or timeout (while respecting the built-in watchdog timer and
other safety mechanisms).

Customer <--> Postfix SMTPD <--> ATRND <--> Postfix SMTP client
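
A rough sketch of the QUIT handling described above, in Python rather than Postfix C. The `relay_commands` function and its callback arguments are hypothetical; the point is only that the ATRN daemon answers QUIT itself instead of forwarding it, so the customer's session stays open for the next delivery.

```python
def relay_commands(lines, forward, reply):
    """Relay SMTP command lines from the Postfix SMTP client to the customer.

    lines   -- iterable of command lines (bytes) sent by the SMTP client
    forward -- callable that sends a line on to the customer
    reply   -- callable that sends a line back to the SMTP client

    QUIT is answered locally with a 221 reply and never forwarded.
    Returns True if the SMTP client said QUIT.
    This sketch only looks at command lines; message content after DATA
    would have to be passed through untouched.
    """
    for line in lines:
        verb = line.strip().split(b" ", 1)[0].upper()
        if verb == b"QUIT":
            reply(b"221 2.0.0 Bye\r\n")
            return True
        forward(line)
    return False
```

For example, with `forward` and `reply` set to the `append` methods of two lists, a `MAIL FROM` line ends up in the forwarded list while a following `quit` only produces the local 221 reply.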

I would not duplicate (or somehow turn into a library) parts of
the Postfix SMTP server (welcome, ehlo, auth, starttls, etc.).
Duplication complicates code maintenance, and librarification
involves a great deal of change to mature code that requires a huge
amount of testing for which I do not have the time. Instead, I
would just add a few lines of code to the Postfix SMTP server to
connect to the ATRN daemon and to shuttle bytes between the ATRN
daemon and the customer; this code would be completely encapsulated
so it would not interfere with the operation of the Postfix SMTP
server proper.
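
The byte shuttling itself is a small loop. The following Python sketch (not Postfix code; `shuttle` and its `idle_timeout` parameter are made-up names) copies data in both directions between two connected sockets until both sides have closed, or until nothing happens for `idle_timeout` seconds.

```python
import select
import socket


def shuttle(customer, daemon, idle_timeout=300.0, bufsize=4096):
    """Copy bytes both ways between two sockets.

    Stops when both directions have seen end-of-file, or when neither side
    sends anything for idle_timeout seconds. Returns the byte counts as
    (customer_to_daemon, daemon_to_customer).
    """
    peer = {customer: daemon, daemon: customer}
    copied = {customer: 0, daemon: 0}
    open_sources = {customer, daemon}
    while open_sources:
        readable, _, _ = select.select(list(open_sources), [], [], idle_timeout)
        if not readable:
            break  # idle timeout
        for src in readable:
            data = src.recv(bufsize)
            if data:
                peer[src].sendall(data)
                copied[src] += len(data)
            else:
                # End-of-file from src: tell the other side, stop reading src.
                open_sources.discard(src)
                try:
                    peer[src].shutdown(socket.SHUT_WR)
                except OSError:
                    pass  # peer already gone
    return copied[customer], copied[daemon]
```

The loop never inspects the data, which is what keeps this part independent of the SMTP logic on either side.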

> And second, is the idea of reusing flush(8) in ATRN (as described
> above) having some flaws that I failed to see?

The flush(8) service is the only way to go, because the Postfix
design does not allow for multiple queue managers. The alternative
would be to deliver all ATRN customers to a holding area outside
of Postfix and roll your own SMTP client (as with qmail's ODMR).

With a single queue manager, there may be a delay between the ATRN
command and mail coming out.

Wietse

From: Wietse Venema on
Added a comment about how to avoid the transport:DOMAIN_PORT kludge.

Wietse Venema:
> adrian ilarion ciobanu:
> > No matter how hard I try not to, I keep seeing similarities between
> > ETRN and ATRN.
>
> In both cases the client connects to the standard SMTP port. The
> biggest difference is that ETRN creates new SMTP connections for
> delivery, whereas ATRN delivers over the existing connection.
>
> > Revisiting (3), I picture this Postfix setup that may be a classic
> > for ETRN/ATRN: ETRN-enabled domains (if any) would have a normal
> > transport definition, and ATRN-enabled domains would have a transport
> > of atrn:[localhost]:DOMAIN_PORT. When the ODMR service accepts the
> > ATRN request, it will hook up a listener on localhost:DOMAIN_PORT
> > and call flush for the requested domain. The listener will simply
> > proxy the data to the client's connection (without QUITs until
> > the last requested domain has been fully processed).
>
> This requires reserving one DOMAIN_PORT for each customer, and a
> Postfix SMTP client concurrency of exactly 1. Since you're using
> TCP, that would limit the setup to ports 1024..65535, or 64512
> different customers.
>
> But there are plenty other applications that want to bind to the
> localhost interface, such as SMTP-based content filters, HTTP-based
> servers such as CUPS printer management, and so on. UNIX-domain
> sockets would provide a more private name space.

Instead of using a DOMAIN_PORT kludge which requires "reserving"
a TCP port or UNIX-domain pathname per customer, it would make
sense to use the existing Postfix connection caching mechanism.

The idea is to push an open socket into the scache daemon (with a
suitable time to live) under the name of the customer's domain.
Then, the Postfix SMTP client would automatically find that open
socket and start talking SMTP over it.
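
To make the idea concrete, here is a toy model of such a cache in Python. It is not the interface of the Postfix scache(8) daemon; the `ConnectionCache` class, its method names, and the injectable clock are all hypothetical. It only shows the behaviour described above: save under a name, expire after a time to live, find by name.

```python
import time


class ConnectionCache:
    """Toy stand-in for a connection cache keyed by destination name."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # domain -> (expiry time, connection)

    def save(self, domain, conn, ttl):
        """Store an open connection under the customer's domain for ttl seconds."""
        self._entries[domain.lower()] = (self._clock() + ttl, conn)

    def find(self, domain):
        """Return and remove the cached connection, or None if absent or expired."""
        entry = self._entries.pop(domain.lower(), None)
        if entry is None:
            return None
        expires, conn = entry
        if self._clock() >= expires:
            close = getattr(conn, "close", None)
            if close is not None:
                close()  # expired: drop the connection
            return None
        return conn
```

In this model the SMTP server would call `save()` once the customer has authenticated and issued ATRN, and the delivery side would call `find()` with the destination domain before opening a new connection.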


From: adrian ilarion ciobanu on

> In both cases the client connects to the standard SMTP port. The
> biggest difference is that ETRN creates new SMTP connections for
> delivery, whereas ATRN delivers over the existing connection.

So should I understand that the RFC specifying port 366 as the ODMR port is just
a "should" and not a "must"? And that the ODMR client is supposed to try ATRN
on the SMTP port?
The RFC is not exactly clear on whether this is a requirement or ... plus I see ODMR ports poisoning
my /etc/services file. I don't know what the SMTP client expectations are, but
I assume Windows will use port 366 :)


>
> UNIX-domain
> sockets would provide a more private name space.

I agree. And somewhat faster, by dropping the TCP stack. But in this case
one would have to use the lmtp service for honoring flushes, and that would involve
some more headache in proxying the data to the client (LMTP to SMTP translation).
Or is it possible to use UNIX-domain addresses with smtp (the manpage says no)? If not, nothing stops me
from aliasing the loopback interface (127.0.0.2, 127.0.0.3, ...). Although it sounds like
a Windows hotfix.

>
> It is to be expected that the customer will log in over a TLS
> session, so the SMTP server needs to open a socket to an ATRN server
> and tell it what customer to receive mail for.
> The ATRN server then
> does all the SMTP-specific stuff, like sending "220 servername" to
> the Postfix SMTP client, and dropping QUIT commands from the Postfix
> SMTP client so that the connection is not closed too early. The
> Postfix SMTP server just reads/writes bytes between ATRN daemon
> and customer until both parties close a connection, or until some
> error or timeout (while respecting the built-in watchdog timer and
> other safety mechanisms).
>
> Customer <--> Postfix SMTPD <--> ATRND <--> Postfix SMTP client
>

I understand: you would rather have some operations (like data proxying)
duplicated than risk more bugs in the core, and that's why
there is a new ATRND service. The less code added to SMTPD, the smaller the chance
of new bugs. The only problem I see with this is that it only gets more complicated.
If I really wanted to free smtpd from ATRN code, then after a successful
authentication the ATRN-specific talk should be routed entirely to the ATRND service and forgotten,
by passing the socket descriptor. I think I can already picture VERY unexpected bugs
from data proxying alone if ATRND starts acting up and SMTPD is respectful. This is why
I wanted to reuse some SMTPD code in a totally decoupled ATRN service :)


my scenario, now that I'm talking SMTPD patching:

smtpd: if client_auth OK && request_policy_check OK then:
smtpd:   listen(transport_socket)      # be prepared for local smtp deliveries
smtpd:   begin_foreach_domain_loop
smtpd:     send_flush_request(domain)  # this is just etrn flush request multiplied
smtpd:   end_foreach_domain_loop
         # this is where the time gets spent
smtpd:   proxy from transport_socket to client_socket (filter quits, additional helos, etc.)
everybody_ends_conversations

yours (did I get it right?):
# request_policy_check sits here pretty neat, still (can reuse etrn policy check service)
smtpd: if client_auth OK && request_policy_check OK then:
smtpd:   connect(ATRN service)
smtpd:   send userinfo
atrnd:   listen(transport_socket)
atrnd:   begin_foreach_domain_loop
atrnd:     send_flush_request(domain)
atrnd:   end_foreach_domain_loop
         # this is where the time gets spent, 2 ops
atrnd:   proxy from transport_socket to smtpd_atrnd_socket (filter quits, additional helos, etc.)
smtpd:   proxy from smtpd_atrnd_socket to client_socket
everybody_ends_conversations


#passing the fd
smtpd: if client_auth OK && request_policy_check OK then:
smtpd: connect(ATRN service)
smtpd: pass fd
atrnd: do flush and proxying, smtpd not involved

Honestly I have no idea what security considerations fd passing may raise.
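
For reference, descriptor passing over a UNIX-domain socket looks like this in Python 3.9 or later, which provides socket.send_fds() and socket.recv_fds() as wrappers around SCM_RIGHTS ancillary data. This is a generic sketch, not Postfix code: the `hand_over` and `take_over` helpers and the one-line "userinfo" message are assumptions for illustration.

```python
import socket


def hand_over(channel, client_sock, userinfo):
    """smtpd side: pass the customer's socket plus a note about who it is."""
    socket.send_fds(channel, [userinfo.encode("ascii")], [client_sock.fileno()])
    client_sock.close()  # the receiver now holds its own copy of the descriptor


def take_over(channel):
    """atrnd side: receive the note and rebuild a socket from the descriptor."""
    data, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    client_sock = socket.socket(fileno=fds[0])
    return data.decode("ascii"), client_sock
```

Only the descriptor moves. Any TLS session state stays in the sending process, so a session that has done STARTTLS would still need smtpd in the path.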


#initial scenario was:

atrnd:totally decoupled service listening on port 366

# The only problem I had was code reuse,
# caused by the morbid desire to integrate as much as I can with the existing Postfix logic,
# as I really don't want to copy or reimplement auth, sasl, tls, ehlo, etc.

So passing the fd after some SMTPD setup seems fine, I believe.

> I would not duplicate
Duplication is exactly what I don't want to do, hence my wondering about
code reuse, etc., in the case where smtpd-like services with small protocol
differences are needed.
>(or somehow turn into a library) parts of
> the Postfix SMTP server (welcome, ehlo, auth, starttls, etc.).
> Duplication complicates code maintenance, and librarification
> involves a great deal of change to mature code that requires a huge
> amount of testing for which I do not have the time. Instead, I

I understand.

> would just add a few lines of code to the Postfix SMTP server to
> connect to the ATRN daemon and to shuttle bytes between the ATRN
> daemon and the customer; this code would be completely encapsulated
> so it would not interfere with the operation of the Postfix SMTP
> server proper.
>
> > And second, is the idea of reusing flush(8) in ATRN (as described
> > above) having some flaws that I failed to see?
>
> The flush(8) service is the only way to go, because the Postfix
> design does not allow for multiple queue managers. The alternative
> would be to deliver all ATRN customers to a holding area outside
> of Postfix and roll your own SMTP client (as with qmail's ODMR).
>
> With a single queue manager, there may be a delay between the ATRN
> command and mail coming out.

Hello, mail prioritization :)

>
> Wietse

Thank you very much


From: adrian ilarion ciobanu on
> Instead of using a DOMAIN_PORT kludge which requires "reserving"
> a TCP port or UNIX-domain pathname per customer, it would make
> sense to use the existing Postfix connection caching mechanism.
>
> The idea is to push an open socket into the scache daemon (with a
> suitable time to live) under the name of the customer's domain.
> Then, the Postfix SMTP client would automatically find that open
> socket and start talking SMTP over it.


So I would push the socket to scache after I'm done setting it up
in SMTPD (auth, policy checks) and forget about it. If it times
out before the local smtp client starts delivering, the client is welcome to reconnect.
If that has to happen, it will happen the same way in SMTPD or in SCACHE.
In fact it's descriptor passing, tweaked for smtp deliveries. Nice! :)






