From: Arkadiy on
On Jan 15, 12:10 pm, David Schwartz <dav...(a)webmaster.com> wrote:

> You can also use connection establishment to verify that the server is
> operational. If you can set up a new connection to it, it's not dead.
> However, sending a 'version' command will have (approximately) the
> same effect.

I guess I am kind of slow... Is it all about trying to reach server's
TCP without going to the server itself? What is this 'version'
command?

Thanks,
Arkadiy
From: Rick Jones on
David Schwartz <davids(a)webmaster.com> wrote:
> You can also use connection establishment to verify that the server is
> operational. If you can set up a new connection to it, it's not dead.

Since the connection will complete before the server application calls
accept(), simply establishing the connection (ie connect() completes)
does not mean the server application is in good health. Indeed it
isn't "dead" in so far as the listen endpoint is still there, but it
could be "undead" as it were and hung.
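
To illustrate the point, here is a minimal sketch in Python (my choice of language, not anything from the thread). It assumes a loopback listener that never calls accept(); the function name connect_without_accept and the "version\r\n" probe bytes are made up for the example. The kernel completes the handshake and even buffers the client's data, yet no reply ever arrives.

```python
import socket


def connect_without_accept(reply_wait=0.2):
    """Return (connected, got_reply) against a listener that never accepts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)               # the kernel now completes handshakes
    # Deliberately no listener.accept(): this plays the hung application.
    try:
        client = socket.create_connection(listener.getsockname(), timeout=2)
    except OSError:
        listener.close()
        return False, False
    try:
        client.sendall(b"version\r\n")  # lands in kernel buffers, nobody reads it
        client.settimeout(reply_wait)
        try:
            got_reply = bool(client.recv(4096))
        except socket.timeout:
            got_reply = False
        return True, got_reply
    finally:
        client.close()
        listener.close()
```

On a typical system this reports that the connection succeeded and that no reply came back, which is the "undead" case: connect() alone says nothing about whether the application is servicing its socket.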

rick jones
--
oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: David Schwartz on
On Jan 16, 7:08 am, Arkadiy <vertl...(a)gmail.com> wrote:
> On Jan 15, 12:10 pm, David Schwartz <dav...(a)webmaster.com> wrote:
>
> > You can also use connection establishment to verify that the server is
> > operational. If you can set up a new connection to it, it's not dead.
> > However, sending a 'version' command will have (approximately) the
> > same effect.
>
> I guess I am kind of slow... Is it all about trying to reach server's
> TCP without going to the server itself?

It is about detecting a loss of connectivity.

> What is this 'version'
> command?

You pointed me to the specification of the protocol. Did you not read
it?

Consider:

1) You send a query.

2) The server gets the query and starts working on it.

3) The server crashes.

4) The server restarts.

5) You are still waiting on 'read' for the reply, and you will wait
forever. The server has no idea you are waiting and has no reason to
ever send you a packet, nor do you have any reason to send the server
a packet.

In this case (or when you suspect it), you can simply send a 'version'
command to the server. This will result in an outbound packet, which
the restarted server will answer with an RST because it no longer has
any state for the connection.

Alternatively, if the server stays down, you have the same problem.
You will not send a packet to the server and it will not send one to
you. The solution is the same: send a 'version' command. Your send
will time out, allowing you to detect the loss of the connection.

The problem is that TCP does not test an idle connection to make sure
it is still alive, unless you enable the optional keepalive, which by
default waits two hours before probing. Fortunately, your protocol
provides a way to do just this -- send a "version" command and see
what happens.

1) If the server is gone, the send will time out, causing a local TCP
error.

2) If the server rebooted, it will RST the packet, causing a local TCP
error.

3) If the server is overloaded (or working on the command you sent),
nothing will happen until later, when you will get the reply to your
command.

4) If the server is fine, you will get an immediate reply.
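
The four outcomes above can be told apart roughly as follows. This is a sketch in Python, assuming a text protocol in which "version\r\n" draws a short reply; the function name classify_probe and the returned labels are hypothetical. Two caveats: the "gone" case only shows up after the kernel has given up retransmitting, which can take minutes, so a short wait usually reports "pending" first; and if a query is still outstanding, the first bytes received may be the reply to that query rather than to the probe.

```python
import errno
import socket


def classify_probe(sock, wait_seconds, probe=b"version\r\n"):
    """Send a probe and classify what comes back within wait_seconds."""
    sock.settimeout(wait_seconds)
    try:
        sock.sendall(probe)
        data = sock.recv(4096)
    except (ConnectionResetError, BrokenPipeError):
        return "reset"        # case 2: the peer no longer knows this connection
    except OSError as exc:
        if exc.errno in (errno.ETIMEDOUT, errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "gone"     # case 1: the kernel gave up reaching the peer
        if isinstance(exc, socket.timeout):
            return "pending"  # case 3: nothing yet; busy, or still retransmitting
        raise
    if not data:
        return "closed"       # orderly shutdown by the peer (not in the list above)
    return "reply"            # case 4: the server answered
```

The errno check comes before the socket.timeout check because newer Python versions report a kernel-level ETIMEDOUT with the same exception class as a local timeout; only the errno tells them apart.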

So the problem is this:

Suppose you haven't fully timed out yet and you're not willing to give
up. But it has been a long time, and you're suspicious -- maybe the
server crashed or died. If you just wait to time out, you're wasting
time that you could use to reconnect to the server.

This is where you need an intermediate "soft timeout" solution.
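
One way such a soft timeout might look, sketched in Python. The function name recv_with_soft_timeout, the two timeout parameters, and the probe bytes are assumptions for illustration, not part of any protocol discussed here. The idea: wait quietly until the soft limit, send one probe to force traffic onto the connection, then keep waiting until the hard limit. A reset or a kernel-level timeout provoked by the probe surfaces as an exception from recv().

```python
import socket
import time


def recv_with_soft_timeout(sock, soft_seconds, hard_seconds, probe=b"version\r\n"):
    """Wait for data; probe once at the soft limit, give up at the hard limit."""
    start = time.monotonic()
    probed = False
    while True:
        elapsed = time.monotonic() - start
        if elapsed >= hard_seconds:
            raise TimeoutError("hard timeout: no reply")
        limit = hard_seconds if probed else min(soft_seconds, hard_seconds)
        sock.settimeout(max(limit - elapsed, 0.001))
        try:
            return sock.recv(4096)
        except socket.timeout as exc:
            if exc.errno is not None:
                raise  # kernel-reported timeout: the connection is dead
            if not probed:
                # Soft limit reached: provoke an RST or a retransmission
                # timeout if the server has gone away.
                sock.sendall(probe)
                probed = True
```

A caller would treat ConnectionResetError or the hard-limit TimeoutError as the signal to reconnect, and would need to discard the eventual reply to the probe if the server turns out to be merely slow.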

DS
From: David Schwartz on
On Jan 16, 10:25 am, Rick Jones <rick.jon...(a)hp.com> wrote:
> David Schwartz <dav...(a)webmaster.com> wrote:
> > You can also use connection establishment to verify that the server is
> > operational. If you can set up a new connection to it, it's not dead.

> Since the connection will complete before the server application calls
> accept(), simply establishing the connection (ie connect() completes)
> does not mean the server application is in good health. Indeed it
> isn't "dead" in so far as the listen endpoint is still there, but it
> could be "undead" as it were and hung.

No, it does not mean the server application is in good health, but it
does mean the server is operational. If you aren't going to use the
connection immediately but want to make sure the server is stable, you
can send a "version" command and see how long it takes for you to get
a reply.

DS
From: Arkadiy on
On Jan 16, 1:25 pm, Rick Jones <rick.jon...(a)hp.com> wrote:
> David Schwartz <dav...(a)webmaster.com> wrote:
> > You can also use connection establishment to verify that the server is
> > operational. If you can set up a new connection to it, it's not dead.
>
> Since the connection will complete before the server application calls
> accept(), simply establishing the connection (ie connect() completes)
> does not mean the server application is in good health. Indeed it
> isn't "dead" in so far as the listen endpoint is still there, but it
> could be "undead" as it were and hung.

I think this just adds one more bit of information to help to decide
what to do next. If the request timed out, something is wrong. If
the request timed out and connect succeeded, it looks like the server
is congested or hanging. If the connect times out, either the
server is unreachable or the network is congested. If the connect
fails, we will know what the error is... Still not enough information
though...
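
Written down as code, the decision logic above might look like this. It is only a sketch in Python; the helper names try_connect and diagnose and the outcome labels "ok", "timeout", and "error" are invented for the example.

```python
import socket


def try_connect(address, timeout):
    """Attempt a fresh connection; return (outcome, error)."""
    try:
        socket.create_connection(address, timeout=timeout).close()
    except socket.timeout:
        return "timeout", None
    except OSError as exc:
        return "error", exc
    return "ok", None


def diagnose(request_timed_out, connect_outcome, connect_error=None):
    """Combine the request result with the connect result into a guess."""
    if connect_outcome == "error":
        return "connect failed: %s" % connect_error
    if connect_outcome == "timeout":
        return "server unreachable or network congested"
    if connect_outcome != "ok":
        raise ValueError("unknown connect outcome: %r" % (connect_outcome,))
    if request_timed_out:
        return "server congested or hanging"
    return "no problem observed"
```

As noted, this narrows things down but cannot tell a congested server from a hung one.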

Regards,
Arkadiy