From: arnuld on
I searched the archives a lot here and found zero solutions to my
problem. There is a lot of discussion on this subject and almost 99% of
it focuses on "You already know what the length of incoming data is". In
my case I don't. Here is a send() wrapper written by Beej to handle partial
send()s:

http://beej.us/guide/bgnet/output/html/multipage/advanced.html#sendall

#include <sys/types.h>
#include <sys/socket.h>

int sendall(int s, char *buf, int *len)
{
    int total = 0;        // how many bytes we've sent
    int bytesleft = *len; // how many we have left to send
    int n;

    while(total < *len) {
        n = send(s, buf+total, bytesleft, 0);
        if (n == -1) { break; }
        total += n;
        bytesleft -= n;
    }

    *len = total; // return number actually sent here

    return n==-1?-1:0; // return -1 on failure, 0 on success
}



The point is how to create a recv() like that when you don't know how
much data you are going to receive. I only know the maximum number of
characters I am ever gonna receive. Data does come in fragments on my
socket (as it's large, around 1400 characters).

Any ideas ?




--
www.lispmachine.wordpress.com
my email is @ the above blog.

From: Nicolas George on
arnuld wrote in message <pan.2010.06.30.10.53.05(a)invalid.address>:
> int sendall(int s, char *buf, int *len)
> {
> int total = 0; // how many bytes we've sent
> int bytesleft = *len; // how many we have left to send
> int n;
>
> while(total < *len) {
> n = send(s, buf+total, bytesleft, 0);
> if (n == -1) { break; }
> total += n;
> bytesleft -= n;
> }
>
> *len = total; // return number actually sent here
>
> return n==-1?-1:0; // return -1 on failure, 0 on success
> }
>
>
> The point is how to create a recv() like that when you don't know how
> much data you are going to receive. I only know the maximum number of
> characters I am ever gonna receive. Data does come in fragments on my
> socket (as its large, around 1400 characters).

What kind of socket do you use: stream or datagram? The function you quoted
is suitable for a stream socket, but the question you are asking only makes
some sense for datagram sockets.
From: arnuld on
> On Wed, 30 Jun 2010 11:01:14 +0000, Nicolas George wrote:

> What kind of socket do you use: stream or datagram? The function you
> quoted is suitable for a stream socket, but the question you are asking
> only makes some sense for datagram sockets.

I am talking of stream sockets (TCP connection only).

From: Nicolas George on
arnuld wrote in message <pan.2010.06.30.11.28.42(a)invalid.address>:
> I am talking of stream sockets (TCP connection only).

Then you cannot do anything: stream sockets have no notion of records;
that's the definition. Packets sent by the application can be split or
aggregated in any way because of network or scheduling necessities. The
contents of the stream have to show the record separation in some way,
either by prefixing each record with a length header or by having an end
marker. In the latter case, which I do not recommend, you need to
dynamically reallocate your receiving buffer, unless you are willing to set
an arbitrary limit.
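
For the end-marker variant with a fixed limit, a minimal sketch in C could
look like the following. It is only an illustration: the names linebuf and
recv_until, the MAX_RECORD value, and the use of a single marker byte are
assumptions, not anything from the original posts. Bytes that arrive after
the marker are kept in the buffer for the next call.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_RECORD 2048          /* assumed upper bound on one record */

struct linebuf {                 /* hypothetical helper type */
    size_t used;                 /* bytes currently held in data[] */
    char data[MAX_RECORD];
};

/* Receive until `marker` is seen. The record (without the marker) is copied
   to `out`, which must hold MAX_RECORD bytes, and NUL-terminated.
   Returns the record length, or -1 on error, EOF, or a record that does
   not fit within MAX_RECORD. */
static ssize_t recv_until(int s, struct linebuf *lb, char marker, char *out)
{
    for (;;) {
        char *end = memchr(lb->data, marker, lb->used);
        if (end != NULL) {
            size_t len = (size_t)(end - lb->data);
            memcpy(out, lb->data, len);
            out[len] = '\0';
            lb->used -= len + 1;                  /* drop record and marker */
            memmove(lb->data, end + 1, lb->used); /* keep leftover bytes */
            return (ssize_t)len;
        }
        if (lb->used == sizeof lb->data)
            return -1;                            /* no marker within limit */
        ssize_t n = recv(s, lb->data + lb->used,
                         sizeof lb->data - lb->used, 0);
        if (n < 0 && errno == EINTR)
            continue;                             /* interrupted, retry */
        if (n <= 0)
            return -1;                            /* error or peer closed */
        lb->used += (size_t)n;
    }
}
```

The caller zero-initializes a struct linebuf once per connection and reuses
it for every call on that socket.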

Of course, you should wrap everything in a library that does buffering, so
that reading a 2-octet length header, then 42 octets of payload, then again
a 2-octet length, then 36 octets of payload would result in a single system
call.
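
A rough sketch of such a buffering layer in C, under stated assumptions: the
names rbuf, rbuf_init and rbuf_read and the 4096-byte buffer size are made
up for illustration. recv() is only called when the buffer is empty, so
small reads of headers and payloads are served from memory whenever the
bytes have already arrived.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

struct rbuf {                    /* hypothetical buffered reader */
    int fd;
    size_t pos, end;             /* unread bytes are data[pos..end) */
    char data[4096];             /* assumed buffer size */
};

static void rbuf_init(struct rbuf *rb, int fd)
{
    rb->fd = fd;
    rb->pos = rb->end = 0;
}

/* Copy exactly n bytes into dst. Returns 0 on success, -1 on error or if
   the peer closes the connection before n bytes were delivered. */
static int rbuf_read(struct rbuf *rb, void *dst, size_t n)
{
    char *out = dst;
    while (n > 0) {
        if (rb->pos == rb->end) {                 /* buffer empty: refill */
            ssize_t got = recv(rb->fd, rb->data, sizeof rb->data, 0);
            if (got < 0 && errno == EINTR)
                continue;                         /* interrupted, retry */
            if (got <= 0)
                return -1;                        /* error or EOF */
            rb->pos = 0;
            rb->end = (size_t)got;
        }
        size_t avail = rb->end - rb->pos;
        size_t take = avail < n ? avail : n;
        memcpy(out, rb->data + rb->pos, take);
        rb->pos += take;
        out += take;
        n -= take;
    }
    return 0;
}
```

With this, rbuf_read(&rb, hdr, 2) followed by rbuf_read(&rb, payload, 42)
costs one recv() if both pieces were already queued on the socket.
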
From: Ersek, Laszlo on
On Wed, 30 Jun 2010, arnuld wrote:

>> On Wed, 30 Jun 2010 11:01:14 +0000, Nicolas George wrote:
>
>> What kind of socket do you use: stream or datagram? The function you
>> quoted is suitable for a stream socket, but the question you are asking
>> only makes some sense for datagram sockets.
>
> I am talking of stream sockets (TCP connection only).

The "technique" you're looking for is called "loop fusion".

http://en.wikipedia.org/wiki/Loop_fusion

That is, you can't use separate receive loops and processing loops. You
have to fuse them. You have to parse (perhaps not completely process, but
pre-parse) the octet stream as it is coming in.

One usual solution is to length-prefix the data. Not necessarily
explicitly; the request type, also present in a programmer-defined fixed
header, can imply the length, for example. You would have a four-phase
parser:

1. Loop until at least N bytes come in (fixed size header).

2. Supposing M >= N bytes arrived, parse the header. The header tells you
(because the sender computed it) how many octets the body will contain.
Let's call that integer B.

3. Loop until B-(M-N) further bytes arrive.

4. Process the complete request.

In the first phase you can ensure that M equals N exactly -- simply don't
try to read more octets than exactly required to complete the fixed size
header.
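
The four phases above might be sketched in C as follows. This is only an
illustration under assumptions that are not in the original post: the fixed
header is a 2-byte big-endian length (so N = 2), and the helper names
recv_exact and recv_record are invented. Because recv_exact never asks for
more than it needs, M equals N and phase 3 reads exactly B bytes.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Read exactly len bytes. Returns 0 on success, -1 on error or if the
   peer closes the connection early. */
static int recv_exact(int s, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(s, p, len, 0);
        if (n == 0)
            return -1;                 /* peer closed mid-record */
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive one record. Returns a malloc'ed, NUL-terminated body that the
   caller must free, and stores its length in *body_len; NULL on failure. */
static char *recv_record(int s, size_t *body_len)
{
    uint16_t hdr;

    /* Phase 1: loop until the fixed-size header is complete. */
    if (recv_exact(s, &hdr, sizeof hdr) == -1)
        return NULL;

    /* Phase 2: parse the header to learn the body length B. */
    size_t b = ntohs(hdr);
    char *body = malloc(b + 1);
    if (body == NULL)
        return NULL;

    /* Phase 3: loop until all B body bytes have arrived. */
    if (recv_exact(s, body, b) == -1) {
        free(body);
        return NULL;
    }
    body[b] = '\0';
    *body_len = b;

    /* Phase 4: the caller processes the complete record. */
    return body;
}
```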

For example, an HTTP/1.0 client can send a Content-Length header that
allows the receiver to allocate the block at once, then read the entire
body in a single loop. HTTP/1.1 introduces Chunked Transfer Encoding,
which does the same, but for multiple smaller segments. Sometimes the
sender doesn't know the full length in advance, and would have to format
the message twice to precompute that. It may know the length of the
individual chunks in advance, OTOH.

http://en.wikipedia.org/wiki/Chunked_transfer_encoding

If you have to interleave processing with reading, you'll need a parser
that exports a "feed me" interface. You block in select() or poll() or
something equivalent, and whenever bytes come in, you feed them to the
corresponding parser. The parser calls back to your code whenever it
executes a grammatical reduction or notices an "event". For the idea, see
e.g.

http://en.wikipedia.org/wiki/Simple_API_for_XML#XML_processing_with_SAX
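
To make the "feed me" idea concrete, here is a small hand-written sketch in
C. It is not generated by any parser tool and is not taken from the post:
the 2-byte big-endian length framing, MAX_BODY, and the names parser,
parser_feed, stats and on_record are all assumptions for illustration. The
caller passes in whatever bytes recv() returned after select() or poll()
reported the socket readable, and the parser invokes the callback once per
complete record, however the stream happened to be fragmented.

```c
#include <stddef.h>

#define MAX_BODY 2048            /* assumed upper bound on one record */

typedef void (*record_cb)(const char *body, size_t len, void *ctx);

struct parser {                  /* hypothetical incremental parser */
    unsigned hdr_have;           /* header bytes seen so far: 0, 1 or 2 */
    size_t body_len;             /* length announced by the header */
    size_t body_have;            /* body bytes stored so far */
    record_cb cb;
    void *ctx;
    char body[MAX_BODY];
};

static void parser_init(struct parser *p, record_cb cb, void *ctx)
{
    p->hdr_have = 0;
    p->body_len = 0;
    p->body_have = 0;
    p->cb = cb;
    p->ctx = ctx;
}

/* Feed any number of bytes. Returns 0, or -1 if a header announces a
   record larger than MAX_BODY. */
static int parser_feed(struct parser *p, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p->hdr_have < 2) {
            /* Still collecting the 2-byte big-endian length header. */
            p->body_len = (p->body_len << 8) | data[i];
            p->hdr_have++;
            if (p->hdr_have == 2 && p->body_len > MAX_BODY)
                return -1;
            if (p->hdr_have < 2 || p->body_len > 0)
                continue;        /* need more header or body bytes */
        } else {
            p->body[p->body_have++] = (char)data[i];
            if (p->body_have < p->body_len)
                continue;        /* body not complete yet */
        }
        /* A complete record (possibly empty): hand it to the caller. */
        p->cb(p->body, p->body_len, p->ctx);
        p->hdr_have = 0;
        p->body_len = 0;
        p->body_have = 0;
    }
    return 0;
}

/* Example callback: counts records and remembers the last length. */
struct stats { int records; size_t last_len; };

static void on_record(const char *body, size_t len, void *ctx)
{
    struct stats *st = ctx;
    (void)body;
    st->records++;
    st->last_len = len;
}
```

Keeping one struct parser per connection lets a single select() loop serve
many sockets, since no call ever blocks waiting for the rest of a record.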

Projects to check out:

http://www.hwaci.com/sw/lemon/
http://www.complang.org/ragel/

lacos