From: Hector Santos on
Tom Serface wrote:

> I think Joe is saying it is meaningless these days because there is no
> carriage to return any longer. I think most of us consider \n
> synonymous with Enter and that implies the start of a new line. A lot
> of this is carry over from the days of teletype and paper terminals and
> we're just stuck with it as part of ASCII.
>

I just wanted to add, yes, \n is viewed as a new line, but that is
only in the DOS/Windows world. Not the case in the "other" worlds!

In DOS/Windows programming, the default when you open a text file is
COOKED mode. COOKED means the runtime does translations for you, in
both directions: \n becomes \r\n on output and \r\n becomes \n on
input. In RAW mode, there is no translation and you must be
specific, <CR><LF> or \r\n.
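A rough way to see the cooked/raw difference is to write the same two characters through both modes and count what actually lands on disk. This is only a sketch: the helper name written_size and the file names in the usage note are made up for illustration, and only standard C stdio calls are used.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` to `path` using the given fopen()
   mode ("w" = text/cooked, "wb" = binary/raw), then reopen the file in
   binary mode and count the bytes that were really stored.
   Returns the byte count, or -1 on error. */
long written_size(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    long size = 0;
    int c;

    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");     /* raw read: no translation on the way back */
    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        size++;
    fclose(fp);
    return size;
}
```

Calling written_size("raw.tmp", "wb", "a\n") should give 2 everywhere. With mode "w" the same call should give 3 under the MS runtime, because the \n is stored as \r\n, and 2 on Unix, where text and binary mode behave the same.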

In MS C/C++, the file I/O runtime library function

_setmode()

can be used to set/change the binary (RAW) or text (COOKED)
translation mode. For example,

_setmode( _fileno( stdin ), _O_BINARY );
_setmode( _fileno( stdout ), _O_BINARY );

will make a standard I/O console program compatible with the
UNIX/MAC/DOS world because you are dealing with RAW bytes, with no
transparent translations being done.

Here is a quick portable "fetch" program you can use to GET an HTTP
resource from a web site:

================= CUT HERE ======================
/* fetch.c -- fetch via HTTP and dump the entire session to stdout
   very stupidly. Illustrates the need to change the stdout
   default _O_TEXT cooked mode to _O_BINARY raw mode.
*/

#ifdef _WIN32

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <winsock.h>
#include <fcntl.h>
#include <io.h>

#pragma comment(lib,"wsock32.lib")
/* map the POSIX-style calls used below onto Winsock */
#define close(a) closesocket(a)
#define read(a,b,c) recv(a,b,c,0)
#define write(a,b,c) send(a,b,c,0)

#else
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <signal.h>
#endif

int main(int argc, char **argv)
{
    int pfd;                /* fd from socket */
    int len;
    char *hostP, *fileP;
    char buf[1024];
    struct hostent *hP;     /* for host */
    struct sockaddr_in sin;
#ifdef _WIN32
    WSADATA wd;

    if (WSAStartup(MAKEWORD(1, 1), &wd) != 0) {
        exit(1);
    }
    /* raw mode: no \n <-> \r\n translation on the standard streams */
    _setmode( _fileno( stdin ), _O_BINARY );
    _setmode( _fileno( stdout ), _O_BINARY );
#endif

    if ( argc != 3 ) {
        fprintf( stderr, "Usage: %s host file\n", argv[0] );
        exit( 1 );
    }

    hostP = argv[1];
    fileP = argv[2];

    /* the request is built in buf; leave room for the fixed header text */
    if ( strlen(hostP) + strlen(fileP) > sizeof(buf) - 64 ) {
        fprintf( stderr, "host and file names are too long\n" );
        exit( 1 );
    }

    hP = gethostbyname( hostP );
    if ( hP == NULL ) {
        fprintf( stderr, "Unknown host \"%s\"\n", hostP );
        exit( 1 );
    }

    pfd = socket( AF_INET, SOCK_STREAM, 0 );
    if ( pfd < 0 ) {
        perror( "socket" );
        exit( 1 );
    }

    memset( &sin, 0, sizeof(sin) );
    sin.sin_family = hP->h_addrtype;
    memcpy( (char *)&sin.sin_addr, hP->h_addr, hP->h_length );
    sin.sin_port = htons( 80 );
    if ( connect( pfd, (struct sockaddr *)&sin, sizeof(sin) ) < 0 ) {
        perror( "connect" );
        close( pfd );
        exit( 1 );
    }

    /* HTTP wants explicit \r\n line ends, whatever the local convention */
    sprintf( buf, "GET %s HTTP/1.0\r\n"
                  "Host: %s\r\n"
                  "Accept: */*\r\n\r\n", fileP, hostP);

    if ( write( pfd, buf, (int)strlen(buf)) < 0 ) {
        perror( "write" );
        close( pfd );
        exit( 1 );
    }

    /* dump headers and body byte for byte */
    while ( ( len = read( pfd, buf, sizeof(buf)) ) > 0)
        fwrite( buf, 1, len, stdout );

    close( pfd );
    fflush( stdout );
    exit( 0 );
}
================= CUT HERE ======================



--
HLS
From: Joseph M. Newcomer on
A cynic after my own heart.

I've used XPAT in the past. But the attorneys had the development people on a short leash
about "open source", proving once again that the GPL is one of the worst ideas to have
ever been invented.
joe

On Fri, 22 Jan 2010 23:50:31 -0800, "Tom Serface" <tom(a)camaswood.com> wrote:

>Well, you could have used Xerces and spent 8 days getting it to work instead
>:o)
>
>Tom
>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
>news:99vkl5p0cdc2ngvsqpdn0h9rhr5sn8fnal(a)4ax.com...
>> I can generally write an FSM parser in an hour or so, depending on the
>> syntax. I wrote an
>> XML parser, recursive descent, in eight hours, start to finish. The
>> constraints were
>> strange, and involved "no public source code, ever", which I thought was
>> foolish, but they
>> were paying. I did tell them there were a number of cheats, such as it
>> did not handle all
>> possible encodings of XML files, a constraint they found acceptable.
>> joe
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Joseph M. Newcomer on
See below...
On Sat, 23 Jan 2010 02:43:35 -0500, Hector Santos <sant9442(a)nospam.gmail.com> wrote:

>
>Joseph M. Newcomer wrote:
>
>> One of the rules we developed about forty years ago (1968) is that \r is meaningless noise
>> treated as whitespace, and \n is a newline. This works until you import a text file
>> creating on a pre-OS X Mac, where \r is the newline character.
>> joe
>
>
>Don't confuse raw vs cooked vs display/print device vs storage systems!
****
Historically, this has been a problem since the 1960s, AT LEAST. So it is unlikely I have
confused them.
****
>
>\r\n has their basis as hardware device codes for the harder devices
>of the day; printers, teletypes, dumb terminals, etc
>
>\r <CR> is what it is - a carriage return (move it to the first
>column) of the printer head! Note the operative word - Carriage!
****
This is news? I knew this in 1965.
****
>
>\n <LF> is what it is - a line feed (move carriage head down one line)
>of the printer head!
****
In 1968 I wrote an optimized plot program that took advantage of this capability, so it is
unlikely I would not understand it.
****
>
>When the consoles came, the printer head was now your cursor. That is
>why it is paired whether there are from translations or not.
>
>Now, your Terminal and Printer could have OPTIONAL translation for an
>automatic line feed (/n) with each carriage return (/r) which means it
>APPEAR as it was a line delimiter as in in the unix wienie world. In
>the MAC word, a /n is the line delimiter. DOS of courses uses /r/n
>(<CR><LF>) pairs.
****
Yes, and generally we considered this a real mistake in the design, done by engineers who
had no concept of reality.
****
>
>But it is your terminal or printer providing the illusion with
>translations which may be default depending on the OS it connected
>to). So if you dumped a unix file or mac file to a printer, it did
>the proper translation for you. The printer or carriage or laser
>point did not change, you still need to tell it to go left, right, up
>or down!
>
>Geez, Meaningless?
****
I was not talking about display. I was talking about reading stored information from a
file. At no point were we talking about displays; we were talking about parsing files. Or
had you missed that little point?

In parsing a file, most systems use one of two conventions: \n to end a line (Unix) or
\r\n to end a line (MS-DOS, Windows). The Mac introduced a serious aberration
into this, using \r to end a line. Across dozens of operating systems, over many decades,
the only conventions used were either the Unix convention or what became the MS-DOS
convention (it adopted a long-standing tradition dating to the mid-1960s). In parsing
files, therefore, we learned early on that \r is meaningless and \n is a line terminator.
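
As an illustration of that parsing rule (treat \r as noise, let \n end the line), here is a minimal sketch in C. The function names strip_cr and count_lines are hypothetical, not taken from any library, and the code assumes the whole text is already in memory as a NUL-terminated string.

```c
#include <stddef.h>

/* Copy src to dst, dropping every '\r' so that only '\n' ends a line.
   dst must have room for strlen(src) + 1 bytes.
   Returns the length of the result. */
size_t strip_cr(char *dst, const char *src)
{
    size_t n = 0;

    for (; *src != '\0'; ++src) {
        if (*src != '\r')
            dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}

/* Count lines under the same rule: '\n' terminates a line and '\r' is
   ignored. A final line with no trailing '\n' still counts. */
size_t count_lines(const char *text)
{
    size_t lines = 0;
    int pending = 0;    /* non-zero while inside an unterminated line */

    for (; *text != '\0'; ++text) {
        if (*text == '\n') {
            lines++;
            pending = 0;
        } else if (*text != '\r') {
            pending = 1;
        }
    }
    return lines + (pending ? 1 : 0);
}
```

Fed "a\r\nb\nc", count_lines reports three lines whether the file came from Unix or DOS. Fed a pre-OS X Mac file such as "a\rb\rc", it reports one, which is exactly the case where the rule stops working.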

Now, if you want to change the discussion to display on dumb terminals, we can have a
completely different discussion. For example, the IBM 2741 vs. the IBM 1050 conventions,
the Model 33 TTY conventions, the conventions used by perhaps a dozen different video
terminal vendors, etc. All of these involve how those characters were used to DISPLAY
information. When parsing information, however, \r and \n are both considered
"whitespace" and the \r is meaningless. So don't confuse display with storage. Oh, wait
a minute, that's what you told me...
joe

>
>This again is a example of insane revisionist comments.
From: Joseph M. Newcomer on
Logically, many rendering programs will treat \r as "reset cursor to beginning margin"
(which in most languages is the left side of the display area, but in some languages like
Arabic and Hebrew is the right side). But that's a display technique, and he somehow made
the leap from my discussing how to parse stored data to thinking I was talking about
display rendering, a topic that was not under discussion.
joe

On Fri, 22 Jan 2010 23:58:13 -0800, "Tom Serface" <tom(a)camaswood.com> wrote:

>I think Joe is saying it is meaningless these days because there is no
>carriage to return any longer. I think most of us consider \n synonymous
>with Enter and that implies the start of a new line. A lot of this is
>carry over from the days of teletype and paper terminals and we're just
>stuck with it as part of ASCII.
>
>Tom
>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:uqDAH$$mKHA.1548(a)TK2MSFTNGP04.phx.gbl...
>>
>> Joseph M. Newcomer wrote:
>>
>>> One of the rules we developed about forty years ago (1968) is that \r is
>>> meaningless noise
>>> treated as whitespace, and \n is a newline. This works until you import
>>> a text file
>>> creating on a pre-OS X Mac, where \r is the newline character.
>>> joe
>>
>>
>> Don't confuse raw vs cooked vs display/print device vs storage systems!
>>
>> \r\n has their basis as hardware device codes for the harder devices of
>> the day; printers, teletypes, dumb terminals, etc
>>
>> \r <CR> is what it is - a carriage return (move it to the first column) of
>> the printer head! Note the operative word - Carriage!
>>
>> \n <LF> is what it is - a line feed (move carriage head down one line) of
>> the printer head!
>>
>> When the consoles came, the printer head was now your cursor. That is why
>> it is paired whether there are from translations or not.
>>
>> Now, your Terminal and Printer could have OPTIONAL translation for an
>> automatic line feed (/n) with each carriage return (/r) which means it
>> APPEAR as it was a line delimiter as in in the unix wienie world. In the
>> MAC word, a /n is the line delimiter. DOS of courses uses /r/n (<CR><LF>)
>> pairs.
>>
>> But it is your terminal or printer providing the illusion with
>> translations which may be default depending on the OS it connected to).
>> So if you dumped a unix file or mac file to a printer, it did the proper
>> translation for you. The printer or carriage or laser point did not
>> change, you still need to tell it to go left, right, up or down!
>>
>> Geez, Meaningless?
>>
>> This again is a example of insane revisionist comments.
>>
>> --
>> HLS
From: Stanza on
Thanks for everyone's contributions. There's quite a lot here to digest. Re
strtok - I remember using this many years ago, and as far as I recall it
would jump over empty csv entries, so the string "one,,three" would return
"one" followed by "three".