From: Goran on
On Jan 22, 9:37 pm, Hector Santos <sant9...(a)nospam.gmail.com> wrote:
> Tom Serface wrote:
> > One thing most parsers don't handle correctly, that I've seen, is
> > double double quotes for strings if you want to have a quote as part of
> > the string like:
>
> > "This is  my string "Tom" that I am using", "Next token", "Next token"
>
> > In the above, from my perspective, the parser should read the entire
> > first string since we didn't come to a delimiter yet, but a lot of
> > tokenizers choke on this sort of thing.
>
> Often, it takes two to tango.  A writer needs to escape tokens in
> order to reach some level of sanity, i.e., borrowing the C backslash for \".
>
>      "This is  my string \"Tom\" that I am using"
>
> Or use some encoding method, e.g. HTTP escaping! :)

You really should stop with NIH.

I have never seen HTTP escaping in CSV files. I know of two relevant
conventions: the "Unix" one for D(elimiter)SV files, where the escape
character is a backslash, and the "Windows" (RFC 4180) one, where a quote
character is escaped by doubling it. What the hell do you think you are
doing, inventing things like that?
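
To make the two conventions concrete, here is a minimal sketch using Python's
standard csv module (my choice for illustration only; nobody in this thread
named a library). The helper names parse_rfc4180 and parse_backslash are
hypothetical, and the sample lines are Tom's example rewritten with each
escaping style and without the space after each comma.

```python
import csv

# RFC 4180 style: a literal quote inside a quoted field is written as "".
rfc_line = '"This is my string ""Tom"" that I am using","Next token","Next token"'

# Backslash style: a literal quote inside a quoted field is written as \".
backslash_line = r'"This is my string \"Tom\" that I am using","Next token","Next token"'


def parse_rfc4180(line):
    # doublequote=True is the csv module's default; spelled out for clarity.
    return next(csv.reader([line], doublequote=True))


def parse_backslash(line):
    # Turn off quote doubling and treat the backslash as the escape character.
    return next(csv.reader([line], doublequote=False, escapechar="\\"))


# Both conventions should decode to the same three fields.
expected = ['This is my string "Tom" that I am using', "Next token", "Next token"]
print(parse_rfc4180(rfc_line) == expected)
print(parse_backslash(backslash_line) == expected)
```

Tom's original line, with bare quotes around Tom, follows neither convention,
which is why a reader has to guess where the first field ends.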

Did it occur to you that CSV files are useless on their own? People use
tools (e.g. Excel) to view them. I don't think that works with HTTP
escaping, and I would be surprised if it did.

So let me tell you something here: you are proposing that people write
their own CSV parser, and you yourself claim to have written one or
more. But frankly, you don't seem to know what CSV files are, neither
by spec nor in practice.

That's EXACTLY the kind of attitude I am denouncing here. That's
EXACTLY why the first thing to do is to look for existing code, NOT roll
your own based on a poor understanding of the problem, or worse yet,
define an old problem anew, on a whim.

Goran.
From: Hector Santos on
Goran wrote:

>
> I have never seen HTTP escaping in CSV files.


I'm sorry about that. But it's out there, maybe because layman web
programmers were trying to put CSV lines over HTTP and it was naturally
escaped, but whatever the reason, it's out there.

> I know of two relevant
> conventions: "Unix" one for D(elimiter)SV files, where escape
> character is backslash, and "Windows" (RFC 4180) one, with quote
> character escaping. What the hell do you think you are doing,
> inventing things like that?


I knew Yakov, and RFC 4180 was written in 2005, and ABSOLUTELY, check
it out, http-like %XX escaping is a recommendation. But practical
applications predated Yakov's RFC recommendation by several decades!

And who's reinventing things? I'm not the one trying to change CSV to DSV.
It's COMMA, ok?

> Did it occur to you that CSV files are useless on their own?


No. It didn't. Is that another opinion of yours?

> People use tools (e.g. Excel) to view them.


Describe PEOPLE!

> I don't think that works with HTTP escaping, and I would be
> surprised if it did.

Do you know what HTTP escaping is, for God's sake?

> So let me tell you something here:


Please do.

> you are proposing that people write their own CSV parser,


Hello? Did ANYONE with a sane mind here read that I said people
should write their own CSV, no, DSV parser over anything else?

> and you yourself claim to have written one or
> more. But frankly, you don't seem to know what CSV files are, neither
> by spec nor in practice.


Ha! And you are surely showing you know a lot!

> That's EXACTLY the kind of attitude I am denouncing here. That's
> EXACTLY why the first thing to do is to look for existing code, NOT roll
> your own based on a poor understanding of the problem, or worse yet,
> define an old problem anew, on a whim.

Well, good luck in trying to stop it, because generally most
programmers are interested in knowing how to write code, not in always
depending on others!! Sounds like you are not very good at either!
Really, you are not. Honestly. I would never hire a person like you,
and even if you didn't want to work for me, you are certainly not showing
you have good qualifications for programming on your own. How do you
like those adam apples!?

--
HLS
From: Hector Santos on
Stanza wrote:

> Thanks for everyone's contributions. There's quite a lot here to digest.
> Re strtok - I remember using this many years ago, and as far as I recall
> it would jump over empty csv entries, so the string "one,,three" would
> return "one" followed by "three".

True.
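
A small sketch of the difference, in Python rather than C so it runs as-is.
strtok_like is a hypothetical helper that only emulates strtok's tokenizing
rule (a run of delimiters counts as a single separator); it is not the C
function itself. split_keep_empty is likewise a made-up name around the
standard str.split.

```python
import re


def strtok_like(text, delims=","):
    # Emulates strtok's rule: consecutive delimiters collapse into one
    # separator, so empty fields silently disappear.
    return re.findall("[^" + re.escape(delims) + "]+", text)


def split_keep_empty(text, delim=","):
    # A plain split on the delimiter keeps the empty field in its position.
    return text.split(delim)


print(strtok_like("one,,three"))
print(split_keep_empty("one,,three"))
```

For CSV the second behaviour is the one you want, since the position of a
field carries meaning and an empty entry must still occupy its column.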

Stanza, you really could short circuit the craziness here by describing
what language and platform you are using or want this solution for. It
would certainly help GORAN! :)

--
HLS
From: Hector Santos on
There is one explanation for this spin. You are visually handicapped. If
that is the case, I wish to extend my apology for my ignorance.

--
HLS

Joseph M. Newcomer wrote:

> Logically, many rendering programs will treat \r as "reset cursor to beginning margin"
> (which in most languages is the left side of the display area, but in some languages like
> Arabic and Hebrew is the right side). But that's a display technique, and he somehow made
> the leap from my discussing how to parse stored data to thinking I was talking about
> display rendering, a topic that was not under discussion.
> joe
>
> On Fri, 22 Jan 2010 23:58:13 -0800, "Tom Serface" <tom(a)camaswood.com> wrote:
>
>> I think Joe is saying it is meaningless these days because there is no
>> carriage to return any longer. I think most of us consider \n synonymous
>> with Enter and that implies the start of a new line. A lot of this is
>> carry over from the days of teletype and paper terminals and we're just
>> stuck with it as part of ASCII.
>>
>> Tom
>>
>> "Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>> news:uqDAH$$mKHA.1548(a)TK2MSFTNGP04.phx.gbl...
>>> Joseph M. Newcomer wrote:
>>>
>>>> One of the rules we developed about forty years ago (1968) is that \r is
>>>> meaningless noise treated as whitespace, and \n is a newline. This works
>>>> until you import a text file created on a pre-OS X Mac, where \r is the
>>>> newline character.
>>>> joe
>>>
>>> Don't confuse raw vs cooked vs display/print device vs storage systems!
>>>
>>> \r\n have their basis as hardware device codes for the hardware devices of
>>> the day: printers, teletypes, dumb terminals, etc.
>>>
>>> \r <CR> is what it is - a carriage return (move it to the first column) of
>>> the printer head! Note the operative word - Carriage!
>>>
>>> \n <LF> is what it is - a line feed (move carriage head down one line) of
>>> the printer head!
>>>
>>> When the consoles came, the printer head was now your cursor. That is why
>>> they are paired, whether that comes from translations or not.
>>>
>>> Now, your terminal and printer could have an OPTIONAL translation for an
>>> automatic line feed (\n) with each carriage return (\r), which means it
>>> APPEARS as if it was a line delimiter, as in the unix wienie world. In the
>>> MAC world, a \r is the line delimiter. DOS of course uses \r\n (<CR><LF>)
>>> pairs.
>>>
>>> But it is your terminal or printer providing the illusion with
>>> translations (which may be the default depending on the OS it is connected
>>> to). So if you dumped a unix file or mac file to a printer, it did the
>>> proper translation for you. The printer or carriage or laser point did not
>>> change, you still need to tell it to go left, right, up or down!
>>>
>>> Geez, Meaningless?
>>>
>>> This again is an example of insane revisionist comments.
>>>
>>> --
>>> HLS
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
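
For the stored-data side of the discussion quoted above, a minimal sketch
(in Python, with the hypothetical helper names normalize_newlines and
split_lines) of treating all three line-ending conventions mentioned, \r\n,
a lone \r, and a lone \n, as one newline before parsing. It assumes the text
is already decoded and that no field legitimately contains a bare \r.

```python
def normalize_newlines(text):
    # Order matters: collapse \r\n pairs first so the lone-\r pass
    # does not turn one DOS line ending into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_lines(text):
    # After normalization every line ending is a single \n.
    return normalize_newlines(text).split("\n")


print(split_lines("dos\r\nmac\runix\nend"))
```

Note that this would also rewrite line breaks embedded inside quoted CSV
fields, so a real CSV reader should handle them while parsing rather than
up front.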



--
HLS
From: Hector Santos on
Goran wrote:

> I have never seen HTTP escaping in CSV files. I know of two relevant
> conventions: "Unix" one for D(elimiter)SV files, where escape
> character is backlash, and "Windows" (RFC 4180) one....


Curious, what does RFC 4180 have to do with Windows? You don't seem
the type that would be a follower or implementer of IETF documents
generally, because RFCs are written with the basic idea of RAW
information. In other words, you would hardly ever, if at all, see an
RFC with any specific recommendation for a LIBRARY, VENDOR or
SOLUTION.

So when you reference RFC documents, they are targeted at people (like
myself) who implement these recommendations for people (like you) to use.

--
HLS