From: dpb on
analyst41(a)hotmail.com wrote:
> On Feb 1, 7:40 pm, dpb <n...(a)non.net> wrote:
....

>> Search and count the 0D 0A pairs (CRLF) -- they're the record markers.
>>
> Thanks.
>
> But since these markers are occurring both in the middle of a field
> and also at the end of an actual row from the database - I am still
> not able to separate out true EORs from the others.

Well, they _are_ "true" EORs. _WHY_ you're getting them where you
apparently think you shouldn't is a database export problem, apparently.

Is there a field/line length variable you could set or somesuch, perhaps?
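
[Editor's sketch, not part of the original post] Counting the 0D 0A pairs
can be done directly on the raw bytes. The function name and the sample
bytes below are made up for illustration; for a real file, pass in the
result of open(path, "rb").read().

```python
def count_crlf(data: bytes) -> dict:
    """Count CRLF pairs, plus any lone CR or lone LF bytes, in a byte string."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs.
        "lone_cr": data.count(b"\r") - crlf,
        "lone_lf": data.count(b"\n") - crlf,
    }


# Hypothetical sample: two database rows, the first with breaks inside a field.
sample = b"1,line one\r\nline two\r\nline three\r\n2,plain\r\n"
print(count_crlf(sample))
```

If the lone CR or lone LF counts come back non-zero, the embedded breaks use
a different byte sequence from the record terminator, which would give a way
to tell the two apart. If everything is CRLF, the bytes alone cannot.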

--
From: Gordon Sande on
On 2010-02-01 21:11:22 -0400, dpb <none(a)non.net> said:

> analyst41(a)hotmail.com wrote:
>> On Feb 1, 7:40 pm, dpb <n...(a)non.net> wrote:
> ...
>
>>> Search and count the 0D 0A pairs (CRLF) -- they're the record markers.
>>>
>> Thanks.
>>
>> But since these markers are occurring both in the middle of a field
>> and also at the end of an actual row from the database - I am still
>> not able to separate out true EORs from the others.
>
> Well, they _are_ "true" EORs. _WHY_ you're getting them where you
> apparently think you shouldn't is a database export problem, apparently.
>
> Is there a field/line length variable you could set or somesuch, perhaps?

The example you gave looks like HTML or some close technical relative. Such
files typically are intended to have their contents be independent of whatever
line ends they might contain. If they are stored as strings in a database system
the length will be external to the strings and the number of displayed rows
(lines?) will be dependent on the internal semantics (the <br> <\br> thingys)
independent of the line ends.

It sure looks like you are being asked a question which depends on the
semantics of the data and they forgot to tell you what they are. It may be just
that there are two technical usages of some otherwise innocent term like line
or row.



From: dpb on
Gordon Sande wrote:
> On 2010-02-01 21:11:22 -0400, dpb <none(a)non.net> said:
>
>> analyst41(a)hotmail.com wrote:
>>> On Feb 1, 7:40 pm, dpb <n...(a)non.net> wrote:
>> ...
>>
>>>> Search and count the 0D 0A pairs (CRLF) -- they're the record markers.
>>>>
>>> Thanks.
>>>
>>> But since these markers are occurring both in the middle of a field
>>> and also at the end of an actual row from the database - I am still
>>> not able to separate out true EORs from the others.
>>
>> Well, they _are_ "true" EORs. _WHY_ you're getting them where you
>> apparently think you shouldn't is a database export problem, apparently.
>>
>> Is there a field/line length variable you could set or somesuch, perhaps?
>
> The example you gave looks like HTML or some close technical
> relative. Such files typically are intended to have their contents
> be independent of whatever line ends they might contain. If they are
> stored as strings in a database system the length will be external to
> the strings and the number of displayed rows (lines?) will be
> dependent on the internal semantics (the <br> <\br> thingys)
> independent of the line ends.
>
> It sure looks like you are being asked a question which depends on the
> semantics of the data and they forgot to tell you what they are. It
> may be just that there are two technical usages of some otherwise
> innocent term like line or row.
>

Yes, I think that's a fair assumption as well; my response simply
addressed the specifics of the question as whether the database export
was embedding some other unexpected control character owing to the
encoding or somesuch. That doesn't appear to be the problem at all;
rather as you say it appears it's simply the line breaks in the original
data are likely just being imported into a text-based database record as
found in the original data source.

--
From: analyst41 on
On Feb 2, 9:52 am, dpb <n...(a)non.net> wrote:
> Gordon Sande wrote:
> > On 2010-02-01 21:11:22 -0400, dpb <n...(a)non.net> said:
>
> >> analys...(a)hotmail.com wrote:
> >>> On Feb 1, 7:40 pm, dpb <n...(a)non.net> wrote:
> >> ...
>
> >>>> Search and count the 0D 0A pairs (CRLF) -- they're the record markers.
>
> >>> Thanks.
>
> >>> But since these markers are occurring both in the middle of a field
> >>> and also at the end of an actual row from the database - I am still
> >>> not able to separate out true EORs from the others.
>
> >> Well, they _are_ "true" EORs.  _WHY_ you're getting them where you
> >> apparently think you shouldn't is a database export problem, apparently.
>
> >> Is there a field/line length variable you could set or somesuch, perhaps?
>
> > The example you gave looks like HTML or some close technical
> > relative.  Such files typically are intended to have their contents
> > be independent of whatever line ends they might contain. If they are
> > stored as strings in a database system the length will be external to
> > the strings and the number of displayed rows (lines?) will be
> > dependent on the internal semantics (the <br> <\br> thingys)
> > independent of the line ends.
>
> > It sure looks like you are being asked a question which depends on the
> > semantics of the data and they forgot to tell you what they are. It
> > may be just that there are two technical usages of some otherwise
> > innocent term like line or row.
>
> Yes, I think that's a fair assumption as well; my response simply
> addressed the specifics of the question as whether the database export
> was embedding some other unexpected control character owing to the
> encoding or somesuch.  That doesn't appear to be the problem at all;
> rather as you say it appears it's simply the line breaks in the original
> data are likely just being imported into a text-based database record as
> found in the original data source.
>

But the database access client is able to tell the end-of-rows from
the newline markers embedded in some columns.

In other words, you will see two rows when you are in the client.
When you say 'save results to csv file' it becomes a file which seems
to have 5 rows as seen by Excel, Notepad, Fortran etc.
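
[Editor's sketch, not part of the original post] One way a reader can see
two rows where a line-oriented tool sees five is quote-aware CSV parsing.
This assumes, without confirmation from the thread, that the exporter wraps
fields containing line breaks in double quotes. The sample data is invented.

```python
import csv
import io

# Hypothetical export of 2 database rows. ASSUMPTION: the multi-line field
# is enclosed in double quotes by the exporter.
raw = (
    '1,"first line\r\nsecond line\r\nthird line\r\nfourth line"\r\n'
    '2,"plain text"\r\n'
)

# What Notepad or a line-oriented READ sees: every CRLF ends a line.
physical_lines = raw.split("\r\n")[:-1]

# What a quote-aware reader sees: breaks inside quotes stay in the field.
# newline="" keeps the CRLFs untranslated so the csv module can handle them.
logical_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(physical_lines), "physical lines")
print(len(logical_rows), "logical rows")
```

If the export does not quote such fields, this parse will also report five
rows, and the file by itself no longer carries enough information to
separate the two kinds of line end.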

I am surprised nobody else seems to have faced this problem.
From: dpb on
analyst41(a)hotmail.com wrote:
> On Feb 2, 9:52 am, dpb <n...(a)non.net> wrote:
>> Gordon Sande wrote:
>>> On 2010-02-01 21:11:22 -0400, dpb <n...(a)non.net> said:
>>>> analys...(a)hotmail.com wrote:
>>>>> On Feb 1, 7:40 pm, dpb <n...(a)non.net> wrote:
>>>> ...
>>>>>> Search and count the 0D 0A pairs (CRLF) -- they're the record markers.
>>>>> Thanks.
>>>>> But since these markers are occurring both in the middle of a field
>>>>> and also at the end of an actual row from the database - I am still
>>>>> not able to separate out true EORs from the others.
>>>> Well, they _are_ "true" EORs. _WHY_ you're getting them where you
>>>> apparently think you shouldn't is a database export problem, apparently.
>>>> Is there a field/line length variable you could set or somesuch, perhaps?
>>> The example you gave looks like HTML or some close technical
>>> relative. Such files typically are intended to have their contents
>>> be independent of whatever line ends they might contain. If they are
>>> stored as strings in a database system the length will be external to
>>> the strings and the number of displayed rows (lines?) will be
>>> dependent on the internal semantics (the <br> <\br> thingys)
>>> independent of the line ends.
>>> It sure looks like you are being asked a question which depends on the
>>> semantics of the data and they forgot to tell you what they are. It
>>> may be just that there are two technical usages of some otherwise
>>> innocent term like line or row.
>> Yes, I think that's a fair assumption as well; my response simply
>> addressed the specifics of the question as whether the database export
>> was embedding some other unexpected control character owing to the
>> encoding or somesuch. That doesn't appear to be the problem at all;
>> rather as you say it appears it's simply the line breaks in the original
>> data are likely just being imported into a text-based database record as
>> found in the original data source.
....

> But the database access client is able to tell the end-of-rows from
> the newline markers embedded in some columns.

Don't think that proves anything...

> In other words, you will see two rows when you are in the client.
> When you say 'save results to csv file' it becomes a file which seems
> to have 5 rows as seen by Excel, Notepad, Fortran etc.

As above, all that does is say that the client of the database is
displaying two rows. It doesn't say anything useful for this issue
about what is actually in the database record. Can you somehow capture
what is actually embedded in the database record w/o the user display by
some other export method that doesn't format it but dumps it as a
stream? Or going back a step, what's in the input to the database--is
it a single record or multiple lines from some formatted data stream?
Either of those could show where the CRLF is coming from perhaps.
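
[Editor's sketch, not part of the original post] Looking at an unformatted
dump amounts to printing the raw bytes in hex next to their printable form.
The helper name and layout below are illustrative only; feed it the bytes
from open(path, "rb").read() for whatever raw export can be obtained.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable ASCII per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes such as CR (0D) and LF (0A) are shown as dots.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Hypothetical fragment of an exported record.
print(hex_dump(b"1,first line\r\nsecond line\r\n"))
```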

Not only does the csv file "seem" to have five rows, it _does_ have five
rows (as the dump explicitly showed).

> I am surprised nobody else seems to have faced this problem.

What database/what client? I've not done enough true database engine
work in 40 years to have ever run into much of any problem with one;
when I did have needs in the area I farmed that portion out religiously.

But, either the data are embedded w/ the CRLF pairs and the export
function is simply echoing them or it's broke or configured for short
lines that are wrapping or somesuch.

W/O more of the rest of the puzzle don't think there's anything else
that can be said other than it isn't a Fortran problem; it's the data
file itself that's your problem. How to fix that the way you want I've
no clue, specifically.
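
[Editor's sketch, not part of the original post] If the export cannot be
changed and the fields are not quoted, one possible workaround is to glue
physical lines back together until each record has the expected number of
fields. This rests on assumptions the thread does not confirm: the column
count is known, the delimiter never occurs inside the text, and the
multi-line field is not the last column. All names and data are hypothetical.

```python
def rejoin_records(physical_lines, n_fields, delimiter=","):
    """Merge physical lines until each record holds n_fields fields."""
    records, parts = [], []
    for line in physical_lines:
        parts.append(line)
        joined = "\n".join(parts)
        # A complete record has n_fields - 1 delimiters.
        if joined.count(delimiter) >= n_fields - 1:
            records.append(joined.split(delimiter))
            parts = []
    if parts:
        # Leftover fragment: keep it visible rather than dropping it silently.
        records.append("\n".join(parts).split(delimiter))
    return records


# Hypothetical 3-column export: 5 physical lines that are really 2 rows.
lines = [
    "1,first line",
    "second line",
    "third line",
    "fourth line,A",
    "2,plain text,B",
]
for record in rejoin_records(lines, n_fields=3):
    print(record)
```

When the break sits in the last column the delimiter count is already
complete on the first physical line, so this heuristic would split too
early; some other marker, such as a recognizable key at the start of each
row, would be needed instead.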

--