From: Dmitry A. Kazakov on
On Wed, 12 May 2010 15:14:59 -0700 (PDT), Maciej Sobczak wrote:

> On 12 May, 17:58, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> wrote:
>
>>>> Why an Ada program handling texts need to be aware of your system?
>>
>>> Because it is running on my system. It's a pretty good idea to have
>>> some local awareness.
>>
>> No, that makes the program non-portable and fragile against system
>> modifications.
>
> As I already said, I'm OK with this kind of non-portability.
> It does not bother me that my program will not work on non-existing
> systems (which effectively includes those obscure systems that you can
> surely refer to for the sake of example, but that none of us will ever
> use anyway).
> I am pragmatic here.

I am too. I know that Windows does exist and that LF /= CR LF. See:

http://en.wikipedia.org/wiki/CR/LF

All this mess is abstracted away by Ada.Text_IO.

(And OpenVMS exists too)

>>> Even assuming that I can open some file, the whole "abstraction of a
>>> text" is severely limited. I cannot, for example, read the second
>>> paragraph of the third chapter of the book that is in the file.
>>
>> It is necessarily limited here because not every text is a book.
>
> Bingo. So what is a text, really?

It is not a book.

> Ada.Text_IO (similarly to text I/O libraries in other languages)
> focuses on lines. This is similarly outdated as focusing on columns
> and really belongs to the same era.
> When I read the text in my web browser (the most frequently used
> application for text consumption nowadays), the "lines" are whatever
> the browser cares to display, which depends on how I scale the window
> and what font size I select. I can change these properties dynamically
> and in particular nothing prevents me from opening the same document
> in several windows, each differently scaled.

A browser renders texts. That means text is not its input, but its output.
The input of a browser is a program written in some ugly language named
HTML.

> This means that "lines" is not a property of the data source, it is a
> property of the display.

I consider text an ordered set of lines.

> Yet everybody is dead focused on the concept of text files that are
> composed of lines. There are no lines, really.

Of course there are, see //-comments in C++.

> If "text" means anything more than a stream on steroids, then it must
> be recognized to have a much richer structure than just a sequence of
> lines. Paragraphs, headings, chapters, whatever - these are structural
> elements of text.

That was exactly the mistake Ada designers made with Text_IO. They tried to
add more there, pagination etc. They should have stayed with lines.

(In software design this type of error is called "fat class")

> With this in mind, the "text abstraction" as represented by
> Ada.Text_IO is really a stream on steroids. That's why I don't
> understand why you put so much stress on distinguishing the "text" from
> the "stream", while the effective distance between them is close to
> zero.

Because I used to write compilers and serial communication protocols.

>>> Not only. It cannot detect the end of stream without blocking, for
>>> example.
>>
>> That is a property of the stream. Don't use streams as texts.
>
> Translation: Text_IO cannot detect the end of text without blocking.

No, the translation is: stream does not have an end. You cannot detect
something that does not exist. Text has an end. Ergo, if you wanted a text
on the stream, you would need an encoding layer. (This layer exists and is
customarily broken in UNIX and Windows.)

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Warren on
Dmitry A. Kazakov expounded in
news:1g82ubkc0t0pf$.nuj3gqp1buh6.dlg(a)40tude.net:
> On Wed, 12 May 2010 15:14:59 -0700 (PDT), Maciej Sobczak wrote:
...
>> I am pragmatic here.
>
> I am too. I know that Windows does exist and that LF /= CR LF. See:
...
> All this mess is abstracted away by Ada.Text_IO.

Windoze may be a mess, but managing LF and/or CR
is hardly rocket science. I and most other folks
I know have easily dealt with it since the CP/M days.

Warren
From: Niklas Holsti on
J-P. Rosen wrote:
> Warren wrote:
>> J-P. Rosen expounded in news:hsb97m$tnd$1(a)news.eternal-september.org:
>>
>>> Dmitry A. Kazakov wrote:
>>>> On Mon, 10 May 2010 18:50:02 +0000 (UTC), Warren wrote:
>>>>
>>>>> But you have no way to know when you've read
>>>>> an empty line in a lexer routine that is reading
>>>>> character by character.
>>> Come on, here is the magic function:
>>>
>>> function Empty_Line return Boolean is
>>> begin
>>> return Col=1 and End_Of_Line;
>>> end Empty_Line;
>> There's a wart-- see earlier posts about "end file".
>>
> Of course, this assumes a well-formed Ada file.
> FYI, there /is/ a scanner in AdaControl, and I never had a problem.
> The trick is to check End_Of_Line when needed, and never check
> End_Of_File, but handle End_Error instead. This works even for
> ill-formed files.

I don't think that this trick will help with the "wart", which is the
invisibility of a final null (zero-length) line in a Text_IO file.

As best I understand it, the Text_IO view of a text file as a sequence
of lines excludes the case of a file with no lines at all, which
logically would consist of a file terminator, possibly preceded by a
page terminator, but not preceded by a line terminator.

Text_IO insists that the file terminator is always preceded by line and
page terminators, logically if not physically, and the Text_IO
operations are defined to hide the physical presence or absence of a
final line terminator at the end of the file. Therefore every Text_IO
file seems to have at least one line terminator, and thus at least one line.
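
A minimal sketch of the trick quoted above (check End_Of_Line, never call
End_Of_File, handle End_Error), assuming the text arrives on standard input.
The procedure name Scan_Lines and the counter Skips are hypothetical and are
not taken from AdaControl. How a last line without a physical line terminator
is counted may differ between implementations, which is exactly the corner
discussed here, so no expected output is given.

```ada
with Ada.Text_IO; use Ada.Text_IO;

--  Hypothetical character-by-character reader; reads standard input.
procedure Scan_Lines is
   C     : Character;
   Skips : Natural := 0;  --  number of Skip_Line calls that succeeded
begin
   loop
      --  End_Of_Line is True before a line terminator and also
      --  before the file terminator.
      if End_Of_Line then
         --  Skip_Line raises End_Error on an attempt to read the
         --  file terminator; that is the only loop exit.
         Skip_Line;
         Skips := Skips + 1;
      else
         Get (C);  --  a real lexer would classify C here
      end if;
   end loop;
exception
   when End_Error =>
      Put_Line ("line ends seen:" & Natural'Image (Skips));
end Scan_Lines;
```

Feeding it files that differ only in their final bytes (no terminator, one
terminator, two terminators) is one way to see what the Text_IO view hides.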

> And of course, you are welcome to have a look at my scanner in
> AdaControl ;-)

I did, and I even compiled and tried it, but since your scanner ignores
and skips all line terminators (there is no "end of line" token kind),
its operation does not illuminate the question of the "wart".

But this "wart" is a very small blemish, perhaps just a beauty mark, as
it has no effect on most uses of Text_IO, for scanners or otherwise.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
From: Maciej Sobczak on
On 13 May, 09:31, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:

> > I am pragmatic here.
>
> I am too. I know that Windows does exist and that LF /= CR LF.

<shrug />

This is an added value? I call it a trivial stream filter with barely
one bit of state in the finite-state machine. Nothing to cheer about.
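
For illustration, such a filter might look roughly like the sketch below. The
names CRLF_Demo and Normalize are made up for this example, and mapping a lone
CR to LF is an assumption about the desired behaviour, not a description of
what any Text_IO implementation does.

```ada
with Ada.Text_IO;

procedure CRLF_Demo is
   CR : constant Character := Character'Val (13);
   LF : constant Character := Character'Val (10);

   --  Map CR LF, a lone CR and a lone LF to a single LF each.
   function Normalize (Input : String) return String is
      Result   : String (1 .. Input'Length);
      Last     : Natural := 0;
      After_CR : Boolean := False;  --  the single bit of state
   begin
      for I in Input'Range loop
         if Input (I) = CR then
            --  Emit the line end now and remember that we saw a CR.
            Last := Last + 1;
            Result (Last) := LF;
            After_CR := True;
         elsif Input (I) = LF and then After_CR then
            --  Second half of a CR LF pair: already emitted.
            After_CR := False;
         else
            Last := Last + 1;
            Result (Last) := Input (I);
            After_CR := False;
         end if;
      end loop;
      return Result (1 .. Last);
   end Normalize;

   Sample : constant String :=
     "one" & CR & LF & "two" & LF & "three" & CR;
   Clean  : constant String := Normalize (Sample);
begin
   Ada.Text_IO.Put_Line
     ("in:" & Natural'Image (Sample'Length) &
      " out:" & Natural'Image (Clean'Length));
end CRLF_Demo;
```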

> All this mess is abstracted away by Ada.Text_IO.

This "abstraction" is peanuts. Like in
Ada.Stream_IO_With_Trivial_Filter. :-)

> (And OpenVMS exists too)

I guess that the Commodore 64 emulator for iPhone has a wider user
base - please do not refer to effectively non-existing systems just to
keep the discussion going. It's wasteful.

> > So what is a text, really?
>
> It is not a book.

A scientific paper, perhaps? Or a CV?

If you stick to the concept of "text" as defined by Text_IO
(completely unstructured sequence of lines), then effectively the only
use-case that you will cover without problems is... config files. Like
in Ada.Config_Files_IO. :-)

> A browser renders texts. That means text is not its input, but its output.
> The input of a browser is a program written in some ugly language named
> HTML.

Wrong. I use the browser to read .txt files, too. No HTML is
necessary.

> I consider text an ordered set of lines.

I consider such a set of lines to be a stream on steroids. The problem
is that our definitions are arbitrary and we're not going to conclude
anything.

> > Yet everybody is dead focused on the concept of text files that are
> > composed of lines. There are no lines, really.
>
> Of course there are, see //-comments in C++.

The C++ source code is not a text for me.
I know that it is a text for you due to your arbitrarily chosen
definitions.

> > Translation: Text_IO cannot detect the end of text without blocking.
>
> No, the translation is: stream does not have an end.

That's your arbitrary definition and you did not provide any reference
for it.
I see no reason to accept it.

In particular, /dev/null is a very nice empty stream. It is not a
stream that has no data for an infinite amount of time (this can be
emulated) - this is a genuinely empty stream which has an end, and that
end can be detected immediately.
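
A small sketch of that check, assuming a Unix-like system where /dev/null
exists and can be opened as an ordinary file. The procedure name Null_Stream
is hypothetical, and Text_IO is used here only because it is the package
under discussion.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Null_Stream is
   F : File_Type;
begin
   --  Assumption: /dev/null is present (Unix-like systems).
   Open (F, In_File, "/dev/null");
   if End_Of_File (F) then
      Put_Line ("end detected without reading any data");
   else
      Put_Line ("unexpected: data available");
   end if;
   Close (F);
end Null_Stream;
```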

--
Maciej Sobczak * http://www.inspirel.com

YAMI4 - Messaging Solution for Distributed Systems
http://www.inspirel.com/yami4
From: Dmitry A. Kazakov on
On Fri, 14 May 2010 14:03:17 -0700 (PDT), Maciej Sobczak wrote:

> On 13 May, 09:31, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> wrote:
>
>>> I am pragmatic here.
>>
>> I am too. I know that Windows does exist and that LF /= CR LF.
>
> <shrug />
>
> This is an added value? I call it a trivial stream filter with barely
> one bit of state in the finite-state machine. Nothing to cheer about.

Do you want to say that text files are Turing complete? (:-)) Maybe they
are. Maybe they are not. Consider an experiment. Let us generate random
files containing CRs, LFs and other characters. Show each file to several
people and ask them to count the lines...
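
A rough sketch of why the counts would disagree: the same bytes give
different numbers of line terminators depending on which convention the
reader assumes. The names Count_Conventions, Convention and Terminators are
invented for this example, and the three rules are a simplification of what
real systems do.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Count_Conventions is
   CR : constant Character := Character'Val (13);
   LF : constant Character := Character'Val (10);

   type Convention is (Unix_LF, Old_Mac_CR, DOS_CRLF);

   --  Number of line terminators in Text under the given rule.
   function Terminators (Text : String; Rule : Convention) return Natural is
      Count : Natural := 0;
   begin
      for I in Text'Range loop
         case Rule is
            when Unix_LF =>
               if Text (I) = LF then
                  Count := Count + 1;
               end if;
            when Old_Mac_CR =>
               if Text (I) = CR then
                  Count := Count + 1;
               end if;
            when DOS_CRLF =>
               --  Only an LF directly preceded by a CR counts.
               if Text (I) = LF
                 and then I > Text'First
                 and then Text (I - 1) = CR
               then
                  Count := Count + 1;
               end if;
         end case;
      end loop;
      return Count;
   end Terminators;

   Sample : constant String :=
     "a" & CR & "b" & LF & "c" & CR & LF & "d" & LF;
begin
   for Rule in Convention loop
      Put_Line (Convention'Image (Rule) &
                Natural'Image (Terminators (Sample, Rule)));
   end loop;
end Count_Conventions;
```

For this sample the three rules find 3, 2 and 1 terminators respectively.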

>> All this mess is abstracted away by Ada.Text_IO.
>
> This "abstraction" is peanuts. Like in
> Ada.Stream_IO_With_Trivial_Filter. :-)

If it were, there would be no discussions about it.

>> (And OpenVMS exists too)
>
> I guess that the Commodore 64 emulator for iPhone has a wider user
> base - please do not refer to effectively non-existing systems just to
> keep the discussion going. It's wasteful.

Ada never was popular.

>>> So what is a text, really?
>>
>> It is not a book.
>
> A scientific paper, perhaps? Or a CV?

A memo, minutes etc.

> If you stick to the concept of "text" as defined by Text_IO
> (completely unstructured sequence of lines), then effectively the only
> use-case that you will cover without problems is... config files.

+ source code, which nicely covers 90% of my tasks.

I don't need Ada.Document_IO. Especially because it is a far bigger mess
than stupid LFs in UNIX files.

>> A browser renders texts. That means text is not its input, but its output.
>> The input of a browser is a program written in some ugly language named
>> HTML.
>
> Wrong. I use the browser to read .txt files, too. No HTML is
> necessary.

Then some other encoding is used. That changes nothing. The browser renders
LFs according to some rules; the result you see is the text (or one of many
possible texts).

>>> Yet everybody is dead focused on the concept of text files that are
>>> composed of lines. There are no lines, really.
>>
>> Of course there are, see //-comments in C++.
>
> The C++ source code is not a text for me.

Then you should not use text editors on it.

>>> Translation: Text_IO cannot detect the end of text without blocking.
>>
>> No, the translation is: stream does not have an end.
>
> That's your arbitrary definition and you did not provide any reference
> for it.

It does not mean that no stream can have an end. It only denies that
every stream has one.

Anyway, if you tried to define the stream end, you could not do it in terms
of its elements. You would need some "non-functional", "out-of-band" means:
return codes, exceptions etc. Blocking is just one among others. You could
say: it ends when blocked. Not very nice, but, in fact, widely used in
network communication protocols.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de