From: Dmitry A. Kazakov on
On Tue, 10 Aug 2010 22:22:02 +0100, Simon Wright wrote:

> "Dmitry A. Kazakov" <mailbox(a)dmitry-kazakov.de> writes:
>
>> Yes, if TCP sockets is what you use. There is a hell of other
>> protocols even on the Ethernet, some of which are not stream-oriented.
>
> We leverage stream i/o via UDP (which on GNAT is seriously broken,
> because it tries to do an OS sendto() for each elementary component!

Wow! Even in the latest GNAT Pro/GPL?

I am asking because AdaCore fixed this very issue for String'Write, which
did Character'Write per element. Was it String-only fix?

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Dmitry A. Kazakov on
On Tue, 10 Aug 2010 16:42:08 -0500, Randy Brukardt wrote:

> "Dmitry A. Kazakov" <mailbox(a)dmitry-kazakov.de> wrote in message
> news:1o6jah15584x1$.1arrrgog9jdk7.dlg(a)40tude.net...
>> On Mon, 9 Aug 2010 20:17:40 -0500, Randy Brukardt wrote:
>>
>>> "Dmitry A. Kazakov" <mailbox(a)dmitry-kazakov.de> wrote in message
>>> news:1y1c8zzqmcer5.po56hkesa968.dlg(a)40tude.net...
>>> ...
>>>>> For these, you don't want modular semantics -- you just want
>>>>> a data type whose representation matches what you're
>>>>> interfacing/communicating with, such as "type Octet is
>>>>> range 0..2**8-1;"
>>>>
>>>> The problem is that these things require both array-of-Boolean view and
>>>> arithmetic. I agree that when arithmetic is used, then it has to be
>>>> wide.
>>>> E.g. when interpreting sequences of octets as little/big endian numbers,
>>>> we never use modular arithmetic. But integer arithmetic is incompatible
>>>> with array/set view.
>>>
>>> What have you done with Dmitry?? You can't be the *real* Dmitry! :-)
>>
>> Brainwashed me? (:-))
>>
>>> Array-of-bytes views and arithmetic views are of clearly different types,
>>> with different sets of operations. These shouldn't be mixed, logically or
>>> any other way. If you need to go between these views, you need some sort
>>> of type conversion (or Unchecked_Conversion or Stream'Read or...). Thus,
>>> it is *never* necessary to do operations on both views at once, and it is
>>> irrelevant what the "math" operations for octets are. If Ada gets anything
>>> wrong about this, it is that it has math operations at all for
>>> stream_elements.
>>
>> Right, but there is no contradiction because it is more than one type
>> involved. What I meant is:
>>
>> type Octet is ...;
>>
>> -- Array interface to access bits of the octet (not Ada)
>> type Bit_Index is range 1..8;
>> function "()" (Unit : Octet; Bit : Bit_Index) return Boolean;
>> procedure "()" (Unit : in out Octet; Bit : Bit_Index; Value : Boolean);
>>
>> -- Arithmetic interface, immediately leads out of octets (not Ada)
>> function "+" (Left, Right : Octet) return Universal_Integer;
>> function "-" (Left, Right : Octet) return Universal_Integer;
>> ...
>> So when you write:
>>
>> Little_Endian_Value := Octet_1 + Octet_2 * 256;
>>
>> The result is not an octet, as it would be with a modular type.
>> Presently it is just broken.
>
> I wouldn't mess with mixed "+" routines that return other types. I'd just
> convert the Octets to a suitable type and add them. That is, any mess should
> be in type conversions, not in operators.

Conversion mess is what we already have right now. The point is that "+" is
well-defined and meaningful for octets, but it is not closed in there. Why

function "**" (Left : T; Right : Natural) return T;
function S'Pos(Arg : S'Base) return universal_integer;
...

are OK and "+" is not?

> Since we're inventing things, I would suggest:
>
> -- Conversion function (Not Ada):
> function "#" (Right : Octet) return Universal_Integer;
>
> Little_Endian_Value := #Octet_1 + #Octet_2 * 256;
>
> And now you don't need any Octet math.

1. What else is this?

2. I see no advantage over

Little_Endian_Value := Integer (Octet_1) + Integer (Octet_2) * 256;
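
A minimal sketch of the explicit-conversion form above, for illustration only. The names Little_Endian_Demo, Value_16 and To_Value are assumptions made for this example and do not come from the thread; Octet is declared modular here only to show that converting before the arithmetic keeps the result from wrapping.

```ada
with Ada.Text_IO;

procedure Little_Endian_Demo is

   --  Assumed declarations, for illustration only.
   type Octet is mod 2**8;
   type Value_16 is range 0 .. 2**16 - 1;

   --  Combine two octets, least significant first. Each octet is
   --  converted to the wider type before any arithmetic, so nothing
   --  is reduced modulo 256.
   function To_Value (Low, High : Octet) return Value_16 is
   begin
      return Value_16 (Low) + Value_16 (High) * 256;
   end To_Value;

   Octet_1 : constant Octet := 16#34#;
   Octet_2 : constant Octet := 16#12#;

begin
   --  16#1234# is 4660 in decimal.
   Ada.Text_IO.Put_Line (Value_16'Image (To_Value (Octet_1, Octet_2)));
end Little_Endian_Demo;
```

Whether the conversions are spelled as type conversions or hidden behind an operator is exactly the point under discussion; the sketch only shows the form that is legal Ada today.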

> BTW, the "#" operator was suggested by Jean Ichbiah and others back during
> the design of Ada 83, to be used for conversions between types. It's a pity
> that it or something like it never made it into Ada. (It comes up each time
> and still is rejected.)

It would be nice to have more operators than now ("^", "|", "..", "in",
":=", "<>", "()", "[]", "{}", "<=>", "@", "~" and so on. Ah, Ada is Unicode
now, here is the list: http://www.unicode.org/charts/PDF/U2200.pdf ).

But what is so special in type conversions, aren't they just functions?

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Natacha Kerensikova on
On Aug 10, 5:46 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:
> On Tue, 10 Aug 2010 05:06:29 -0700 (PDT), Natacha Kerensikova wrote:
> > The first object is the internal memory representation designed for
> > actual efficient use. For example, an integer will probably be
> > represented by its binary value with machine-defined endianness and
> > machine-defined size.
>
> > The other object is a "serialized" representation, in the sense that
> > it's designed for communication and storage, for example the same
> > integer, in a context where it will be sent over a network, can be for
> > example represented as an ASCII-encoded decimal number, or in binary
> > but with a predefined size and endianness.
>
> Why don't you send it at once?

As I said, I can't just insert the raw object in the stream, I need at
least to know its size. I might need further inspection of the
serialized representation in case I want a "smart" choice of atom
encoding, but I'm afraid it wasn't a good idea to mention that point
because it doesn't fit into the simplified concept of S-expression I
have been discussing with you for quite some posts.

However my proposed Sexp_Stream does send it as soon as it gets the
whole representation, and this idea comes from our discussion.

> > This is really the same
> > considerations as when storing or sending an object directly, except
> > that it has to reside in memory for a short time. There is no more
> > conversions or representations than when S-expression-lessly storing
> > or sending objects; the only difference is the memory buffering to
> > allow S-expression-specific information to be inserted around the
> > stream.
>
> This is impossible in the general case, so the question is why. As an
> example consider a stateful communication protocol (existing in real life)
> which reacts only to changes. When you send your integer nothing happens because
> the device reacts only when the bit pattern changes. So if you wanted to
> really send it to another side you have to change something in the
> representation of integer, e.g. to toggle some extra bit.

Well obviously S-expressions aren't designed to be transmitted over
such a protocol. The basic assumption behind S-expressions is that we can
transmit/store/whatever octet sequences and receive/retrieve/whatever
them intact. When the assumption doesn't hold, either something must
be done to make it true (e.g. add another layer) or S-expressions must
be abandoned.

For example, S-expressions small enough to fit in one packet can be
easily transferred over UDP. S-expression parsers (or at least mine)
handle fragmented data well (even when unevenly fragmented) but fail
when data is missing or mis-ordered, which prevents large S-expressions
from being simply spread over as many packets as needed. However one might
solve this issue by adding a sequence number inside the UDP payload,
along with a mechanism to re-send lost packets; however that would be
(at least partially) re-inventing TCP.

> >> What was the problem then?
>
> > The problem is to organize different objects inside a single file. S-
> > expression standardize the organization and relations between objects,
> > while something else has to be done beyond S-expression to turn
> > objects into representations suitable to live in a file.
>
> > [...]
>
> Yes, I don't see how S-expression might help there. They do not add value,
> because the work of serialization or equivalent to serialization is already
> done while construction of the expression object.

There are two things that are to be serialized: objects into atoms,
and relations between objects into S-expression-specific stuff. The S-
expression object is an unserialized in-memory representation of
relations between serialized representations of objects. The writing
of an S-expression into a stream is the final serialization-of-
relations stage.

> There are two questions to discuss:
>
> 1. The external storage format: S-expressions vs. other
> 2. Construction of an intermediate S-expression object in the memory
>
> You justified #1 by an argument to legacy. You cannot [re-]use that
> argument for #2. (OK, since Ludovic had already done it, you could (:-))

I don't re-use that argument. And actually if you followed the
description of my Sexp_Stream, I don't need a S-expression object in
memory, I only need serialized representation of atoms. The rest can
be sent directly into a stream.

And while I occasionally feel the need of an in-memory S-expression
object, so far it has never been for writing or sending, it was always
for specific sub-S-expressions that are read or received. I believe
this need happens only when I have a variable of type S-expression,
which I consider to be as good a type as String or Natural. It is then
a data-structure choice, which happens at a higher level than
encoding, serialization, I/O or most of what we have discussed so far.

> >> Why do you need S-sequence in the memory, while dumping
> >> objects directly into files as S-sequences (if you insist on having them)
> >> is simpler, cleaner, thinner, faster.
>
> > Because I need to examine the S-sequence before writing it to disk, in
> > order to have enough information to write S-expression metadata. At
> > the very least, I need to know the total size of the atom before
> > allowing its first byte to be sent into the file.
>
> That does not look like a stream! But this is again about abstraction
> layers. Why do you care?

The "verbatim encoding" of an atom, which is the only one allowed in
canonical representation of a S-expression, is defined as follows: a
size, represented as the ASCII encoding of the decimal representation
of the number of octets in the atom, without leading zero (therefore
of variable length); followed by the ASCII character ':'; followed by
the octet sequence of the atom.

You can't write an atom using such a format when you don't know in
advance the number of octets in the atom.
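
A minimal sketch of the verbatim encoding described above, assuming atoms are held as Ada.Streams.Stream_Element_Array values. The names Verbatim_Atom_Demo, To_Octets and Verbatim are hypothetical and are not taken from any existing S-expression library.

```ada
with Ada.Streams;       use Ada.Streams;
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Verbatim_Atom_Demo is

   --  Hypothetical helper: copy a String into an array of octets.
   function To_Octets (S : String) return Stream_Element_Array is
      Result : Stream_Element_Array (1 .. S'Length);
   begin
      for I in S'Range loop
         Result (Stream_Element_Offset (I - S'First + 1)) :=
           Character'Pos (S (I));
      end loop;
      return Result;
   end To_Octets;

   --  Hypothetical helper: decimal size without leading zero, then ':',
   --  then the octets of the atom. The whole atom must be available,
   --  because its length is written before its first octet.
   function Verbatim (Atom : Stream_Element_Array)
     return Stream_Element_Array
   is
      Size   : constant String :=
        Ada.Strings.Fixed.Trim
          (Natural'Image (Atom'Length), Ada.Strings.Left);
      Header : constant Stream_Element_Array := To_Octets (Size & ':');
   begin
      return Header & Atom;
   end Verbatim;

   Encoded : constant Stream_Element_Array :=
     Verbatim (To_Octets ("abc"));

begin
   --  Prints 3:abc
   for I in Encoded'Range loop
      Ada.Text_IO.Put (Character'Val (Encoded (I)));
   end loop;
   Ada.Text_IO.New_Line;
end Verbatim_Atom_Demo;
```

The size prefix is why the serialized atom has to sit in memory first: Atom'Length is needed before anything can be written.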

The idea behind S-expressions could be seen as the serialization of a
list of serialized objects. When serializing such a list one must be
able to distinguish between the different objects; to the best of my
knowledge this can only be done either by keeping track of object
sizes, or by using separators. To prevent the restriction of possible
atom contents, the first solution has been chosen.

> > That sounds like a very nice way of doing it. So in the most common
> > case, there will still be a stream, provided by the platform-specific
> > socket facilities, which will accept an array-of-octets, and said
> > array would have to be created from objects by custom code, right?
>
> Yes, if TCP sockets is what you use. There is a hell of other protocols
> even on the Ethernet, some of which are not stream-oriented.

But you were talking about Octet'Read and Octet'Write. Aren't these
Ada Stream based stuff?

> >> In other post Jeffrey Carter described this as low-level. Why not to tell
> >> the object: store yourself and all relations you need, I just don't care
> >> which and how?
>
> > That's indeed a higher-level question. That's how it will happen at
> > some point in my code; however at some other point I will still have
> > to actually implement said object storage, and that's when I will
> > really care about which and how. I'm aware from the very beginning
> > that a S-expression library is low-level and is only used by mid-level
> > objects before reaching the application.
>
> This is what caused the questions. Because if the problem is serialization,
> then S-expression does not look good.

Why? Because it's a partial serialization? Because it serializes stuff
you deem as useless? Because it's a way of serializing stuff you would
have serialized in another way? I still don't understand what is so
bad with S-expressions. While I understand gut-rejection of anything-
with-parentheses (including lisp and S-expressions), you seem way
above that.

> >> You do not need S-expressions here either. You can
> >> store/restore templates as S-sequences. A template in the memory would be
> >> an object with some operations like Instantiate_With_Parameters etc. The
> >> result of instantiation will be again an object and no S-sequence.
>
> > Well, how would you solve the problem described above without
> > S-expressions? (That's a real question, if something simpler and/or more
> > efficient than my way of doing it exists, I'm genuinely interested.)
>
> The PPN, a simple stack machine. Push arguments onto the stack, pop to
> execute an operation. Push the results back. Repeat.

Does that allow pushing an operation and its arguments, to have it
executed by another operation? S-expressions do it naturally, and I
find it very useful in conditional or loop constructs.



Thanks for the discussion,
Natacha
From: Dmitry A. Kazakov on
On Wed, 11 Aug 2010 02:43:58 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 10, 5:46 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> wrote:

>> Why don't you send it at once?
>
> As I said, I can't just insert the raw object in the stream,

You can encode it.

> I need at least to know its size.

No, you don't and you cannot, because the object's size has nothing to do
with the object representation in the encoding used for the stream. Consider
two systems using a common vocabulary. Instead of sending objects you send
the ids, asking the partner to construct an object for this id.

> For example, S-expressions small enough to fit in one packet can be
> easily transferred over UDP. S-expression parsers (or at least mine)
> handle fragmented data well (even when unevenly fragmented) but fail
> when data is missing or mis-ordered, which prevents large S-expressions
> from being simply spread over as many packets as needed. However one might
> solve this issue by adding a sequence number inside the UDP payload,
> along with a mechanism to re-send lost packets; however that would be
> (at least partially) re-inventing TCP.

You can recode/pack S-expressions into something else. That does not answer
the question why. And from your responses it becomes less and less clear
where. What is the intended S-expression implementation according to the
OSI classification? We tried almost all levels, it looks pretty much
everything and nothing, a Ding-an-sich.

>>>> What was the problem then?
>>
>>> The problem is to organize different objects inside a single file. S-
>>> expression standardize the organization and relations between objects,
>>> while something else has to be done beyond S-expression to turn
>>> objects into representations suitable to live in a file.
>>
>>> [...]
>>
>> Yes, I don't see how S-expression might help there. They do not add value,
>> because the work of serialization or equivalent to serialization is already
>> done while construction of the expression object.
>
> There are two things that are to be serialized: objects into atoms,
> and relations between objects into S-expression-specific stuff. The S-
> expression object is an unserialized in-memory representation of
> relations between serialized representations of objects. The writing
> of an S-expression into a stream is the final serialization-of-
> relations stage.

It is #2? (see below)

>> There are two questions to discuss:
>>
>> 1. The external storage format: S-expressions vs. other
>> 2. Construction of an intermediate S-expression object in the memory
>>
>> You justified #1 by an argument to legacy. You cannot [re-]use that
>> argument for #2. (OK, since Ludovic had already done it, you could (:-))
>
> I don't re-use that argument. And actually if you followed the
> description of my Sexp_Stream, I don't need a S-expression object in
> memory, I only need serialized representation of atoms. The rest can
> be sent directly into a stream.

It is not #2, only #1?

>>>> Why do you need S-sequence in the memory, while dumping
>>>> objects directly into files as S-sequences (if you insist on having them)
>>>> is simpler, cleaner, thinner, faster.
>>
>>> Because I need to examine the S-sequence before writing it to disk, in
>>> order to have enough information to write S-expression metadata. At
>>> the very least, I need to know the total size of the atom before
>>> allowing its first byte to be sent into the file.
>>
>> That does not look like a stream! But this is again about abstraction
>> layers. Why do you care?
>
> The "verbatim encoding" of an atom, which is the only one allowed in
> canonical representation of a S-expression, is defined as follows: a
> size, represented as the ASCII encoding of the decimal representation
> of the number of octets in the atom, without leading zero (therefore
> of variable length); followed by the ASCII character ':'; followed by
> the octet sequence of the atom.

And TCP requires checksums, port numbers, destination address etc. Do you
care about them when writing to a socket?

>>> That sounds like a very nice way of doing it. So in the most common
>>> case, there will still be a stream, provided by the platform-specific
>>> socket facilities, which will accept an array-of-octets, and said
>>> array would have to be created from objects by custom code, right?
>>
>> Yes, if TCP sockets is what you use. There is a hell of other protocols
>> even on the Ethernet, some of which are not stream-oriented.
>
> But you were talking about Octet'Read and Octet'Write. Aren't these
> Ada Stream based stuff?

It is a stream interface. You can hang this interface on something that is
not a stream, e.g. a text file. (We are hobbling in circles because of
mixed abstractions.)

>> This is what caused the questions. Because if the problem is serialization,
>> then S-expression does not look good.
>
> Why? Because it's a partial serialization?

Yes

> Because it serializes stuff you deem as useless?

Yes, it does a structure of nested brackets. It does not serialize objects,
you need a lot of meat on the carcass. So if the problem is serialization
(of objects), then S-expression is not yet a solution.

>> The PPN, a simple stack machine. Push arguments onto the stack, pop to
>> execute an operation. Push the results back. Repeat.
>
> Does that allow pushing an operation and its arguments, to have it
> executed by another operation?

Operation is executed, no matter by what. PPN is executable. Whatever you
want to be executed has to be expressed in PPN. The PPN itself can be
pushed onto the stack if you wanted (=a von Neumann machine). I don't know
why would you need that stuff (~trampoline), however.
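
A minimal sketch of the push/pop/execute cycle described above, assuming "PPN" refers to postfix (reverse Polish) notation; the thread does not spell the abbreviation out. Everything named here (Stack_Machine_Demo, Op_Kind, Instruction, Program, Run) is hypothetical, and only two integer operations are modelled.

```ada
with Ada.Text_IO;

procedure Stack_Machine_Demo is

   type Op_Kind is (Push, Add, Multiply);

   type Instruction is record
      Kind  : Op_Kind;
      Value : Integer := 0;  --  Only meaningful when Kind = Push
   end record;

   type Program is array (Positive range <>) of Instruction;

   --  Run a postfix program: push arguments, pop them to execute an
   --  operation, push the result back, repeat. A malformed program
   --  (too few operands) raises Constraint_Error in this sketch.
   function Run (Code : Program) return Integer is
      Stack : array (1 .. Code'Length) of Integer;
      Top   : Natural := 0;
   begin
      for I in Code'Range loop
         case Code (I).Kind is
            when Push =>
               Top := Top + 1;
               Stack (Top) := Code (I).Value;
            when Add =>
               Stack (Top - 1) := Stack (Top - 1) + Stack (Top);
               Top := Top - 1;
            when Multiply =>
               Stack (Top - 1) := Stack (Top - 1) * Stack (Top);
               Top := Top - 1;
         end case;
      end loop;
      return Stack (Top);
   end Run;

   --  (2 + 3) * 4 written in postfix order: 2 3 + 4 *
   Code : constant Program :=
     ((Push, 2), (Push, 3), (Add, 0), (Push, 4), (Multiply, 0));

begin
   --  (2 + 3) * 4 = 20
   Ada.Text_IO.Put_Line (Integer'Image (Run (Code)));
end Stack_Machine_Demo;
```

Pushing a program onto the stack as data, as mentioned above, would need a richer element type than Integer; the sketch leaves that out.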

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Natacha Kerensikova on
On Aug 11, 12:37 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:
> On Wed, 11 Aug 2010 02:43:58 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 10, 5:46 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> > wrote:
> >> Why don't you send it at once?
>
> > As I said, I can't just insert the raw object in the stream,
>
> You can encode it.

Indeed, but to encode anything I need to access it, i.e. having it
live in RAM. That's exactly what my in-memory serialized
representations are for.

> > I need at least to know its size.
>
> No, you don't and you cannot, because the object's size has nothing to do
> with the object representation in the encoding used for the stream. Consider
> two systems using a common vocabulary. Instead of sending objects you send
> the ids, asking the partner to construct an object for this id.

I meant the size of the serialized representation. I can't store or
transmit something when I don't know the size of what is to be stored
or to be transmitted. If you want to transfer id instead of objects,
fine, then the id is the transmitted object and I have to know its
size. Using id is just moving the problem without solving it.

> What is the intended S-expression implementation according to the
> OSI classification?

The primary intended use of this implementation is disk file I/O,
which doesn't fit well into the OSI classification.

When I used its C counterpart over a network, it has always been over
stream sockets (UNIX stream sockets or TCP sockets), so it's clearly
above the transport layer. I've never been able to figure out what is
what on the level 5 and above, but that's where my implementation
would be.

Now real-life S-expressions are used as a part of SPKI and IMAP. I
think IMAP is on level 7, but I'm not sure that tells us where IMAP
S-expressions are; and I haven't found enough information about SPKI to
know where it is, but from its purpose I guess it's probably also on
level 5 or above.

I don't know how this information helps you though.

> >>>> What was the problem then?
>
> >>> The problem is to organize different objects inside a single file. S-
> >>> expression standardize the organization and relations between objects,
> >>> while something else has to be done beyond S-expression to turn
> >>> objects into representations suitable to live in a file.
>
> >>> [...]
>
> >> Yes, I don't see how S-expression might help there. They do not add value,
> >> because the work of serialization or equivalent to serialization is already
> >> done while construction of the expression object.
>
> > There are two things that are to be serialized: objects into atoms,
> > and relations between objects into S-expression-specific stuff. The S-
> > expression object is an unserialized in-memory representation of
> > relations between serialized representations of objects. The writing
> > of an S-expression into a stream is the final serialization-of-
> > relations stage.
>
> It is #2? (see below)

When we are talking about a S-expression object, considering I
understand the word "object" as something that lives in RAM, then yes
we are in #2.

But this isn't required: my proposed Sexp_Stream never actually
constructs an S-expression object, it serializes S-expression stuff
directly into the underlying stream, around the provided
serializations that are atoms.
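
A minimal sketch of that idea, writing list delimiters and verbatim-encoded atoms straight to a stream without building any S-expression object. This is not the actual Sexp_Stream interface, which is not shown in this thread: Open_List, Close_List and Put_Atom are hypothetical names, standard output stands in for the underlying stream, and the atom contents are made up.

```ada
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Text_IO;
with Ada.Text_IO.Text_Streams;

procedure Sexp_Stream_Demo is

   --  Standard output stands in for a file or socket stream.
   Output : constant Ada.Text_IO.Text_Streams.Stream_Access :=
     Ada.Text_IO.Text_Streams.Stream (Ada.Text_IO.Standard_Output);

   procedure Open_List is
   begin
      Character'Write (Output, '(');
   end Open_List;

   procedure Close_List is
   begin
      Character'Write (Output, ')');
   end Close_List;

   --  Only the atom itself has to be fully in memory: its length is
   --  written first, then ':', then its contents.
   procedure Put_Atom (Atom : String) is
      Size : constant String :=
        Ada.Strings.Fixed.Trim
          (Natural'Image (Atom'Length), Ada.Strings.Left);
   begin
      String'Write (Output, Size & ':' & Atom);
   end Put_Atom;

begin
   --  Writes (11:tcp-connect(4:host11:example.org)(4:port2:80))
   Open_List;
   Put_Atom ("tcp-connect");
   Open_List;
   Put_Atom ("host");
   Put_Atom ("example.org");
   Close_List;
   Open_List;
   Put_Atom ("port");
   Put_Atom ("80");
   Close_List;
   Close_List;
   Character'Write (Output, ASCII.LF);
end Sexp_Stream_Demo;
```

The nesting lives only in the order of the calls, so nothing larger than a single atom is ever buffered.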

> >> There are two questions to discuss:
>
> >> 1. The external storage format: S-expressions vs. other
> >> 2. Construction of an intermediate S-expression object in the memory
>
> >> You justified #1 by an argument to legacy. You cannot [re-]use that
> >> argument for #2. (OK, since Ludovic had already done it, you could (:-))
>
> > I don't re-use that argument. And actually if you followed the
> > description of my Sexp_Stream, I don't need a S-expression object in
> > memory, I only need serialized representation of atoms. The rest can
> > be sent directly into a stream.
>
> It is not #2, only #1?

#1 using S-expression storage format is a choice driven by existing
data being in that format and personal taste.

#2 constructing a S-expression object in memory, is something that may
or may not be needed, depending on the application. Parsing a
configuration like my tcp-connect example does not, hence my
proposition of Sexp_Stream which functions without S-expression
objects in memory. My S-expression templates still seem (to me) to
require a S-expression object in memory (to be handed around to the
correct handler after dispatch).

When I proposed my Sexp_Stream, I postponed the design of a S-
expression object library, because it would be built upon the
Sexp_Stream which isn't fixed yet. And depending on Ludovic Brenta's
progress, I might never design my S-expression object and re-use his.

> >>>> Why do you need S-sequence in the memory, while dumping
> >>>> objects directly into files as S-sequences (if you insist on having them)
> >>>> is simpler, cleaner, thinner, faster.
>
> >>> Because I need to examine the S-sequence before writing it to disk, in
> >>> order to have enough information to write S-expression metadata. At
> >>> the very least, I need to know the total size of the atom before
> >>> allowing its first byte to be sent into the file.
>
> >> That does not look like a stream! But this is again about abstraction
> >> layers. Why do you care?
>
> > The "verbatim encoding" of an atom, which is the only one allowed in
> > canonical representation of a S-expression, is defined as follows: a
> > size, represented as the ASCII encoding of the decimal representation
> > of the number of octets in the atom, without leading zero (therefore
> > of variable length); followed by the ASCII character ':'; followed by
> > the octet sequence of the atom.
>
> And TCP requires checksums, port numbers, destination address etc. Do you
> care about them when writing to a socket?

I would if I were implementing a TCP library based on something of
lower OSI-level.

Now I'm writing a S-expression library based on TCP sockets (or disk
streams), so I don't care about TCP details. However I do care about S-
expression details, among which is the "verbatim encoding".

> >>> That sounds like a very nice way of doing it. So in the most common
> >>> case, there will still be a stream, provided by the platform-specific
> >>> socket facilities, which will accept an array-of-octets, and said
> >>> array would have to be created from objects by custom code, right?
>
> >> Yes, if TCP sockets is what you use. There is a hell of other protocols
> >> even on the Ethernet, some of which are not stream-oriented.
>
> > But you were talking about Octet'Read and Octet'Write. Aren't these
> > Ada Stream based stuff?
>
> It is a stream interface. You can hang this interface on something that is
> not a stream, e.g. a text file. (We are hobbling in circles because of
> mixed abstractions.)

And what do you deduce from this?

In the fourth level of quote above, I'm basically asking whether I
correctly understood your idea by reformulating it and asking for
confirmation ("…, right?").

Third level of quote, you seem to agree with a restriction.

Second level of quote, I try to understand the restriction.

And now I still don't know whether the idea in the fourth quote level
is right or wrong, and if it's wrong what is needed to set it right.
That's indeed a sad circle, and I don't even have a clue of what mixed
abstractions have to do with the circle. That's almost the point where
I stop trying to understand and just give up everything, Ada looks way
above my brain's capabilities.

> >> This is what caused the questions. Because if the problem is serialization,
> >> then S-expression does not look good.
>
> > Why? Because it's a partial serialization?
>
> Yes
>
> > Because it serializes stuff you deem as useless?
>
> Yes, it does a structure of nested brackets. It does not serialize objects,
> you need a lot of meat on the carcass. So if the problem is serialization
> (of objects), then S-expression is not yet a solution.

Indeed, because S-expressions assume the serialization of objects
(into atoms) is already solved, and address the problem of serializing
a higher level of information about these objects.

> Operation is executed, no matter by what. PPN is executable. Whatever you
> want to be executed has to be expressed in PPN. The PPN itself can be
> pushed onto the stack if you wanted (=a von Neumann machine). I don't know
> why would you need that stuff (~trampoline), however.

I still haven't found what PPN is, but from your description it looks
similar to my S-expression with semantics provided by an interpreter.

What I need can be basically exemplified as the following S-expression
being sent to a list-of-object objects:
("<h1>Contents of " (title) " list</h1>"
(for-each-item "<h2>" (title) "</h2><p>" (text) "</p>"))

In the first list line, atoms (double-quoted) are taken as-is into the
output HTML stream. (title) is a call to the list's "title" function,
which is substituted by that particular list's title. Then the
"for-each-item" function is called, which sends its arguments to each
item of the list. Here again, following atoms are copied as-is, and
this time (title) is interpreted by item objects and substituted by
the item's title, and then the item's text substitutes (text).

That doesn't sound like an alien use of operations being passed as an
argument to another operation.



Regards,
Natacha