From: Dmitry A. Kazakov on
On Sun, 1 Aug 2010 10:35:17 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 1, 2:53 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> wrote:

>> How can it make sense if the type is unknown? If the type is known, why not
>> state it?
>
> Actually the type is deduced from the context, e.g.
> (tcp-connect (host foo.example) (port 80))

Hmm, why do you consider brackets as separate elements? I mean, more
natural would be "procedural":

tcp-connect (host (foo.example), port (80))

or

tcp-connect (foo.example, 80)

> This means that as far as the S-expression library is concerned, the byte
> sequence read from the file is not typed. Its typing actually has to be
> delayed until the application has enough context to interpret it. Hence
> the need of a typeless chunk of data.

The above is mere syntax, it is not clear why internal/external
representation must be as you described. (Actually, the structures as above
are widely used in compiler construction e.g. syntax tree, Reverse Polish
notation etc.)

There is no such thing as untyped data. The information about the type must
be somewhere. In your example it could be:

type Connect is new Abstract_Node with record
   Host : IP_Address;
   Port : Port_Type;
end record;

So I would derive leaves from some abstract base type and link them into an
acyclic graph. It is relatively simple to do using either standard Ada
containers or other libraries containing trees and/or graphs.
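
A minimal sketch of that idea, assuming hypothetical definitions for
IP_Address and Port_Type (the post does not define them) and a made-up
Describe operation. It only shows a leaf derived from an abstract base
type and dispatched on through a class-wide view; it is not a full
tree or graph.

```ada
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Node_Sketch is
   use Ada.Strings.Unbounded;

   package Nodes is
      --  Hypothetical stand-ins for the types named in the post.
      subtype IP_Address is Unbounded_String;
      type Port_Type is range 0 .. 65_535;

      --  Abstract base type from which every leaf is derived.
      type Abstract_Node is abstract tagged null record;
      procedure Describe (Node : Abstract_Node) is abstract;

      type Connect is new Abstract_Node with record
         Host : IP_Address;
         Port : Port_Type;
      end record;
      overriding procedure Describe (Node : Connect);
   end Nodes;

   package body Nodes is
      overriding procedure Describe (Node : Connect) is
      begin
         Ada.Text_IO.Put_Line
           ("connect to " & To_String (Node.Host)
            & ", port" & Port_Type'Image (Node.Port));
      end Describe;
   end Nodes;

   Item : constant Nodes.Connect :=
     (Host => To_Unbounded_String ("foo.example"), Port => 80);
begin
   --  Dispatching call through the class-wide view.
   Nodes.Describe (Nodes.Abstract_Node'Class (Item));
end Node_Sketch;
```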

>>> Though it looks like a fine Ada API (at least to my eyes), I have
>>> absolutely no idea about how to implement the library. How to define
>>> the application-opaque Sexp_Atom type?
>>
>> It is not clear why you need such a type. What are its properties, and
>> what is it for, given that the application needs to convert it anyway?
>
> Well, I imagined making such a new type to distinguish 'raw data coming
> from a S-expression to be interpreted' from other raw data types.

But you know the type if you are going to interpret raw data. It must be
somewhere, hard-coded or dynamically stored/restored.

BTW, Ada supports I/O of types (i.e type tags) implicitly in the form of
already mentioned stream attributes, but also explicitly by external type
tag representations (RM 3.9(7/2)) and generic dispatching constructors (RM
3.9(18/1)).
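
As a small illustration of the explicit route, the sketch below prints
the external tag of an object seen through a class-wide view, using
Ada.Tags.External_Tag. The Nodes package and its types are
hypothetical, and the exact string printed is implementation-defined
unless an External_Tag representation clause is given.

```ada
with Ada.Tags;
with Ada.Text_IO;

procedure Tag_Sketch is

   package Nodes is
      --  Hypothetical types, for illustration only.
      type Abstract_Node is abstract tagged null record;
      type Connect is new Abstract_Node with null record;
   end Nodes;

   --  The tag travels with the object, so the specific type can be
   --  recovered from a class-wide view.
   procedure Show (Node : Nodes.Abstract_Node'Class) is
   begin
      Ada.Text_IO.Put_Line (Ada.Tags.External_Tag (Node'Tag));
   end Show;

   Item : Nodes.Connect;
begin
   Show (Item);
end Tag_Sketch;
```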

> I thought this was the strong type safety to prevent (de)serialization
> procedure from trying to interpret just any chunk of memory.

No, actually it eases serialization because you can define
Serialize/Unserialize operations on the type.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Natacha Kerensikova on
On Aug 1, 8:25 pm, Jeffrey Carter <spam.jrcarter....(a)spam.not.acm.org>
wrote:
> You might very well be able to use something like:
>
> [snip]

Thanks a lot for the example, it really looks like what I'm used to
(at least in C).

I have to admit I don't really grasp the difference between the vector
you use and the Storage_Array, but it seems I have to research it by
myself before asking here.

> If you can use unbounded strings as Brenta suggested, instead of an unbounded
> array of bytes (Storage_Element), then this would be even simpler.

Actually I'm a bit reluctant to use Ada strings for atoms, because I'm
afraid it might somehow interpret the data (e.g. locale or encoding
related stuff, or trouble with NUL inherited from C). I occasionally
include binary data in S-expressions, and while I keep it on the disk
as a text file (using hexadecimal or base-64 encoding), it is the S-
expression library's responsibility to load it into memory as the
original binary data.

On the other hand, most of my atoms are indeed strings, and Character
definition from RM A.1(35) looks exactly like a perfect mapping to bytes.
So if Ada strings have no issue with embedded NUL or non-graphics
character, and if binary data can be losslessly recovered once stored
into an Ada string, it could be the best type for atoms. It will
probably be a while before I reach the level of reading the Reference
Manual cover-to-cover, does anyone know whether those "if"s are
guaranteed by the standard?


Thanks for your help,
Natacha
From: Ludovic Brenta on
Natacha Kerensikova writes:
> On Aug 1, 8:25 pm, Jeffrey Carter <spam.jrcarter....(a)spam.not.acm.org>
> wrote:
>> You might very well be able to use something like:
>>
>> [snip]
>
> Thanks a lot for the example, it really looks like what I'm used to
> (at least in C).
>
> I have to admit I don't really grasp the difference between the vector
> you use and the Storage_Array, but it seems I have to research it by
> myself before asking here.

The vector manages its own memory, you can grow and shrink it at will.
With a Storage_Array you must do that manually by allocating,
reallocating and freeing as needed.

>> If you can use unbounded strings as Brenta suggested, instead of an
>> unbounded array of bytes (Storage_Element), then this would be even
>> simpler.
>
> Actually I'm a bit reluctant to use Ada strings for atoms, because I'm
> afraid it might somehow interpret the data (e.g. locale or encoding
> related stuff, or trouble with NUL inherited from C).

No, they won't. Ada does not need NUL since it has proper arrays.
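
A quick way to check this for yourself is a sketch like the following.
It builds a String holding a NUL and a character at position 255, then
prints the length and the position of every character. The procedure
name is made up.

```ada
with Ada.Characters.Latin_1;
with Ada.Text_IO;

procedure Nul_Sketch is
   --  A String holding an embedded NUL and a non-ASCII character.
   Data : constant String :=
     "ab" & Ada.Characters.Latin_1.NUL & "cd" & Character'Val (255);
begin
   --  The length comes from the array bounds, not from a terminator.
   Ada.Text_IO.Put_Line ("length:" & Integer'Image (Data'Length));

   for I in Data'Range loop
      Ada.Text_IO.Put (Integer'Image (Character'Pos (Data (I))));
   end loop;
   Ada.Text_IO.New_Line;
end Nul_Sketch;
```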

> I occasionally include binary data in S-expressions, and while I keep
> it on the disk as a text file (using hexadecimal or base-64 encoding),
> it is the S- expression library's responsibility to load it into
> memory as the original binary data.

I would still store the S-expressions in memory as Unbounded_Strings,
read straight from the file (i.e. in the hexadecimal or base-64
encoding). Then I would provide conversion subprograms for each data
type as needed on top of that.

> On the other hand, most of my atoms are indeed strings, and Character
> definition from RM A.1(35) looks exactly like a perfect mapping to bytes.
> So if Ada strings have no issue with embedded NUL or non-graphics
> character, and if binary data can be losslessly recovered once stored
> into an Ada string, it could be the best type for atoms. It will
> probably be a while before I reach the level of reading the Reference
> Manual cover-to-cover, does anyone know whether those "if"s are
> guaranteed by the standard?

While Ada strings indeed have no problems with embedded NULs or
non-ASCII characters (the character set is Latin-1, not ASCII), it is
unwise to use character strings to store things that are not characters.

Like I said, you should see the character representation as the
low-level (storage) representation, then provide conversion subprograms
for the few non-String data types that you need.
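
One such conversion subprogram could look like the sketch below, which
assumes the atom is stored as hexadecimal text and turns it into a
Storage_Array on request. Digit and Hex_To_Bytes are hypothetical
names, and malformed input simply raises Constraint_Error here.

```ada
with Ada.Text_IO;
with System.Storage_Elements;

procedure Hex_Sketch is
   use System.Storage_Elements;

   --  Hypothetical helper: value of one hexadecimal digit.
   function Digit (C : Character) return Storage_Element is
   begin
      case C is
         when '0' .. '9' =>
            return Storage_Element (Character'Pos (C) - Character'Pos ('0'));
         when 'a' .. 'f' =>
            return Storage_Element
              (Character'Pos (C) - Character'Pos ('a') + 10);
         when 'A' .. 'F' =>
            return Storage_Element
              (Character'Pos (C) - Character'Pos ('A') + 10);
         when others =>
            raise Constraint_Error;
      end case;
   end Digit;

   --  Hypothetical conversion: two hexadecimal digits per byte.
   function Hex_To_Bytes (Text : String) return Storage_Array is
      Result : Storage_Array (1 .. Text'Length / 2);
      Pos    : Integer := Text'First;
   begin
      if Text'Length mod 2 /= 0 then
         raise Constraint_Error;
      end if;
      for I in Result'Range loop
         Result (I) := 16 * Digit (Text (Pos)) + Digit (Text (Pos + 1));
         Pos := Pos + 2;
      end loop;
      return Result;
   end Hex_To_Bytes;

   Bytes : constant Storage_Array := Hex_To_Bytes ("00ff41");
begin
   for I in Bytes'Range loop
      Ada.Text_IO.Put (Storage_Element'Image (Bytes (I)));
   end loop;
   Ada.Text_IO.New_Line;
end Hex_Sketch;
```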

--
Ludovic Brenta.
From: Dmitry A. Kazakov on
On Sun, 01 Aug 2010 21:53:58 +0200, Ludovic Brenta wrote:

> I would still store the S-expressions in memory as Unbounded_Strings,
> read straight from the file (i.e. in the hexadecimal or base-64
> encoding).

String is preferable to Unbounded_String in all cases where the string
content is not changed once the string is created. This is 90% of all
cases, including this one.
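
To illustrate the difference, here is a small sketch in which
Read_Atom is a hypothetical stand-in for whatever the parser returns.
A String object takes its bounds from its initial value, so an atom
that is never modified needs nothing more; Unbounded_String only pays
off when the content grows after creation.

```ada
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure String_Sketch is
   use Ada.Strings.Unbounded;

   --  Hypothetical stand-in for a parser call returning one atom.
   function Read_Atom return String is
   begin
      return "foo.example";
   end Read_Atom;

   --  The bounds are taken from the value returned by Read_Atom.
   Atom : constant String := Read_Atom;

   --  Only needed when the content changes after creation.
   Buffer : Unbounded_String;
begin
   Append (Buffer, "foo");
   Append (Buffer, ".example");

   Ada.Text_IO.Put_Line (Atom);
   Ada.Text_IO.Put_Line (To_String (Buffer));
end String_Sketch;
```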

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Jeffrey Carter on
On 08/01/2010 12:43 PM, Natacha Kerensikova wrote:
>
> I have to admit I don't really grasp the difference between the vector
> you use and the Storage_Array, but it seems I have to research it by
> myself before asking here.

Storage_Array is a simple array; each object of the type has a fixed size. So
to have a dynamic size, you'd have to do dynamic allocation and memory
management yourself. Vectors is an unbounded array abstraction, with all
allocation and memory management hidden from the client, and tested by many users.
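
A minimal sketch of the contrast, assuming an instantiation named
Byte_Vectors (the name is made up): the Storage_Array object keeps the
bounds it was declared with, while the vector grows and shrinks through
Append and Delete_Last.

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;
with System.Storage_Elements;

procedure Vector_Sketch is
   use System.Storage_Elements;

   --  Hypothetical instantiation: an unbounded sequence of bytes.
   package Byte_Vectors is new Ada.Containers.Vectors
     (Index_Type   => Positive,
      Element_Type => Storage_Element);

   --  The bounds of this object never change.
   Fixed : constant Storage_Array (1 .. 4) := (others => 0);

   --  The vector starts empty and manages its own storage.
   Atom : Byte_Vectors.Vector;
begin
   for Value in Storage_Element range 1 .. 10 loop
      Atom.Append (Value);
   end loop;
   Atom.Delete_Last;

   Ada.Text_IO.Put_Line ("array length:" & Integer'Image (Fixed'Length));
   Ada.Text_IO.Put_Line
     ("vector length:" & Ada.Containers.Count_Type'Image (Atom.Length));
end Vector_Sketch;
```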

> On the other hand, most of my atoms are indeed strings, and Character
> definition from RM A.1(35) looks exactly like a perfect mapping to bytes.
> So if Ada strings have no issue with embedded NUL or non-graphics
> character, and if binary data can be losslessly recovered once stored
> into an Ada string, it could be the best type for atoms. It will
> probably be a while before I reach the level of reading the Reference
> Manual cover-to-cover, does anyone know whether those "if"s are
> guaranteed by the standard?

String is just an array of Character; there's nothing special about it. No
characters have special significance. Unbounded_String is just an unbounded
variant of String, as the vector would be an unbounded variant of Storage_Array.
However, if you store non-Character data, then it would be more appropriate to
use a vector of Storage_Element. Possibly you might want to recognize your
extensive use of strings by having 3 variants, one for String, one for List, and
one for anything else.
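
One possible shape for such a three-variant type is sketched below.
All names (Node_Kind, Node, Node_Access and the two vector
instantiations) are assumptions, and the nodes allocated here are
never freed, which a real library would have to handle.

```ada
with Ada.Containers.Vectors;
with Ada.Strings.Unbounded;
with Ada.Text_IO;
with System.Storage_Elements;

procedure Variant_Sketch is
   use Ada.Strings.Unbounded;
   use type System.Storage_Elements.Storage_Element;

   package Byte_Vectors is new Ada.Containers.Vectors
     (Index_Type   => Positive,
      Element_Type => System.Storage_Elements.Storage_Element);

   type Node_Kind is (Text_Atom, Binary_Atom, List);

   --  Incomplete declaration so that a list can hold other nodes.
   type Node (Kind : Node_Kind);
   type Node_Access is access Node;

   package Node_Vectors is new Ada.Containers.Vectors
     (Index_Type   => Positive,
      Element_Type => Node_Access);

   --  One variant for String, one for List, one for anything else.
   type Node (Kind : Node_Kind) is record
      case Kind is
         when Text_Atom   => Text     : Unbounded_String;
         when Binary_Atom => Bytes    : Byte_Vectors.Vector;
         when List        => Children : Node_Vectors.Vector;
      end case;
   end record;

   Root : Node (Kind => List);
   Name : constant Node_Access :=
     new Node'(Kind => Text_Atom,
               Text => To_Unbounded_String ("tcp-connect"));
   Blob : constant Node_Access := new Node (Kind => Binary_Atom);
begin
   Blob.Bytes.Append (16#FF#);
   Root.Children.Append (Name);
   Root.Children.Append (Blob);

   Ada.Text_IO.Put_Line (To_String (Root.Children.Element (1).Text));
   Ada.Text_IO.Put_Line
     ("children:"
      & Ada.Containers.Count_Type'Image (Root.Children.Length));
end Variant_Sketch;
```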

--
Jeff Carter
"People called Romanes, they go the house?"
Monty Python's Life of Brian
79
