From: Carlie Coats on
nmm1(a)cam.ac.uk wrote:
> In article <2poh07-rui.ln1(a)ntp.tmsw.no>,
> Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>[snip...] it
>> probably isn't such a good idea to allow programmers to be surprised by
>> the difference between one size of data objects and another...
>> :-(
>
> [snip...we need radical changes in the
> language architectures, so that they don't rely on separate bytes
> being independent.
>
> I can think of how it would be done, but not starting from C/C++
> Java or even Fortran...
>
> Actually, I am being too hard on Fortran. I think that it could be
> done starting from modern Fortran, but the restrictions would be such
> that I don't think that Fortran programmers would accept the result.

As one of those Fortran programmers, I think my biggest problem
is trying to deal with external data formats designed by idiots.
For an extreme example, have a look at the official World Meteorological
Organization standard for data interchange, GRIB. A readable starting
point would be <http://dss.ucar.edu/docs/formats/grib/gribdoc/>.
It has the most evil misalignment I've seen.

If I were to want to become the laughing stock of UseNet, one of
the easiest ways would be to post this spec on sci.data.formats
and comp.arch, and suggest it as a proposed standard ;-(

I have no influence here (USA) to get this changed. It is such a
sacred cow that it is hard even to get it recognized as evil by the
general meteorology community here. Maybe you have more influence
in the UK, Nick?

-- Carlie Coats
From: nmm1 on
In article <7pp7j0F7nvU1(a)mid.individual.net>,
Carlie Coats <carlie(a)jyarborough.com> wrote:
>>
>> [snip...we need radical changes in the
>> language architectures, so that they don't rely on separate bytes
>> being independent.
>>
>> I can think of how it would be done, but not starting from C/C++
>> Java or even Fortran...
>>
>> Actually, I am being too hard on Fortran. I think that it could be
>> done starting from modern Fortran, but the restrictions would be such
>> that I don't think that Fortran programmers would accept the result.
>
>As one of those Fortran programmers, I think my biggest problem
>is trying to deal with external data formats designed by idiots.
>For an extreme example, have a look at the official World Meteorological
>Organization standard for data interchange, GRIB. A readable starting
>point would be <http://dss.ucar.edu/docs/formats/grib/gribdoc/>.
>It has the most evil misalignment I've seen.

Alignment? What's that? :-(

I agree that is a problem, but it's actually a soluble one - not that
more than an infinitesimal proportion of programs do solve it. The
modern Fortran solution is TRANSFER, so it COULD be done fairly
cleanly in well isolated encoding/decoding modules.

My usual bugbear about external data formats designed by idiots is
the assumptions they make (but don't require) and the fact that they
don't allow for error detection on systems where they are false.
Have you ever imported IEEE denormals onto a system that doesn't
support them?
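
One way to catch the denormal problem at import time is to inspect the bit pattern before it ever becomes a native float. The following is a minimal sketch, assuming the external format stores big-endian IEEE 754 binary32 values; the function name `classify_float32` is hypothetical.

```python
import struct

def classify_float32(raw: bytes) -> str:
    """Classify a big-endian IEEE 754 binary32 bit pattern.

    Only integer operations are used, so the check works even where the
    floating-point hardware would silently flush denormals to zero.
    """
    (bits,) = struct.unpack(">I", raw)
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0:
        return "zero" if fraction == 0 else "denormal"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "nan"
    return "normal"
```

A decoder could reject or flag any value classified as "denormal" rather than letting the target system misread it.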

>I have no influence here (USA) to get this changed. It is such a
>sacred cow that it is hard even to get it recognized as evil by the
>general meteorology community here. Maybe you have more influence
>in the UK, Nick?

No. The trouble is that the 'existing data and code' claim (which
is often false) is used to block even the simplest defect fixing.


Regards,
Nick Maclaren.
From: "Andy "Krazy" Glew" on
Carlie Coats wrote:
> For an extreme example, have a look at the official World Meteorological
> Organization standard for data interchange, GRIB. A readable starting
> point would be <http://dss.ucar.edu/docs/formats/grib/gribdoc/>.
> It has the most evil misalignment I've seen.
>
> If I were to want to become the laughing stock of UseNet, one of
> the easiest ways would be to post this spec on sci.data.formats
> and comp.arch, and suggest it as a proposed standard ;-(

I'm not afraid of being the laughing stock of USEnet and comp.arch, so I
will say: misalignment isn't evil.

Anything that we can do to improve the amount of data that fits in a
given amount of cache is good. Avoiding wasting cache on padding to
align things to cache boundaries is good.

Handling misalignment is just muxes or, if you don't want to spend
hardware on it, just a few instructions to shift and align.
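
The "few instructions to shift and align" can be sketched in software as follows. This is an illustration only, assuming memory is readable solely as aligned 32-bit big-endian words; the name `load_u32_unaligned` is hypothetical.

```python
WORD_BYTES = 4

def load_u32_unaligned(words, byte_offset):
    """Read a big-endian 32-bit value starting at any byte offset.

    `words` is a list of aligned 32-bit words. A misaligned read needs two
    aligned loads, two shifts and an OR.
    """
    index, shift_bytes = divmod(byte_offset, WORD_BYTES)
    if shift_bytes == 0:
        return words[index]                          # already aligned
    shift = 8 * shift_bytes
    high = (words[index] << shift) & 0xFFFFFFFF      # tail of the first word
    low = words[index + 1] >> (32 - shift)           # head of the second word
    return high | low
```

With words `[0x00112233, 0x44556677]`, a read at byte offset 1 combines the last three bytes of the first word with the first byte of the second.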

The desire to avoid misalignment is just another leftover of the failed
RISC revolution. Like delayed branches.

---

Of course, I am exaggerating. It's all a question of where you are. If
processors are fast relative to memory, allow misalignment. If
processors are slow relative to memory, deprecate misalignment.

---

There is a possible middle ground: let your exchange format be a
compressed version of aligned datastructures. I'm assuming, of course,
that compress(padded(datastructure)) is the same size as
compress(unpadded(datastructure)).
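
That assumption is easy to measure for a given record layout. Below is a rough sketch using zlib; the record layout (a 1-byte flag, a 4-byte value, a 1-byte kind) and the helper names are invented for illustration, and the outcome will depend on the data and the compressor.

```python
import struct
import zlib

def build_records(count):
    """Return the same hypothetical records packed without and with padding."""
    unpadded = bytearray()
    padded = bytearray()
    for i in range(count):
        flag, value, kind = i % 7, 1000 + i, i % 3
        # "<" selects standard sizes with no implicit alignment.
        unpadded += struct.pack("<BIB", flag, value, kind)     # 6 bytes per record
        # "3x" inserts three explicit zero pad bytes after each 1-byte field.
        padded += struct.pack("<B3xIB3x", flag, value, kind)   # 12 bytes per record
    return bytes(unpadded), bytes(padded)

def compressed_sizes(count=1000):
    """Return (compressed unpadded size, compressed padded size) in bytes."""
    unpadded, padded = build_records(count)
    return len(zlib.compress(unpadded)), len(zlib.compress(padded))

if __name__ == "__main__":
    small, large = compressed_sizes()
    print("compressed unpadded:", small, "compressed padded:", large)
```

Running it on representative data shows how close the two sizes actually are for a particular format.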
From: Terje Mathisen on
Carlie Coats wrote:
> As one of those Fortran programmers, I think my biggest problem
> is trying to deal with external data formats designed by idiots.
> For an extreme example, have a look at the official World Meteorological
> Organization standard for data interchange, GRIB. A readable starting
> point would be <http://dss.ucar.edu/docs/formats/grib/gribdoc/>.
> It has the most evil misalignment I've seen.

The 'most evil'?

The only immediately ugly parts I noticed were the 24-bit length fields
as well as the decision to define an incompatible 32-bit float format,
but as long as such files should be unpacked by portable (i.e.
endian-agnostic) code, you have to work with bytes (octets) anyway, right?
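
Working with octets makes a 24-bit length field straightforward to read on any host. Here is a minimal sketch, assuming the field is stored most significant octet first; the helper name `read_u24_be` is hypothetical.

```python
def read_u24_be(buf: bytes, offset: int = 0) -> int:
    """Read an unsigned 24-bit big-endian integer from three octets.

    The result is assembled from individual bytes, so it does not depend on
    the host's byte order or alignment rules.
    """
    b0, b1, b2 = buf[offset:offset + 3]   # raises ValueError if fewer than 3 bytes remain
    return (b0 << 16) | (b1 << 8) | b2
```

The same byte-at-a-time approach extends to any odd-sized field.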
>
> If I were to want to become the laughing stock of UseNet, one of
> the easiest ways would be to post this spec on sci.data.formats
> and comp.arch, and suggest it as a proposed standard ;-(

For a really evil data format take a look at the CABAC part of h264.
>
> I have no influence here (USA) to get this changed. It is such a
> sacred cow that it is hard even to get it recognized as evil by the
> general meteorology community here. Maybe you have more influence
> in the UK, Nick?

Is this ever performance-critical?

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: Carlie Coats on
Terje Mathisen wrote:
> Carlie Coats wrote:
>> As one of those Fortran programmers, I think my biggest problem
>> is trying to deal with external data formats designed by idiots.
>> For an extreme example, have a look at the official World Meteorological
>> Organization standard for data interchange, GRIB. A readable starting
>> point would be <http://dss.ucar.edu/docs/formats/grib/gribdoc/>.
>> It has the most evil misalignment I've seen.
>
> The 'most evil'?
>
> The only immediately ugly parts I noticed were the 24-bit length fields
> as well as the decision to define an incompatible 32-bit float format,
> but as long as such files should be unpacked by portable (i.e.
> endian-agnostic) code, you have to work with bytes (octets) anyway, right?


No.

You wind up with things like

11-bit-int(A*x+B) data compression, where A and B are IBM-360-format
floats.

There are *lots* of better ways of doing compression than that
(and that let me use a (de-facto, at least) standard library for
the task).
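
To give a sense of the work involved, here is a rough sketch of decoding such a field. It follows only the description above, n = int(A*x+B); the real GRIB scaling rules should be taken from the spec. The function names are hypothetical, and the values are assumed to be packed most significant bit first with no gaps.

```python
def ibm360_to_float(raw: bytes) -> float:
    """Convert a 4-byte IBM System/360 single-precision hex float.

    Layout: 1 sign bit, 7-bit excess-64 exponent (a power of 16), and a
    24-bit fraction interpreted as fraction / 2**24.
    """
    bits = int.from_bytes(raw, "big")
    sign = -1.0 if bits >> 31 else 1.0
    exponent = (bits >> 24) & 0x7F
    fraction = bits & 0xFFFFFF
    return sign * (fraction / 2**24) * 16.0 ** (exponent - 64)

def unpack_bits(data: bytes, width: int, count: int) -> list:
    """Extract `count` unsigned integers of `width` bits, MSB first."""
    stream = int.from_bytes(data, "big")
    total_bits = 8 * len(data)
    mask = (1 << width) - 1
    values = []
    for i in range(count):
        shift = total_bits - width * (i + 1)
        if shift < 0:
            raise ValueError("not enough data for the requested count")
        values.append((stream >> shift) & mask)
    return values

def decode_values(data: bytes, count: int, a_raw: bytes, b_raw: bytes) -> list:
    """Invert n = int(A*x + B) approximately: x = (n - B) / A."""
    a = ibm360_to_float(a_raw)
    b = ibm360_to_float(b_raw)
    return [(n - b) / a for n in unpack_bits(data, 11, count)]
```

None of the 11-bit values start on a byte boundary after the first, which is why the whole buffer is treated as a single bit stream here.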