From: Mihai N. on

> German Telephone Book sorting (raw sorting, such as you might get with
> strcmp, is not correct, because it is based on the old ASCII-7
> translation, where ÄÖÜ followed Z because in German ASCII-7

It kind of shows that the system is old and was adapted for primitive
electronic processing.
Another hint of that is the fact that the umlaut vowels are equivalent
to the ?e doubles (ä == ae, ö == oe, ü == ue)
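
The equivalence can be sketched as a sort key: expand each umlaut to its "vowel + e" pair, then compare the expanded strings. This is only an illustration of the phone-book rule (often cited as DIN 5007-2). The names phoneBookKey and phoneBookSort are hypothetical, the expansion of ß to "ss" is an assumption, and tie-breaking between "Mueller" and "Müller" is left to the input order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: builds a comparison key in which each umlaut is
// expanded to "vowel + e" and ASCII letters are lower-cased.
// Assumption: ß expands to "ss"; all other characters pass through unchanged.
std::u32string phoneBookKey(const std::u32string& name) {
    std::u32string key;
    for (char32_t c : name) {
        switch (c) {
            case U'\u00C4': case U'\u00E4': key += U"ae"; break;  // Ä ä
            case U'\u00D6': case U'\u00F6': key += U"oe"; break;  // Ö ö
            case U'\u00DC': case U'\u00FC': key += U"ue"; break;  // Ü ü
            case U'\u00DF':                 key += U"ss"; break;  // ß
            default:
                if (c >= U'A' && c <= U'Z') c += U'a' - U'A';     // ASCII lower-case
                key += c;
        }
    }
    return key;
}

// Sorts names by their expanded keys instead of by raw code points.
// stable_sort keeps the input order for names whose keys are equal.
void phoneBookSort(std::vector<std::u32string>& names) {
    std::stable_sort(names.begin(), names.end(),
        [](const std::u32string& a, const std::u32string& b) {
            return phoneBookKey(a) < phoneBookKey(b);
        });
}
```

With this key, "Müller" compares as "mueller" and lands between "Mua" and "Mz". A raw code-point comparison would put it after "Mz", because ü has a higher code point than z.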


--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email

From: Joseph M. Newcomer on
See below...
On Fri, 21 May 2010 14:37:19 -0500, Peter Olcott <NoSpam(a)OCR4Screen.com> wrote:

>On 5/21/2010 2:11 PM, Joseph M. Newcomer wrote:
>> See below...
>> On Thu, 20 May 2010 20:20:24 -0500, Peter Olcott<NoSpam(a)OCR4Screen.com> wrote:
>>
>>> A more accurate statement might be something like unmeasured performance
>>> estimates are most often very inaccurate. It is also probably true that
>>> faster methods can often be discerned from much slower (at least an
>>> order of magnitude) methods without measurement.
>> ****
>> You are still confusing design and implementation.
>> ****
>
>The way that I do design I start with broad goals that I want to achieve
>and end up with nearly correct code as my most detailed level of design.
>I progress from the broad goals through very many levels of increasing
>specificity using a hierarchy of increasing specificity.
****
But I can achieve correct code for a given design that is substantially slower, so I fail
to see how a design specifies the performance.
****
>
>So I am not confusing design with implementation, implementation is the
>most detailed level of design within a continuum of increasing
>specificity from broad goals to working code.
>
>Only about 3% of my time is spent on debugging, with another 2% on
>testing. The quickest way to complete any very complex system is to slow
>down and carefully plan every single step.
****
What does that have to do with the issue of non-executable design specifications being
"faster" than alternative design specifications?

The specification of a lexer is: looking at the current state and the current character,
determine a next state. Place the state machine in that state, advance to the next input
character, and repeat until an error state or final state is achieved.

I fail to see how performance can be specified in this abstract specification. I could
use a CMap, virtual methods of classes, or linear table lookup to implement code that met
the specification exactly, and there would be a wide range of performance variation in
these different realizations. Yet every one of them would meet the specification of a
DFA. If your specification starts talking about table layouts in memory, it is no longer
a specification, but an implementation discussion.
joe
****
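
As a rough illustration of the point above, here is one abstract DFA loop driven by two different transition lookups. The three-state digit recognizer, the names TableDfa, MapDfa, and accepts, and the use of std::map as a stand-in for CMap are all assumptions made for this sketch. Nothing here is measured.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical DFA: accepts one or more ASCII digits.
enum State { Start, InNumber, Error };

// Realization 1: dense table indexed by [state][byte].
struct TableDfa {
    unsigned char table[2][256];  // rows for Start and InNumber only
    TableDfa() {
        for (int s = 0; s < 2; ++s)
            for (int c = 0; c < 256; ++c)
                table[s][c] = (c >= '0' && c <= '9') ? InNumber : Error;
    }
    State step(State s, unsigned char c) const {
        return static_cast<State>(table[s][c]);
    }
};

// Realization 2: associative lookup keyed by (state, byte).
// Any transition missing from the map is treated as an error.
struct MapDfa {
    std::map<std::pair<int, int>, State> edges;
    MapDfa() {
        for (int c = '0'; c <= '9'; ++c) {
            edges[std::make_pair(int(Start), c)] = InNumber;
            edges[std::make_pair(int(InNumber), c)] = InNumber;
        }
    }
    State step(State s, unsigned char c) const {
        auto it = edges.find(std::make_pair(int(s), int(c)));
        return it == edges.end() ? Error : it->second;
    }
};

// The specification itself: from the current state and current character,
// determine the next state; repeat until an error state or end of input.
template <class Dfa>
bool accepts(const Dfa& dfa, const std::string& input) {
    State s = Start;
    for (unsigned char c : input) {
        s = dfa.step(s, c);
        if (s == Error) return false;
    }
    return s == InNumber;  // final state reached
}
```

Both realizations accept and reject exactly the same strings, so both satisfy the specification. Only the cost of each step differs, and the abstract loop says nothing about that.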
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Joseph M. Newcomer on
The ae, oe, and ue variants are transliterations into alphabets that do not support
accented characters. They were common when all we had was American ASCII-7 and needed to
encode German names (not to mention Dutch, Norwegian, Swedish, Russian, etc. names). Did
you know that Tchaikovsky and Chebychev have names that sort nearby in Russian? The Tchai
and the Che represent the contemporary English transliterations of their names at the
times they were working.

My maternal great-grandfather was Müller. But we have documents that spell his name
(typewritten) as Mueller, and my grandmother's name was "Miller".
joe

On Fri, 21 May 2010 23:51:22 -0700, "Mihai N." <nmihai_year_2000(a)yahoo.com> wrote:

>
>> German Telephone Book sorting (raw sorting, such as you might get with
>> strcmp, is not correct, because it is based on the old ASCII-7
>> translation, where ÄÖÜ followed Z because in German ASCII-7
>
>It kind of shows that the system is old and was adapted for primitive
>electronic processing.
>Another hint of that is the fact that the umlaut vowels are equivalent
>to the ?e doubles (ä == ae, ö == oe, ü == ue)
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Peter Olcott on
On 5/22/2010 4:57 AM, Joseph M. Newcomer wrote:
> See below...
> On Fri, 21 May 2010 14:37:19 -0500, Peter Olcott<NoSpam(a)OCR4Screen.com> wrote:
>

>> The way that I do design I start with broad goals that I want to achieve
>> and end up with nearly correct code as my most detailed level of design.
>> I progress from the broad goals through very many levels of increasing
>> specificity using a hierarchy of increasing specificity.
> ****
> But I can achieve correct code for a given design that is substantially slower, so I fail
> to see how a design specifies the performance.

You can't achieve correct code that implements one of my designs that is
substantially slower because my final design is code. The design that I
posted did not leave enough leeway to really screw it up unless very
creative thought was put into screwing it up intentionally.

> ****
>>
>> So I am not confusing design with implementation, implementation is the
>> most detailed level of design within a continuum of increasing
>> specificity from broad goals to working code.
>>
>> Only about 3% of my time is spent on debugging, with another 2% on
>> testing. The quickest way to complete any very complex system is to slow
>> down and carefully plan every single step.
> ****
> What does that have to do with the issue of non-executable design specifications being
> "faster" than alternative design specifications?
>
> The specification of a lexer is: looking at the current state and the current character,
> determine a next state. Place the state machine in that state, advance to the next input
> character, and repeat until an error state or final state is achieved.
>
> I fail to see how performance can be specified in this abstract specification. I could
> use a CMap, virtual methods of classes, or linear table lookup to implement code that met
> the specification exactly, and there would be a wide range of performance variation in

I specified a switch statement and a state transition table in the design.

I specified using twelve ActionCodes in a switch statement and provided
the ActionCodes. I also provided eight states to be used in a state
transition matrix and provided these states and the input values within
these states and the corresponding actions for each input value.

State 0
00-7F ASCII
C2-DF goto State 1 // Two Byte
E0-EF goto State 2 // Three Byte
F0-F4 goto State 4 // Four Byte
else Error
State 1
80-BF
else Error
State 2
80-BF goto State 3
else Error
State 3
80-BF
else Error
State 4
80-BF goto State 5
else Error
State 5
80-BF goto State 6
else Error
State 6
80-BF goto State 7
else Error
State 7
80-BF
else Error

// Holds ActionCodes Indexed by NextState and Data[N]
uint8 States[256][8];

// This is the input data to be transformed
std::vector<uint8> Data; // Last byte holds sentinel value 11

Twelve ActionCodes
00 InvalidByteError
01 FirstByteOfOneByte
02 FirstByteOfTwoBytes
03 FirstByteOfThreeBytes
04 FirstByteOfFourBytes
05 SecondByteOfTwoBytes
06 SecondByteOfThreeBytes
07 SecondByteOfFourBytes
08 ThirdByteOfThreeBytes
09 ThirdByteOfFourBytes
10 FourthByteOfFourBytes
11 OutOfData (Sentinel)

Within the context of this design there are few correct implementations.
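
A minimal sketch of how a table of ActionCodes and a switch statement might fit together. It is not the code from the design itself, and it makes several labeled assumptions. The table is laid out as [state][byte] rather than States[256][8]. The loop runs over the vector's size instead of testing for the OutOfData sentinel. Only states 0 through 6 are used, since a four-byte UTF-8 sequence has three continuation bytes. Utf8Table and isWellFormed are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum ActionCode : std::uint8_t {
    InvalidByteError, FirstByteOfOneByte, FirstByteOfTwoBytes,
    FirstByteOfThreeBytes, FirstByteOfFourBytes, SecondByteOfTwoBytes,
    SecondByteOfThreeBytes, SecondByteOfFourBytes, ThirdByteOfThreeBytes,
    ThirdByteOfFourBytes, FourthByteOfFourBytes, OutOfData
};

// Holds one ActionCode per (state, input byte) pair.
struct Utf8Table {
    std::uint8_t action[7][256];
    Utf8Table() {
        for (auto& row : action)
            for (auto& a : row) a = InvalidByteError;  // "else Error"
        for (int b = 0x00; b <= 0x7F; ++b) action[0][b] = FirstByteOfOneByte;
        for (int b = 0xC2; b <= 0xDF; ++b) action[0][b] = FirstByteOfTwoBytes;
        for (int b = 0xE0; b <= 0xEF; ++b) action[0][b] = FirstByteOfThreeBytes;
        for (int b = 0xF0; b <= 0xF4; ++b) action[0][b] = FirstByteOfFourBytes;
        for (int b = 0x80; b <= 0xBF; ++b) {           // continuation bytes
            action[1][b] = SecondByteOfTwoBytes;
            action[2][b] = SecondByteOfThreeBytes;
            action[3][b] = ThirdByteOfThreeBytes;
            action[4][b] = SecondByteOfFourBytes;
            action[5][b] = ThirdByteOfFourBytes;
            action[6][b] = FourthByteOfFourBytes;
        }
    }
};

// Returns true if every byte is accepted and the data ends between sequences.
bool isWellFormed(const std::vector<std::uint8_t>& data) {
    static const Utf8Table table;
    int state = 0;
    for (std::uint8_t byte : data) {
        switch (table.action[state][byte]) {
            case FirstByteOfOneByte:     state = 0; break;
            case FirstByteOfTwoBytes:    state = 1; break;
            case FirstByteOfThreeBytes:  state = 2; break;
            case FirstByteOfFourBytes:   state = 4; break;
            case SecondByteOfThreeBytes: state = 3; break;
            case SecondByteOfFourBytes:  state = 5; break;
            case ThirdByteOfFourBytes:   state = 6; break;
            case SecondByteOfTwoBytes:
            case ThirdByteOfThreeBytes:
            case FourthByteOfFourBytes:  state = 0; break;  // sequence complete
            default:                     return false;      // InvalidByteError
        }
    }
    return state == 0;  // false if the input stops mid-sequence
}
```

Because every continuation byte is accepted over the full 80-BF range, this sketch lets through some sequences that a strict UTF-8 validator rejects: overlong forms after E0 or F0, surrogates after ED, and values above U+10FFFF after F4. Rejecting them would need narrower second-byte ranges for those lead bytes.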

> these different realizations.

These different realizations would not form correct examples of the
design that I specified.

> Yet every one of them would meet the specification of a
> DFA. If your specification starts talking about table layouts in memory, it is no longer
> a specification, but an implementation discussion.
> joe
> ****

From: Joseph M. Newcomer on
See below...
On Sat, 22 May 2010 08:50:17 -0500, Peter Olcott <NoSpam(a)OCR4Screen.com> wrote:

>On 5/22/2010 4:57 AM, Joseph M. Newcomer wrote:
>> See below...
>> On Fri, 21 May 2010 14:37:19 -0500, Peter Olcott<NoSpam(a)OCR4Screen.com> wrote:
>>
>
>>> The way that I do design I start with broad goals that I want to achieve
>>> and end up with nearly correct code as my most detailed level of design.
>>> I progress from the broad goals through very many levels of increasing
>>> specificity using a hierarchy of increasing specificity.
>> ****
>> But I can achieve correct code for a given design that is substantially slower, so I fail
>> to see how a design specifies the performance.
>
>You can't achieve correct code that implements one of my designs that is
>substantially slower because my final design is code. The design that I
>posted did not leave enough leeway to really screw it up unless very
>creative thought was put into screwing it up intentionally.
***
If it is code, it is not design, it is implementation.
joe
***
>
>> ****
>>>
>>> So I am not confusing design with implementation, implementation is the
>>> most detailed level of design within a continuum of increasing
>>> specificity from broad goals to working code.
>>>
>>> Only about 3% of my time is spent on debugging, with another 2% on
>>> testing. The quickest way to complete any very complex system is to slow
>>> down and carefully plan every single step.
>> ****
>> What does that have to do with the issue of non-executable design specifications being
>> "faster" than alternative design specifications?
>>
>> The specification of a lexer is: looking at the current state and the current character,
>> determine a next state. Place the state machine in that state, advance to the next input
>> character, and repeat until an error state or final state is achieved.
>>
>> I fail to see how performance can be specified in this abstract specification. I could
>> use a CMap, virtual methods of classes, or linear table lookup to implement code that met
>> the specification exactly, and there would be a wide range of performance variation in
>
>I specified a switch statement and a state transition table in the design.
>
>I specified using twelve ActionCodes in a switch statement and provided
>the ActionCodes. I also provided eight states to be used in a state
>transition matrix and provided these states and the input values within
>these states and the corresponding actions for each input value.
>
>State 0
> 00-7F ASCII
> C2-DF goto State 1 // Two Byte
> E0-EF goto State 2 // Three Byte
> F0-F4 goto State 4 // Four Byte
> else Error
>State 1
> 80-BF
> else Error
>State 2
> 80-BF goto State 3
> else Error
>State 3
> 80-BF
> else Error
>State 4
> 80-BF goto State 5
> else Error
>State 5
> 80-BF goto State 6
> else Error
>State 6
> 80-BF goto State 7
> else Error
>State 7
> 80-BF
> else Error
>
>// Holds ActionCodes Indexed by NextState and Data[N]
>uint8 States[256][8];
>
>// This is the input data to be transformed
>std::vector<uint8> Data; // Last byte holds sentinel value 11
>
>Twelve ActionCodes
>00 InvalidByteError
>01 FirstByteOfOneByte
>02 FirstByteOfTwoBytes
>03 FirstByteOfThreeBytes
>04 FirstByteOfFourBytes
>05 SecondByteOfTwoBytes
>06 SecondByteOfThreeBytes
>07 SecondByteOfFourBytes
>08 ThirdByteOfThreeBytes
>09 ThirdByteOfFourBytes
>10 FourthByteOfFourBytes
>11 OutOfData (Sentinel)
>
>Within the context of this design there are few correct implementations.
>
>> these different realizations.
>
>These different realizations would not form correct examples of the
>design that I specified.
>
>> Yet every one of them would meet the specification of a
>> DFA. If your specification starts talking about table layouts in memory, it is no longer
>> a specification, but an implementation discussion.
>> joe
>> ****
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm