From: Peter Olcott on
On 5/19/2010 2:24 PM, Leigh Johnston wrote:
>
>
> "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
> news:69ydnUm-AOC_pWnWnZ2dnUVZ_u2dnZ2d(a)giganews.com...
>> On 5/19/2010 1:42 PM, Leigh Johnston wrote:
>>>
>>>
>>> "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
>>> news:P6idnX4azPfvs2nWnZ2dnUVZ_oudnZ2d(a)giganews.com...
>>>>>
>>>>> Whilst what you say is technically correct I try to avoid writing code
>>>>> which does not check against an end iterator when iterating over a
>>>>> sequence, just personal preference (due to a slight concern re
>>>>> safety).
>>>>> We are probably only talking about an extra CPU instruction or two to
>>>>> check for end of sequence in the main loop along with the O(1) check
>>>>> of the final state when the main loop is exited. Your solution would
>>>>> also require making a copy of the input sequence to allow appending of
>>>>> the sentinel unless you consider mutating input parameters to be OK. My
>>>>
>>>> The main purpose of this is to read in a file of UTF-8 to be converted
>>>> to UTF-32. I don't have to mutate the input at all, the user must know
>>>> to append the 0xFF byte.
>>>
>>> Are you for real? That sounds like a really stupid idea.
>>
>> The goal is to make the fastest possible validation of UTF-8 and
>> translation to UTF-32. Within this binding constraint there are few
>> options. Copying the input data is not one of them. What else does
>> that leave? Mutating the Input and then changing it back?
>>
>
> Either you are holding the entire file in memory or performing a
> buffered read, either way you can append the sentinel to the data in
> memory unless you are using memory mapped I/O. The only use-case that
> benefits from having a sentinel is if the input is in memory and you
> have indicated this is not a primary use-case so why bother with a
> sentinel at all? When performing file I/O your algorithm is unlikely to
> be the bottleneck, sentinel or no sentinel. As I would not use a sentinel
> for this I would not have the dilemma of mutating the input that you
> face and it would work for any use-case (input in a file, network or
> memory).
>
> /Leigh

Again you forget the primary purpose of this whole line-of-reasoning.
The goal is to show that it is not possible to construct a faster lexer
than the one based on a state transition matrix.
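
For readers following the sentinel versus end-iterator argument, here is a
minimal sketch of the two loop shapes over one state transition matrix. It is
not code from anyone in this thread. All names (State, Class, kClass, kNext,
valid_checked, valid_sentinel) are hypothetical, and the matrix is
deliberately simplified: it validates UTF-8 structure only, does not reject
overlong 3- and 4-byte forms or surrogates, and does not produce UTF-32
output. The sentinel variant assumes the caller has placed a 0xFF byte at
data[n].

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified DFA: structure check only (no overlong or
// surrogate rejection for 3- and 4-byte sequences).
enum State : std::uint8_t { START, NEED1, NEED2, NEED3, ERR, DONE };
enum Class : std::uint8_t { ASCII, CONT, LEAD2, LEAD3, LEAD4, BAD, SENTINEL };

// Map each byte value to a character class once, up front.
static std::array<std::uint8_t, 256> make_classes() {
    std::array<std::uint8_t, 256> c{};
    for (int b = 0; b < 256; ++b) {
        c[b] = b < 0x80 ? ASCII
             : b < 0xC0 ? CONT
             : b < 0xC2 ? BAD      // 0xC0, 0xC1 only start overlong forms
             : b < 0xE0 ? LEAD2
             : b < 0xF0 ? LEAD3
             : b < 0xF5 ? LEAD4
             : b < 0xFF ? BAD
             : SENTINEL;           // 0xFF never occurs in valid UTF-8
    }
    return c;
}
static const std::array<std::uint8_t, 256> kClass = make_classes();

// State transition matrix. ERR and DONE are terminal, so they have no row.
static const std::uint8_t kNext[4][7] = {
    //           ASCII  CONT   LEAD2  LEAD3  LEAD4  BAD  SENTINEL
    /* START */ {START, ERR,   NEED1, NEED2, NEED3, ERR, DONE},
    /* NEED1 */ {ERR,   START, ERR,   ERR,   ERR,   ERR, ERR},
    /* NEED2 */ {ERR,   NEED1, ERR,   ERR,   ERR,   ERR, ERR},
    /* NEED3 */ {ERR,   NEED2, ERR,   ERR,   ERR,   ERR, ERR},
};

// End-iterator variant: one extra comparison per byte, works on any range.
bool valid_checked(const unsigned char* first, const unsigned char* last) {
    std::uint8_t s = START;
    while (first != last && s < ERR) s = kNext[s][kClass[*first++]];
    return s == START;  // must end on a character boundary
}

// Sentinel variant: no bounds test in the loop. Precondition: data[n] == 0xFF.
// Every non-terminal row sends SENTINEL to a terminal state, so the loop
// never reads past data[n].
bool valid_sentinel(const unsigned char* data, std::size_t n) {
    const unsigned char* p = data;
    std::uint8_t s = START;
    while (s < ERR) s = kNext[s][kClass[*p++]];
    // O(1) check after the loop: a stray 0xFF inside the data would stop the
    // scan early, so confirm we stopped exactly on the appended sentinel.
    return s == DONE && static_cast<std::size_t>(p - data) == n + 1;
}
```

The only difference in the hot loop is the `first != last` comparison. Whether
that comparison is measurable next to file I/O is exactly what is being
disputed above; the sketch does not settle it either way.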
From: Leigh Johnston on


"Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
news:W_ydnU32Ptzn3WnWnZ2dnUVZ_qgAAAAA(a)giganews.com...
>
> Again you forget the primary purpose of this whole line-of-reasoning. The
> goal is to show that it is not possible to construct a faster lexer than
> the one based on a state transition matrix.

This contradicts what you said earlier, i.e.:

>>>> The main purpose of this is to read in a file of UTF-8 to be converted
>>>> to UTF-32. I don't have to mutate the input at all, the user must know
>>>> to append the 0xFF byte.

What is the difference between "primary purpose" and "main purpose"?

I give up, your replies are too troll-like whether intentionally or not.

/Leigh

From: Paul Bibbings on
Peter Olcott <NoSpam(a)OCR4Screen.com> writes:

> Again you forget the primary purpose of this whole
> line-of-reasoning. The goal is to show that it is not possible to
> construct a faster lexer than the one based on a state transition
> matrix.

Will *this* get a response too? *Any* response? Really?
From: Peter Olcott on
On 5/19/2010 2:51 PM, Leigh Johnston wrote:
>
>
> "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote in message
> news:W_ydnU32Ptzn3WnWnZ2dnUVZ_qgAAAAA(a)giganews.com...
>>
>> Again you forget the primary purpose of this whole line-of-reasoning.
>> The goal is to show that it is not possible to construct a faster
>> lexer than the one based on a state transition matrix.
>
> This contradicts what you said earlier, i.e.:
>
>>>>> The main purpose of this is to read in a file of UTF-8 to be converted
>>>>> to UTF-32. I don't have to mutate the input at all, the user must know
>>>>> to append the 0xFF byte.
>
> What is the difference between "primary purpose" and "main purpose"?
>
> I give up, your replies are too troll-like whether intentionally or not.
>
> /Leigh

Me too.
From: Hector Santos on
Paul Bibbings wrote:

> Peter Olcott <NoSpam(a)OCR4Screen.com> writes:
>
>> Again you forget the primary purpose of this whole
>> line-of-reasoning. The goal is to show that it is not possible to
>> construct a faster lexer than the one based on a state transition
>> matrix.
>
> Will *this* get a response too? *Any* response? Really?


Hope not.