From: Leigh Johnston on


"Daniel T." <daniel_t(a)earthlink.net> wrote in message
news:daniel_t-F52648.16462631052010(a)70-3-168-216.pools.spcsdns.net...
> In article <j9OdnTjci7caiJnRnZ2dnUVZ8k-dnZ2d(a)giganews.com>,
> "Leigh Johnston" <leigh(a)i42.co.uk> wrote:
>
>> "Daniel T." <daniel_t(a)earthlink.net> wrote in message
>> news:daniel_t-C210FF.15502731052010(a)70-3-168-216.pools.spcsdns.net...
>> > "Leigh Johnston" <leigh(a)i42.co.uk> wrote:
>> >> "Daniel T." <daniel_t(a)earthlink.net> wrote:
>> >>
>> >> > CodePoint and UTF32[N] are two representations that both refer to
>> >> > the same piece of knowledge. Why the unnecessary duplication?
>> >>
>> >> It is not unnecessary *if* there is a noticeable performance
>> >> improvement. I agree however that premature optimization should be
>> >> avoided (obviously) which is why profiling should be performed
>> >
>> > I'm glad we agree that the code in question is probably an unnecessary
>> > optimization.
>>
>> I didn't say that, it is unclear if the optimization is necessary and
>> whether or not it is can be determined through profiling and/or examining
>> the compiler's assembler output.
>
> Fine, but you do agree that it is an optimization, the only doubt you
> hold here is whether or not it is necessary. Since no tests have been
> presented showing that code without the extra variable needs
> optimizing, this is, by definition, a premature optimization.

No, using a temporary has no downsides (is at worst harmless) and yet could
be beneficial from a performance standpoint which makes it (in my book) a
win-win that doesn't deserve your criticism. I write such code quite a lot
(use a temporary to avoid multiple dereferences), call me biased if you
want.
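
To make the pattern concrete, here is a minimal sketch of what I mean by using a temporary to avoid multiple dereferences. The function name utf8Length is hypothetical (it is not from the code under discussion); only the names UTF32, N and CodePoint come from this thread. Whether it is actually faster than indexing each time is exactly what profiling or the assembler output would have to show.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of UTF-8 bytes needed for the given code points.
// Assumes every element is a valid code point (no validation is done here).
std::size_t utf8Length(const std::vector<std::uint32_t>& UTF32)
{
    std::size_t bytes = 0;
    for (std::size_t N = 0; N < UTF32.size(); ++N) {
        // Read the element once; the comparisons below use the local copy
        // instead of indexing UTF32[N] again each time.
        const std::uint32_t CodePoint = UTF32[N];
        if (CodePoint < 0x80)         bytes += 1;
        else if (CodePoint < 0x800)   bytes += 2;
        else if (CodePoint < 0x10000) bytes += 3;
        else                          bytes += 4;
    }
    return bytes;
}
```

An optimizing compiler may well generate the same code for both forms, which is why I call the temporary at worst harmless.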

/Leigh

From: Öö Tiib on
On 31 May, 23:46, "Daniel T." <danie...(a)earthlink.net> wrote:
> In article <j9OdnTjci7caiJnRnZ2dnUVZ8k-dn...(a)giganews.com>,
>  "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> > "Daniel T." <danie...(a)earthlink.net> wrote in message
> >news:daniel_t-C210FF.15502731052010(a)70-3-168-216.pools.spcsdns.net...
> > > "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> > >> "Daniel T." <danie...(a)earthlink.net> wrote:
> > >> > CodePoint and UTF32[N] are two representations that both refer to
> > >> > the same piece of knowledge. Why the unnecessary duplication?
> > >> It is not unnecessary *if* there is a noticeable performance
> > >> improvement.  I agree however that premature optimization should be
> > >> avoided (obviously) which is why profiling should be performed
> > > I'm glad we agree that the code in question is probably an unnecessary
> > > optimization.
> > I didn't say that, it is unclear if the optimization is necessary and
> > whether or not it is can be determined through profiling and/or examining
> > the compiler's assembler output.
>
> Fine, but you do agree that it is an optimization, the only doubt you
> hold here is whether or not it is necessary. Since no tests have been
> presented showing that code without the extra variable needs
> optimizing, this is, by definition, a premature optimization.

For me the "CodePoint" temporary is easier to read than "UTF32[N]", so in
the context of the original code it is not just an optimization. Maybe that
is because "UTF32[N]" hurts my eyes (it feels like a "macro array"). Using
an iterator might make it unnecessary, but no example that uses an
iterator has been posted. Isn't it fruitless to argue about the effect
of such a temporary variable on the readability or performance of code that
no one has written?
From: Joseph M. Newcomer on
See below...
On Mon, 31 May 2010 20:16:53 +0200, "Giovanni Dicanio"
<giovanniDOTdicanio(a)REMOVEMEgmail.com> wrote:

>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote:
>
>>> UTF8.reserve(UTF32.size() * 4); // worst case
>> ****
>> Note that this will call malloc(), which will involve setting a lock, then
>> searching for a block to allocate, then releasing the lock. Since you have
>> been a fanatic about performance, why is it you put a very expensive
>> operation like 'reserve' in your code?
>>
>> While it is perfectly reasonable, it seems inconsistent with your
>> previously-stated goals.
>
>Joe: I'm not sure if you are ironic or something :) ... but I believe that
>std::vector::reserve() with a proper capacity value, followed by several
>push_back()s, is very efficient.
>Sure, not as efficient as a static stack-allocated array, but very
>efficient.
****
But this code was written by someone who has been beating us nearly insensible about how
critical every single instruction is. So the code shown takes more instructions than
other alternatives, and he's been telling us that alternative implementations that take an
extra instruction or two are unacceptable implementations. So this code is inconsistent
with his previous concerns about performance.

If there is irony here, it is the fact that he violates his own strongly-stated goals
about performance. I could not help but point out the inconsistency.
*****
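
For reference, the property Giovanni is relying on can be sketched as follows: once reserve() has been called, push_back() does not reallocate as long as size() stays within the reserved capacity, so the allocation cost is paid once. The function name pushBackReallocates is hypothetical, and the "* 4" mirrors the worst-case factor in the quoted line.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: returns true if the buffer moved (was reallocated)
// while appending count * 4 bytes after reserving that many up front.
bool pushBackReallocates(std::size_t count)
{
    std::vector<std::uint8_t> UTF8;
    UTF8.reserve(count * 4);                       // single allocation, worst case
    const std::uint8_t* const start = UTF8.data(); // buffer address after reserve
    for (std::size_t i = 0; i < count * 4; ++i)
        UTF8.push_back(0);                         // stays within capacity
    return UTF8.data() != start;
}
```

The standard guarantees no reallocation in this situation, so the function should return false for any count; the one call to the allocator is still there, which is the cost being argued about.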
>
>
>> No, the CORRECT way to write such code is to either throw an exception (if
>> you are in C++, which you clearly are) or return a value indicating the
>> error (for example, in C, an
>
>In this case, I'm for exception.
>Thanks to exception, you could use the precious function return value to
>actually return the resulting buffer (UTF8 string), instead of passing it as
>a reference to the function:
****
I'd probably choose to throw an exception, where the exception information included the
offset into the input vector, a pointer to the input vector so the handler could decide
what to do, etc.
****
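
A minimal sketch of that idea, assuming C++03-compatible code. The exception class InvalidCodePoint and its accessors are hypothetical names; toUTF8 follows the prototype quoted in this thread, and the validity test (surrogates and values above 0x10FFFF are rejected) is an assumption about what counts as an error.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical exception type: records where the conversion failed.
class InvalidCodePoint : public std::runtime_error {
public:
    InvalidCodePoint(const std::vector<std::uint32_t>* input, std::size_t offset)
        : std::runtime_error("invalid UTF-32 code point"),
          input_(input), offset_(offset) {}
    const std::vector<std::uint32_t>* input() const { return input_; }
    std::size_t offset() const { return offset_; }
private:
    const std::vector<std::uint32_t>* input_;  // not owned; valid while the caller's vector lives
    std::size_t offset_;                       // index of the offending element
};

std::vector<std::uint8_t> toUTF8(const std::vector<std::uint32_t>& utf32)
{
    std::vector<std::uint8_t> utf8;
    utf8.reserve(utf32.size() * 4);  // worst case
    for (std::size_t i = 0; i < utf32.size(); ++i) {
        const std::uint32_t cp = utf32[i];
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw InvalidCodePoint(&utf32, i);
        if (cp < 0x80) {
            utf8.push_back(static_cast<std::uint8_t>(cp));
        } else if (cp < 0x800) {
            utf8.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
            utf8.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            utf8.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
            utf8.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
            utf8.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
        } else {
            utf8.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
            utf8.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
            utf8.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
            utf8.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
        }
    }
    return utf8;
}
```

A handler can catch InvalidCodePoint by const reference and use offset() and input() to decide whether to report, skip, or substitute the bad element. The stored pointer is only meaningful while the input vector is still alive in the handler's scope.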
>
> // Updated prototype:
> // - use 'const' correctness for utf32
> // - return resulting utf8
> // - may throw on error
> std::vector<uint8_t> toUTF8(const std::vector<uint32_t> & utf32);
>
>Note that thanks to the move semantics (i.e. the new "&&" thing of C++0x,
>available in VC10 a.k.a. VS2010), you don't pay for extra useless copies in
>returning potentially big objects.
****
Yep. But this did not state it was a 2010-compliant version. Ultimately, there is a
philosophical inconsistency between his strongly-stated concerns about performance over
the last several months, and the actual implementation presented here. Since he loves
picking nits with us, I felt it was only fair to return the favor.

This does not change the fact that the printf is without a doubt a really awful interface.
joe
****
>
>Giovanni
>
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Oliver Regenfelder on
Hello,

Leigh Johnston wrote:
> Also printf sucks, this is a
> C++ newsgroup not a C newsgroup.

This is not even a general C++ newsgroup but an MFC one. So,
strictly speaking, his posting has zero relevance to this
newsgroup.

Best regards,

Oliver
From: Oliver Regenfelder on
Hello Peter,

Peter Olcott wrote:
> const correctness requires the "extra" CodePoint variable.

That is wrong.

And if you were really into 'const correctness', you would
have declared the method itself const too.
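
A small sketch of what I mean, with a hypothetical class Utf32Text and method countAscii (neither is from your code). The method is declared const and reads the elements directly; no extra variable is needed for const correctness.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class holding UTF-32 data.
class Utf32Text {
public:
    explicit Utf32Text(const std::vector<std::uint32_t>& data) : data_(data) {}

    // const member function: callable on a const object and unable to
    // modify data_. Inside it, data_[i] already yields a const reference.
    std::size_t countAscii() const
    {
        std::size_t n = 0;
        for (std::size_t i = 0; i < data_.size(); ++i)
            if (data_[i] < 0x80)
                ++n;
        return n;
    }

private:
    std::vector<std::uint32_t> data_;
};
```

A copy such as "const uint32_t CodePoint = data_[i];" may be nicer to read, but that is a matter of style, not something const correctness demands.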

Best regards,

Oliver