From: Peter Olcott on
On 6/16/2010 12:44 PM, jwatts wrote:
> On Jun 15, 3:01 pm, Peter Olcott<NoS...(a)OCR4Screen.com> wrote:
>> On 6/15/2010 4:17 AM, Asher Langton wrote:
>>> On Jun 13, 1:43 pm, Peter Olcott<NoS...(a)OCR4Screen.com> wrote:
>>>> (1) The ONLY reason not to always inline everything (that can be
>>>> in-lined) all the time is code bloat.
>>>> (2) If a function is only called once then there can not possibly be any
>>>> code bloat at all.
>>>> (3) Therefore all functions that are only called once should always be
>>>> in-lined.
>>
>>> Consider this snippet:
>>
>>> while (...)
>>> {
>>> ...
>>> if (condition)
>>> doSomething();
>>> }
>>
>>> If doSomething() is inlined, then the body of the loop might no longer
>>> fit in the instruction cache, which -- in the case where 'condition'
>>
>> If the body of the loop without the function call overhead does not fit
>> in cache, then the body of the loop with the function call overhead will
>> also not fit in cache because it requires even more memory. Adding
>> function call overhead can not reduce memory requirements.
>>
>
> Once you have entered the loop, the cost in time of the function call
> and any memory requirements will have been paid. Since your entire
> program will not fit in the cache, you will encounter cache flushes in
> the vicinity of the inlined code anyway. The question devolves to "is
> the cost of the function call significant?" I would tend to say no
> since normally the cost of the loop will dwarf that of the function
> call.
>
> As a general rule, I recommend that my developers avoid the temptation
> of 'premature optimization'. First, just make it work. Then, if
> performance is not acceptable, use a profiler to find the problem
> areas. First try simple changes in algorithms, and if that's not
> sufficient, perhaps structural or architectural changes are necessary.

I have heard this heuristic over and over. What if the goal is not to
make a process take an acceptable amount of time? What is the original
goal is to make a process take a minimal amount of time?

Within this goal, I try to make my designs hit the ballpark of optimal
speed initially. This may take 50% more development time than just
getting the process working, but, I estimate that it takes much less
development time than the repeated iterations of redesign required to
achieve the ball park of the fastest possible design using approach
mentioned above.

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Leigh Johnston on
"Joshua Maurice" <joshuamaurice(a)gmail.com> wrote in message news:4982319c-628f-4319-97d9-f54d88860027(a)x27g2000yqb.googlegroups.com...
> On Jun 16, 10:46 am, jwatts <jwatt...(a)gmail.com> wrote:
>> 3. In method toUTF32(), is it possible to produce a reasonable
>> estimate of the amount of space required for the result vector UTF32?
>> Each time a call to push_back() results in reallocation of a vector,
>> you will pay a runtime cost, especially once the vectors reach
>> significant size;
>
> No. As I'm sure many other people will reply, ::std::vector::push_back
> is specifically guaranteed to have amortized O(1) runtime. It
> accomplishes this by not increasing the capacity by just 1 when it
> runs out of capacity, but instead by increasing the capacity by a
> multiple when it runs out of capacity. This is generally simplified in
> discussions to "doubling the capacity each time push_back runs out of
> capacity". However, I have seen reports from good people that doubling
> isn't ideal from real world measurement. 2 isn't special; any constant
> multiplier greater than 1 will give amortized O(1) runtime.

"Amortized O(1)" is not the same as "O(1)". Calling reserve() can sometimes
be beneficial for certain inputs and/or if copies are expensive.

/Leigh

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Seungbeom Kim on
On 2010-06-16 10:46, jwatts wrote:
>
> 4. Be careful when assigning values of small types (i.e. uint8_t) to
> variables of a larger type (i.e. uint32_t). e.g. "CodePoint = Byte;"
> Should you ever move the code to a different compiler, the sign-
> extension behavior may change, causing unexpected, and likely
> undesired, behaviors;

You have nothing to lose by being careful, but converting between
different unsigned types are completely defined (as modulo arithmetic).
Especially if you're converting from a smaller unsigned type to a
larger one, the value is always unchanged.

When you're converting into a signed type, you should take more care
to see if the value can be represented in the destination type; because
otherwise, the result is implementation-defined.

> 5. Instead of using a for() loop to traverse the input vector, prefer
> using iterators;

Don't you use for() loops with iterators as well? :)

They belong to different types; you could suggest using standard
algorithms (in <algorithm>) instead of for(), or iterators instead
of pointers or indices, but the two problems lie in separate axes.
(They are not completely independent, though, since you cannot use
standard algorithms with indices...)

--
Seungbeom Kim

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Ulrich Eckhardt on
jwatts wrote:
> 4. Be careful when assigning values of small types (i.e. uint8_t) to
> variables of a larger type (i.e. uint32_t). e.g. "CodePoint = Byte;"
> Should you ever move the code to a different compiler, the sign-
> extension behavior may change, causing unexpected, and likely
> undesired, behaviors;

Are you sure? I mean all those types are unsigned, so "sign" extension by
definition can't apply. Actually, portability is a good reason to use
unsigned types here!

Uli

--
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: ThosRTanner on
On Jun 13, 9:43 pm, Peter Olcott <NoS...(a)OCR4Screen.com> wrote:
> On 6/13/2010 5:06 AM, Keith H Duggar wrote:
>
> > On Jun 12, 2:12 am, Peter Olcott<NoS...(a)OCR4Screen.com> wrote:
> >> Inlining is always better for functions that are known
> >> to only need to be called from a single place.
>
> > In those absolute terms that statement is flat wrong. Whether to
> > inline depends intimately on the relative execution frequency and
> > size of the code in question versus code in the calling chain. It
> > has to do with optimizing instruction cache performance and other
> > factors. See the "Problems" section in the wikipedia
>
> (1) The ONLY reason not to always inline everything (that can be
> in-lined) all the time is code bloat.

That doesn't entirely cover things. The problem with code bloat is it
causes cache misses to occur more often.

As an experiment, we out-of-lined a whole load of (non-trivial)
inlined code and did some performance analysis. Contrary to the
expectation of the "function calls are bad" clique, the system went
faster. It also means our builds are more stable, as the client code
doesn't get rebuilt when the implementation changes.

> (2) If a function is only called once then there can not possibly be any
> code bloat at all.

There can however be adverse effects on performance due to cache
misses

> (3) Therefore all functions that are only called once should always be
> in-lined.

Which given the above is clearly not true.


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]