From: Keith H Duggar on
On Jun 17, 3:25 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> "Joshua Maurice" <joshuamaur...(a)gmail.com> wrote in message
news:4982319c-628f-4319-97d9-f54d88860027(a)x27g2000yqb.googlegroups.com...
> > On Jun 16, 10:46 am, jwatts <jwatt...(a)gmail.com> wrote:
> >> 3. In method toUTF32(), is it possible to produce a reasonable
> >> estimate of the amount of space required for the result vector UTF32?
> >> Each time a call to push_back() results in reallocation of a vector,
> >> you will pay a runtime cost, especially once the vectors reach
> >> significant size;
>
> > No. As I'm sure many other people will reply, ::std::vector::push_back
> > is specifically guaranteed to have amortized O(1) runtime. It
> > accomplishes this by not increasing the capacity by just 1 when it
> > runs out of capacity, but instead by increasing the capacity by a
> > multiple when it runs out of capacity. This is generally simplified in
> > discussions to "doubling the capacity each time push_back runs out of
> > capacity". However, I have seen reports from good people that doubling
> > isn't ideal from real world measurement. 2 isn't special; any constant
> > multiplier greater than 1 will give amortized O(1) runtime.
>
> "Amortized O(1)" is not the same as "O(1)". Calling reserve() can sometimes
> be beneficial for certain inputs and/or if copies are expensive.

And even more basic than that, "O(1)" is not the same as "fast".
There are constants hidden by O-notation, and one often finds
that for the sizes likely to occur in production, an algorithm of
higher complexity order is faster due to its simplicity and
smaller constants.

http://en.wikipedia.org/wiki/Big_O_notation

A classic example is the unordered (hash) map vs. the ordered map.
James Kanze has done some very nice empirical comparisons between
various map implementations showing these behaviors, but I can't
seem to find his work at the moment.

KHD


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: jwatts on
On Jun 16, 5:02 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> On Jun 16, 10:46 am, jwatts <jwatt...(a)gmail.com> wrote:
>
> > 3. In method toUTF32(), is it possible to produce a reasonable
> > estimate of the amount of space required for the result vector UTF32?
> > Each time a call to push_back() results in reallocation of a vector,
> > you will pay a runtime cost, especially once the vectors reach
> > significant size;
>
> No. As I'm sure many other people will reply, ::std::vector::push_back
> is specifically guaranteed to have amortized O(1) runtime. It
> accomplishes this by not increasing the capacity by just 1 when it
> runs out of capacity, but instead by increasing the capacity by a
> multiple when it runs out of capacity. This is generally simplified in
> discussions to "doubling the capacity each time push_back runs out of
> capacity". However, I have seen reports from good people that doubling
> isn't ideal from real world measurement. 2 isn't special; any constant
> multiplier greater than 1 will give amortized O(1) runtime.
>
> --

vector::push_back() only has O(1) behavior when there is space
available for the new element. If the vector needs to expand to
accommodate a new element, then a new buffer has to be allocated, the
existing data copied to the new buffer, and the old buffer freed.
Even for simple discrete types (e.g. integers) this can become time
consuming for large vectors if it occurs frequently enough. In the
case of complex objects whose copy constructors or assignment
operators do 'real' work, such as allocating memory and copying
their contents from one object to the next, expanding a vector<>
can be quite time consuming.

If you know a priori approximately how many elements you will need,
the whole issue can be avoided by reserve()'ing that much space to
start with, thus avoiding the re-allocation and copy operations.
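For the toUTF32() case that started this subthread there actually is a
reasonable estimate: with UTF-8 input, every code point consumes at
least one input byte, so the input size is a safe upper bound for the
output size. The original toUTF32() is not shown in this thread, so
the following is only a sketch of that idea; it assumes well-formed
UTF-8 and does no validation:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: the toUTF32() under discussion is not reproduced in
// this thread.  Assumes well-formed UTF-8 input (no validation).
// Every code point consumes at least one input byte, so
// utf8.size() is a safe upper bound for the output size; a single
// reserve() call removes all reallocation from the push_back loop.
std::vector<std::uint32_t> toUTF32(const std::vector<std::uint8_t>& utf8) {
    std::vector<std::uint32_t> utf32;
    utf32.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size(); ) {
        std::uint32_t cp;
        std::size_t extra;
        std::uint8_t b = utf8[i];
        if      (b < 0x80) { cp = b;        extra = 0; }  // 1-byte sequence
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }  // 2-byte sequence
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }  // 3-byte sequence
        else               { cp = b & 0x07; extra = 3; }  // 4-byte sequence
        for (std::size_t j = 1; j <= extra; ++j)
            cp = (cp << 6) | (utf8[i + j] & 0x3F);  // append 6 payload bits
        utf32.push_back(cp);
        i += extra + 1;
    }
    return utf32;
}
```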



From: jwatts on
On Jun 16, 5:01 pm, Peter Olcott <NoS...(a)OCR4Screen.com> wrote:
> On 6/16/2010 12:44 PM, jwatts wrote:
>
>
>
> > On Jun 15, 3:01 pm, Peter Olcott<NoS...(a)OCR4Screen.com> wrote:
> >> On 6/15/2010 4:17 AM, Asher Langton wrote:
> >>> On Jun 13, 1:43 pm, Peter Olcott<NoS...(a)OCR4Screen.com> wrote:
> >>>> (1) The ONLY reason not to always inline everything (that can be
> >>>> in-lined) all the time is code bloat.
> >>>> (2) If a function is only called once then there can not possibly be any
> >>>> code bloat at all.
> >>>> (3) Therefore all functions that are only called once should always be
> >>>> in-lined.
>
> >>> Consider this snippet:
>
> >>> while (...)
> >>> {
> >>> ...
> >>> if (condition)
> >>> doSomething();
> >>> }
>
> >>> If doSomething() is inlined, then the body of the loop might no longer
> >>> fit in the instruction cache, which -- in the case where 'condition'
>
> >> If the body of the loop without the function call overhead does not fit
> >> in cache, then the body of the loop with the function call overhead will
> >> also not fit in cache because it requires even more memory. Adding
> >> function call overhead can not reduce memory requirements.
>
> > Once you have entered the loop, the cost in time of the function call
> > and any memory requirements will have been paid. Since your entire
> > program will not fit in the cache, you will encounter cache flushes in
> > the vicinity of the inlined code anyway. The question devolves to "is
> > the cost of the function call significant?" I would tend to say no
> > since normally the cost of the loop will dwarf that of the function
> > call.
>
> > As a general rule, I recommend that my developers avoid the temptation
> > of 'premature optimization'. First, just make it work. Then, if
> > performance is not acceptable, use a profiler to find the problem
> > areas. First try simple changes in algorithms, and if that's not
> > sufficient, perhaps structural or architectural changes are necessary.
>
> I have heard this heuristic over and over. What if the goal is not to
> make a process take an acceptable amount of time? What if the original
> goal is to make a process take a minimal amount of time?
>
> Within this goal, I try to make my designs hit the ballpark of optimal
> speed initially. This may take 50% more development time than just
> getting the process working, but I estimate that it takes much less
> development time than the repeated iterations of redesign required to
> achieve the ballpark of the fastest possible design using the approach
> mentioned above.
>
> --

For your environment I'm sure your approach is the correct one. In my
environment, for better or worse, "good enough" really is good
enough. We're under considerable pressure to finish each application
and move on to the next. However, the more senior developers who
work on shared classes, interfaces, libraries, etc. are afforded the
relative luxury of iterating on designs and implementations, since
their output affects the performance of all applications. Even
then we're sometimes surprised when someone discovers unanticipated
uses for our designs.



From: jwatts on
On Jun 17, 7:09 am, Seungbeom Kim <musip...(a)bawi.org> wrote:
> On 2010-06-16 10:46, jwatts wrote:
>
>
>
> > 4. Be careful when assigning values of small types (i.e. uint8_t) to
> > variables of a larger type (i.e. uint32_t). e.g. "CodePoint = Byte;"
> > Should you ever move the code to a different compiler, the sign-
> > extension behavior may change, causing unexpected, and likely
> > undesired, behaviors;
>
> You have nothing to lose by being careful, but converting between
> different unsigned types are completely defined (as modulo arithmetic).
> Especially if you're converting from a smaller unsigned type to a
> larger one, the value is always unchanged.
>
> When you're converting into a signed type, you should take more care
> to see if the value can be represented in the destination type; because
> otherwise, the result is implementation-defined.
>
> > 5. Instead of using a for() loop to traverse the input vector, prefer
> > using iterators;
>
> Don't you use for() loops with iterators as well? :)
>
> They belong to different types; you could suggest using standard
> algorithms (in <algorithm>) instead of for(), or iterators instead
> of pointers or indices, but the two problems lie in separate axes.
> (They are not completely independent, though, since you cannot use
> standard algorithms with indices...)
>
> --

You're correct, thanks for pointing it out. I did an especially poor
job with that one. What I meant to convey was that when traversing
contiguous sequences (e.g. vector<>s or arrays), it's usually faster
to use iterators or pointers, as appropriate, than indexes.
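To make the suggestion concrete, here is the same traversal in both
styles (illustrative only; on older compilers the iterator form could
save recomputing base + i on each step, though optimizers have been
closing that gap):

```cpp
#include <cstddef>
#include <vector>

// The same traversal of a contiguous sequence, once with an index
// and once with an iterator.  Both visit every element in order.
long sumByIndex(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}

long sumByIterator(const std::vector<int>& v) {
    long total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}
```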




From: Peter Olcott on
On 6/18/2010 10:15 AM, jwatts wrote:
> On Jun 17, 7:09 am, Seungbeom Kim<musip...(a)bawi.org> wrote:
>> On 2010-06-16 10:46, jwatts wrote:
>>
>>
>>
>>> 4. Be careful when assigning values of small types (i.e. uint8_t) to
>>> variables of a larger type (i.e. uint32_t). e.g. "CodePoint = Byte;"
>>> Should you ever move the code to a different compiler, the sign-
>>> extension behavior may change, causing unexpected, and likely
>>> undesired, behaviors;
>>
>> You have nothing to lose by being careful, but converting between
>> different unsigned types are completely defined (as modulo arithmetic).
>> Especially if you're converting from a smaller unsigned type to a
>> larger one, the value is always unchanged.
>>
>> When you're converting into a signed type, you should take more care
>> to see if the value can be represented in the destination type; because
>> otherwise, the result is implementation-defined.
>>
>>> 5. Instead of using a for() loop to traverse the input vector, prefer
>>> using iterators;
>>
>> Don't you use for() loops with iterators as well? :)
>>
>> They belong to different types; you could suggest using standard
>> algorithms (in<algorithm>) instead of for(), or iterators instead
>> of pointers or indices, but the two problems lie in separate axes.
>> (They are not completely independent, though, since you cannot use
>> standard algorithms with indices...)
>>
>> --
>
> You're correct, thanks for pointing it out. I did an especially poor
> job with that one. What I meant to convey was that when traversing
> contiguous sequences (e.g. vector<>s or arrays), it's usually faster
> to use iterators or pointers, as appropriate, than indexes.
>
>
>
Although this has historically been the case, I have been told that it
is no longer so: compilers are now smart enough to see the equivalence
and thereby generate equivalent code. It is my opinion, at least for
my own code, that indices are more readable than pointers or
iterators. I only use iterators for things such as std::list, where I
have no choice, and I never use std::list because std::vector tends to
be faster. There may be cases where this is not true.

