From: LR on
Adam Badura wrote:
>> Perhaps this http://www.stepanovpapers.com/notes.pdf will provide an
>> explanation.
>
> This is over 200 pages. Are you expecting us to read it all and
> then reply whether it provided any explanation or not?

No. The document has a table of contents.


> I quickly browsed through parts about comparing but found nothing
> about this topic. And I do not have (sadly) that much time to read it
> all (now).

I quickly browsed also and found parts that I think are relevant,
although I don't think there is An Explanation in this text. I think
that might have to be extracted from reading the text and taking its
implications. Unfortunately this process will be subjective, so YMWV.

Anyway, sorry, but I don't want to paraphrase what he wrote, so this is
the best I can do, here are the parts that I think may be relevant. I
hope you find this useful.

Start on page 20 in Lecture 2 with the discussion of the == operator for
cvector_int. Continue with the justification of why == and != should be
defined together in the start of Lecture 3. Continue with Lecture 3 on
pages 26 and 27 where using operator< to implement the other operators
is justified, although I think the choice between < and > might be
somewhat arbitrary, I'm not sure. There seems to be a minor
contradiction on these and the following pages regarding comparisons of
built in types with problems associated with NaN earlier in the text. I
think the discussion starting on page 29 regarding swap isn't relevant
to the question at hand. The class at the end of Lecture 5 might be
interesting to read. Lecture 6 page 52 contains some words about the
illusion of freedom (excuse me for interjecting a personal note, I
partially agree with this, but also disagree with it in parts) that I
think is relevant to the question of comparisons and what kind of code
we as programmers may want to write. I think the paragraph after that
about what Mr. Stepanov considers the worst mistake of his technical
career is probably relevant as well. Lecture 7 contains more
justification at the beginning and some more technical details.

I haven't read through the rest of the document, so there may be more.


But as I said, none of this is The Explanation, so I hope someone else
will do a better search than I was able to do and find The Answer.

HTH

LR

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Seungbeom Kim on
Anders Dalvander wrote:
>
> std::map::find is usually built on-top of std::map::lower_bound. To
> give a correct result std::map::find need one extra call to "less"
> than std::map::lower_bound. std::map::lower_bound, which only need to
> utilize "less", wouldn't gain anything from calling "compare" instead
> of "less", as "less" is the only thing needed and is only called once
> for every level in the map.
>
> Given a map of 1,000,000 elements, that map would have the height of
> 20. std::map::lower_bound would need to call "less" 20 times, and
> std::map::find would need to call "less" 21 times.
>
> For the same sized compmap, utilizing a "compare" function,
> compmap::lower_bound would need to call "less" 20 times, and
> compmap::find would need to call "less" 20 times. The complexity of
> compmap::find would on the other hand be greater as it cannot fully
> rely on compmap::lower_bound.
>
> std::map::find would call "less" at most 5% more times than
> compmap::find would call "compare", if the map would contain 1,000,000
> elements.

It's not only the number of calls to "less" on the top level that
matters. Consider std::map<T> and compmap<T>, where T is std::pair<A, B>.
Each call to std::less<T> eventually leads to something like this:

// taken from GNU libstdc++
bool operator<(const pair<A, B>& x, const pair<A, B>& y)
{
return x.first < y.first
|| (!(y.first < x.first) && x.second < y.second);
}

Therefore, if x.first < y.first evaluates to false, whose probability
would be 50% under a uniform distribution assumption, the above function
should also evaluate y.first < x.first just to check whether the two
operands are equal and the second fields should be considered. We could
have checked it in the first place when we compared x.first and y.first.
This is what I call inefficiency.

On the other hand, if we used "compare", each call to compare<T> would
eventually lead to something like this:

int compare(const pair<A, B>& x, const pair<A, B>& y)
{
if (int r = compare(x.first, y.first)) return r;
return compare(x.second, y.second);
}

and we would never have to compare x.first and y.first twice.

I used std::pair just as an easy example, but the same phenomenon can
happen in any structure where the lexicographical ordering ("compare
field A first, and field B second, ...") is needed.

bool operator<(const pair<A, B, C>& x, const pair<A, B, C>& y)
{
return x.first < y.first
|| (!(y.first < x.first)
&& (x.second < y.second
|| (!(y.second < x.second)
&& x.third < y.third)));
}

// I'm not even sure if I got that right...
// This would be easier to write and understand:

bool operator<(const pair<A, B, C>& x, const pair<A, B, C>& y)
{
if (x.first < y.first) return true;
if (y.first < x.first) return false;
if (x.second < y.second) return true;
if (y.second < x.second) return false;
return x.third < y.third;
}

// This is intuitive and efficient!
int compare(const pair<A, B, C>& x, const pair<A, B, C>& y)
{
if (int r = compare(x.first, y.first)) return r;
if (int r = compare(x.second, y.second)) return r;
return compare(x.third, y.third);
}

And the inefficiency could get higher with a deeper level of nesting.

--
Seungbeom Kim

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Adam Badura on
> I'm not referring to the code, I'm referring to the model and concept.
> "less" models Strict Weak Ordering, a well known concept in
> programming as well as in other areas, see for instance
> http://en.wikipedia.org/wiki/Strict_weak_ordering or
> http://www.sgi.com/tech/stl/StrictWeakOrdering.html for description.
>
> SWO has four well defined properties: irreflexivity, asymmetry,
> transitivity and transitivity of equivalence. Which properties
> "compare" would have I don't know, I feel that it would have more
> complex properties than the ones for "less". This is what I meant with
> my statement that "less" is a much simpler concept than "compare".

No. "compare" functor/function can model SWO or total ordering or
anything else like "less". It just computes in one shot results of all
three conditions:
1) A < B,
2) !(A < B) && !(B < A)
3) B < A
as it seems from practise that answering all these questions requires
usually (I don't know of any counter example) exactly the same work as
answering on of them.

> std::map::find is usually built on-top of std::map::lower_bound. To
> give a correct result std::map::find need one extra call to "less"
> than std::map::lower_bound. std::map::lower_bound, which only need to
> utilize "less", wouldn't gain anything from calling "compare" instead
> of "less", as "less" is the only thing needed and is only called once
> for every level in the map.

How about more complex data structures which are not present in
STD?


I do not have any hard data on performance gain of any of the
algorithms (either in STD or others).
However I am sure that using "compare" would not cause any
performance problems. Neither would it cause any implementation
problems. And it would allow to ask more direct questions. If you want
to compare two objects you just do it.
After all why did C had "strcmp" function (or something like that,
I'm not sure now) and not "strless"?

Adam Badura

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Adam Badura on
> It's not only the number of calls to "less" on the top level that
> matters. Consider std::map<T> and compmap<T>, where T is std::pair<A, B>.
> Each call to std::less<T> eventually leads to something like this:
>
> // taken from GNU libstdc++
> bool operator<(const pair<A, B>& x, const pair<A, B>& y)
> {
> return x.first < y.first
> || (!(y.first < x.first) && x.second < y.second);
> }

Actually this can be done even now since you are the one who
writes the operator.

But since "compare" is not well established in standard library it
is more difficult as in general you will end up in writing your own
"compare" for every type including those standard ones and from 3-rd
party libraries.
If "compare" was present in the standard library, just like "swap"
is, the it would be much easier to use it.

Adam Badura

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Le Chaud Lapin on
On Jan 18, 8:43 am, Seungbeom Kim <musip...(a)bawi.org> wrote:
[snipped]
> I used std::pair just as an easy example, but the same phenomenon can
> happen in any structure where the lexicographical ordering ("compare
> field A first, and field B second, ...") is needed.

Yep, such as gene sequence sorting, etc.

> bool operator<(const pair<A, B, C>& x, const pair<A, B, C>& y)
> {
> return x.first < y.first
> || (!(y.first < x.first)
> && (x.second < y.second
> || (!(y.second < x.second)
> && x.third < y.third)));
> }
>
> // I'm not even sure if I got that right...
> // This would be easier to write and understand:
>
> bool operator<(const pair<A, B, C>& x, const pair<A, B, C>& y)
> {
> if (x.first < y.first) return true;
> if (y.first < x.first) return false;
> if (x.second < y.second) return true;
> if (y.second < x.second) return false;
> return x.third < y.third;
> }
>
> // This is intuitive and efficient!
> int compare(const pair<A, B, C>& x, const pair<A, B, C>& y)
> {
> if (int r = compare(x.first, y.first)) return r;
> if (int r = compare(x.second, y.second)) return r;
> return compare(x.third, y.third);
> }
>
> And the inefficiency could get higher with a deeper level of nesting.

The form of the code above is pratically identical to that of the
actual code below for comparing bytes of Ethernet address:

---- START CODE ----

friend int signum (const Bytes &a, const Bytes &b)
{
if (int s = ::signum(a.buffer[5], b.buffer[5])) return s;
if (int s = ::signum(a.buffer[4], b.buffer[4])) return s;
if (int s = ::signum(a.buffer[3], b.buffer[3])) return s;
if (int s = ::signum(a.buffer[2], b.buffer[2])) return s;
if (int s = ::signum(a.buffer[1], b.buffer[1])) return s;

return ::signum(a.buffer[0], b.buffer[0]);
}

bool operator == (const Bytes &that) const {return signum (*this,
that) == 0;}
bool operator != (const Bytes &that) const {return signum (*this,
that) != 0;}
bool operator > (const Bytes &that) const {return signum (*this,
that) > 0;}
bool operator < (const Bytes &that) const {return signum (*this,
that) < 0;}
bool operator >= (const Bytes &that) const {return signum (*this,
that) >= 0;}
bool operator <= (const Bytes &that) const {return signum (*this,
that) <= 0;}

---- END CODE ----

As you might imagine, this method becomes so comfortable/regular that
after two or three iterations, it requires no forethought.

Whenever I have create a new class whose values can be ordered,
whether I intend to order them or not, immediately, I:

1. Go to one of my existing order-able classes.
2. Cut and paste the code like that above.
3. Rename old class name to new class name within the code.
4. Define signum for the new class.
5. Tuck it away for later, just in case.

Of course, existing STL code cannot benefit from "compare/signum", so
one would have to define their own containers to take advantage of
this method.

-Le Chaud Lapin-


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]