From: Patrick Maupin on
On Mar 29, 10:29 pm, Steven D'Aprano
<ste...(a)REMOVE.THIS.cybersource.com.au> wrote:
> On Mon, 29 Mar 2010 19:24:42 -0700, Patrick Maupin wrote:
> > On Mar 29, 6:19 pm, Steven D'Aprano <st...(a)REMOVE-THIS-
> > cybersource.com.au> wrote:
> >> How does the existence of math.fsum contradict the existence of sum?
>
> > You're exceptionally good at (probably deliberately) mis-interpreting
> > what people write.
>
> I cannot read your mind, I can only interpret the words you choose to
> write. You said
>
> [quote]
> See, I think the very existence of math.fsum() already violates "there
> should be one obvious way to do it."
> [end quote]
>
> If sum satisfies the existence of one obvious way, how does math.fsum
> violate it? sum exists, and is obvious, regardless of whatever other
> solutions exist as well.

Because sum() is the obvious way to sum floats; now the existence of
math.fsum() means there are TWO obvious ways to sum floats. Is that
really that hard to understand? How can you misconstrue this so badly
that you write something that can be (easily) interpreted to mean that
you think that I think that once math.fsum() exists, sum() doesn't
even exist any more????
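
For readers following along, here is a minimal sketch of the two spellings being argued about. The sample values are chosen only for illustration and do not come from the thread. Depending on the interpreter version, the built-in result may or may not differ from the math.fsum result in the last digit, so no output is claimed for it.

```python
import math

values = [0.1] * 10

# Built-in sum: the general-purpose spelling for adding numbers.
naive_total = sum(values)

# math.fsum: tracks intermediate partial sums to limit rounding error.
accurate_total = math.fsum(values)

print(naive_total)
print(accurate_total)  # 1.0
```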
From: Steve Howell on
On Mar 29, 8:01 pm, Steven D'Aprano
<ste...(a)REMOVE.THIS.cybersource.com.au> wrote:
> You don't define symmetry. You don't even give a sensible example of
> symmetry. Consequently I reject your argument that because sum is the
> obvious way to sum a lot of integers, "symmetry" implies that it should
> be the obvious way to concatenate a lot of lists.
>

You are not rejecting my argument; you are rejecting an improper
paraphrase of my argument.

My argument was that repeated use of "+" is spelled "sum" for
integers, so it's natural to expect the same name for repeated use of
"+" on lists. Python already allows for this symmetry, just SLOWLY.

>
> You are correct that building intermediate lists isn't *compulsory*,
> there are alternatives, but the alternatives themselves have costs.
> Complexity itself is a cost. sum currently has nice simple semantics,
> which means you can reason about it: sum(sequence, start) is the same as
>
> total = start
> for item in sequence:
>     total = total + item
> return total
>

I could just as reasonably expect these semantics:

total = start
for item in sequence:
    total += item
return total

Python does not contradict my expectations here:

>>> start = []
>>> x = sum([], start)
>>> x.append(1)
>>> start
[1]

> You don't have to care what the items in sequence are, you don't have to
> make assumptions about what methods sequence and start have (beyond
> supporting iteration and addition).

The only additional assumption I'm making is that Python can take
advantage of in-place addition, which is easy to introspect.
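
A rough sketch of what such introspection could look like, assuming a hypothetical helper named inplace_sum (not a real built-in, and not a proposed patch). Unlike the semantics shown earlier in this message, this version copies the start value so the caller's object is left alone; that is a design choice of the sketch only.

```python
import copy


def supports_inplace_add(obj):
    """Return True if obj's type defines __iadd__ (hypothetical helper)."""
    return hasattr(type(obj), "__iadd__")


def inplace_sum(sequence, start):
    """Hypothetical sum variant that uses += when start supports it."""
    if supports_inplace_add(start):
        # Shallow copy so the caller's start object is not mutated.
        total = copy.copy(start)
        for item in sequence:
            total += item
        return total
    # No in-place addition available: defer to the built-in.
    return sum(sequence, start)


print(inplace_sum([[1], [2, 3]], []))  # [1, 2, 3]
print(inplace_sum([1, 2, 3], 0))       # 6
```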

> Adding special cases to sum means it
> becomes more complex and harder to reason about. If you pass some other
> sequence type in the middle of a bunch of lists, what will happen? Will
> sum suddenly break, or perhaps continue to work but inefficiently?

This is mostly a red herring, as I would tend to use sum() on
sequences of homogeneous types.

Python already gives me the power to shoot myself in the foot for
strings.

>>> lst = [1, 2]
>>> lst += "foo"
>>> lst
[1, 2, 'f', 'o', 'o']

>>> lst = [1,2]
>>> lst.extend('foo')
>>> lst
[1, 2, 'f', 'o', 'o']

I'd prefer to get an exception for cases where += would do the same.

>>> start = []
>>> bogus_example = [[1, 2], None, [3]]
>>> for item in bogus_example: start += item
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not iterable



> You still need to ask these questions with existing sum, but it is
> comparatively easy to answer them: you only need to consider how the
> alternative behaves when added to a list. You don't have to think about
> the technicalities of the sum algorithm itself -- sometimes it calls +,
> sometimes extend, sometimes +=, sometimes something else

I would expect sum() to support the same contract as +=, which already
works for numerics (so no backward incompatibility), and which already
works for lists. For custom-designed classes, I would rely on the
promise that augmented assignment falls back to normal methods.
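
To illustrate that fallback, here is a small sketch with a made-up class (Money is hypothetical, used only for this example) that defines __add__ but not __iadd__:

```python
class Money:
    """Hypothetical class that defines __add__ but not __iadd__."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


total = Money(0)
for item in [Money(150), Money(250)]:
    # No __iadd__ is defined, so this behaves like total = total + item.
    total += item

print(total.cents)  # 400
```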

> ... which of the
> various different optimized branches will I fall into this time? Who
> knows? sum already has two branches. In my opinion, three branches is one
> too many.

As long as it falls into the branch that works, I'm happy. :)

>
> "Aggregating" lists? Not summing them? I think you've just undercut your
> argument that sum is the "obvious" way of concatenating lists.
>
> In natural language, we don't talk about "summing" lists, we talk about
> joining, concatenating or aggregating them. You have just done it
> yourself, and made my point for me.

Nor do you use "chain" or "extend."

> And this very thread started because
> somebody wanted to know what the equivalent to sum is for sequences.
>
> If sum was the obvious way to concatenate sequences, this thread wouldn't
> even exist.

This thread is entitled "sum for sequences." I think you just made my
point.

From: Steven D'Aprano on
On Mon, 29 Mar 2010 19:31:44 -0700, Patrick Maupin wrote:

> It's about a lack of surprises. Which, 99% of the time, Python excels
> at. This is why many of us program in Python. This is why some of us
> who would never use sum() on lists, EVEN IF IT WERE FIXED TO NOT BE SO
> OBNOXIOUSLY SLOW, advocate that it, in fact, be fixed to not be so
> obnoxiously slow.

As I said, patches are welcome. Personally, I expect that it would be
rejected, but that's not my decision to make, and who knows, perhaps I'm
wrong and you'll have some of the Python-Dev people support your idea.

sum is not designed to work with lists. It happens to work because lists
happen to use + for concatenation, and because it is too much trouble for
too little benefit to explicitly exclude lists in the same way sum
explicitly excludes strings. In the Python philosophy, simplicity of
implementation is a virtue: the code that is not there contributes
exactly no bugs and has precisely no overhead.
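
A short sketch of the behaviour described above, with sample data invented for the illustration: lists pass through sum because they support +, while strings are turned away with a TypeError.

```python
lists = [[1, 2], [3], [4, 5, 6]]

# Lists support "+", so sum accepts them when given a list as the start value.
flattened = sum(lists, [])
print(flattened)  # [1, 2, 3, 4, 5, 6]

# Strings also support "+", but sum explicitly refuses them.
try:
    sum(["a", "b"], "")
    strings_rejected = False
except TypeError:
    strings_rejected = True

print(strings_rejected)  # True
```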

sum has existed as a Python built-in for many years -- by memory, since
Python 2.3, which was nearly seven years ago. Unlike the serious gotcha of
repeated string concatenation:


# DO NOT DO THIS
result = ""
for s in items:
    result += s


which *does* cause real problems in real code, I don't believe that there
have been any significant problems caused by summing lists of lists. As
problems go, it is such a minor one that it isn't worth this discussion,
let alone fixing it. But if anyone disagrees, this is open source, go
ahead and fix it. You don't need my permission.
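
For contrast with the loop marked DO NOT DO THIS, the usual alternative is a single join. The sample items are made up for the sketch.

```python
items = ["spam", "eggs", "ham"]

# Build the result in one call instead of one concatenation per item.
result = "".join(items)

print(result)  # spameggsham
```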



--
Steven
From: Steven D'Aprano on
On Mon, 29 Mar 2010 19:53:04 -0700, Steve Howell wrote:

> On Mar 29, 4:19 pm, Steven D'Aprano <st...(a)REMOVE-THIS-
> cybersource.com.au> wrote:
[...]
>> Python is under no compulsion to make "the obvious way" obvious to
>> anyone except Guido. It's a bonus if it happens to be obvious to
>> newbies, not a requirement.
>>
>> And besides, what is "it" you're talking about?
>>
>> * Adding integers, decimals or fractions, or floats with a low
>>   requirement for precision and accuracy? Use sum.
>>
>> * Adding floats with a high requirement for precision and accuracy?
>>   Use math.fsum.
>>
>> * Concatenating strings? Use ''.join.
>>
>> * Joining lists? Use [].extend.
>>
>> * Iterating over an arbitrary sequence of arbitrary sequences?
>>   Use itertools.chain.
>>
>> That's five different "its", and five obvious ways to do them.
>>
>>
> Let's go through them...

"Obvious" doesn't mean you don't have to learn the tools you use. It
doesn't mean that there's no need to think about the requirements of your
problem. It doesn't even mean that the way to do it has to be a built-in
or pre-built solution in the standard library, or that somebody with no
Python experience could intuit the correct function to use based on
nothing more than a good grasp of English.

It certainly doesn't mean that users shouldn't be expected to know how to
import a module:


> >>> fsum([1.234534665989, 2.987, 3])
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> NameError: name 'fsum' is not defined

I called it math.fsum every time I referred to it. Did I need to specify
that you have to import the math module first?


>> * Concatenating strings? Use ''.join.
>
>
> Common pitfall:
>
> >>> ['abc', 'def', 'ghi'].join()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> AttributeError: 'list' object has no attribute 'join'

Is it really common?

I've been hanging around this newsgroup for many years now, and I don't
believe I've ever seen anyone confused by this. I've seen plenty of
newbies use repeated string concatenation, but never anyone trying to do
a string join and getting it wrong. If you have any links to threads
showing such confusion, I'd be grateful to see them.


>> * Joining lists? Use [].extend.
>
> Obvious, yes. Convenient? Not really.
>
> >>> start = []
> >>> for list in [[1, 2], [3, 4]]:
> ...     start.extend(list)
> ...
> >>> start
> [1, 2, 3, 4]


Why isn't that convenient? It is an obvious algorithm written in three
short lines. If you need a one-liner, write a function and call it:

concatenate_lists(sequence_of_lists)
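
One possible body for such a function, as a minimal sketch; only the name concatenate_lists comes from the text above, the implementation is an assumption:

```python
def concatenate_lists(sequence_of_lists):
    """Return one new list holding the items of every list, in order."""
    result = []
    for sub in sequence_of_lists:
        result.extend(sub)
    return result


print(concatenate_lists([[1, 2], [3, 4]]))  # [1, 2, 3, 4]
```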



>> * Iterating over an arbitrary sequence of arbitrary sequences?
>> Use itertools.chain.
>
> >>> group1 = ['al', 'bob']
> >>> group2 = ['cal']
> >>> groups = [group1, group2]
>
> Obvious if you are Dutch...

Or are familiar with the itertools module and the Pythonic practice of
iterating over lazy sequences. Iterators and itertools are fundamental to
the Pythonic way of doing things.



> >>> itertools.chain(groups)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> NameError: name 'itertools' is not defined

That's the second time you've either mistakenly neglected to import a
module, or deliberately not imported it to make the rhetorical point that
you have to import a module before using it. Yes, you *do* have to import
modules before using them. What's your point? Not everything has to be a
built-in.
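
For completeness, a sketch of the quoted example with the import in place. It reuses the group names from the quoted session and shows both ways of handing the groups to chain.

```python
import itertools

group1 = ["al", "bob"]
group2 = ["cal"]
groups = [group1, group2]

# chain takes each iterable as a separate argument...
people = list(itertools.chain(group1, group2))

# ...while chain.from_iterable takes one iterable of iterables.
people_from_groups = list(itertools.chain.from_iterable(groups))

print(people)              # ['al', 'bob', 'cal']
print(people_from_groups)  # ['al', 'bob', 'cal']
```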



[...]
> Sum is builtin, but you have to import fsum from math and chain from
> itertools.
>
> Join is actually a method on strings, not sequences.

Is that supposed to be an argument against them?



[...]
> Just commit all that to memory, and enjoy the productivity of using a
> high level language! ;)

If you don't know your tools, you will spend your life hammering screws
in with the butt of your saw. It will work, for some definition of work.
Giving saws heavier, stronger handles to make it faster to hammer screws
is not what I consider good design.



--
Steven
From: Paul Rubin on
Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes:
>>>> ...
>>> ...
>> ...
> "Obvious" doesn't mean you don't have to learn the tools you use....

Geez you guys, get a room ;-). You're all good programmers with too
much experience to be arguing over stuff this silly.