From: Patrick Maupin on
On Mar 28, 11:56 am, "Alf P. Steinbach" <al...(a)start.no> wrote:
>  From a more practical point of view, the sum efficiency could be improved by
> doing the first addition using '+' and the rest using '+=', without changing the
> behavior.

Mod parent up!!!! :-)
From: Steve Howell on
On Mar 28, 10:21 am, Steve Howell <showel...(a)yahoo.com> wrote:
>     import timeit
>     M = 10
>     N = 8000
>
>     def in_place(
>         start = [],
>         sublists = ([[None] * M]) * N
>         ):
>         # only macro-optimized
>         i = 0
>         for sublist in sublists:
>             if i == 0:
>                accum = start + sublist
>                i += 1
>             else:
>                 accum.extend(sublist)

FYI I later obtained similar results with the more general:
accum += sublist

>         if i == 0:
>             return 'whatever' # semantics here?
>         return accum
>
>     def with_intermediates(
>         start = [],
>         sublists = ([[None] * M]) * N
>         ):
>         accum = start
>         for sublist in sublists:
>             accum = accum + sublist
>         return accum
>
From: Patrick Maupin on
On Mar 28, 12:34 pm, Steve Howell <showel...(a)yahoo.com> wrote:
> FYI I later obtained similar results with the more general:
>                   accum += sublist

Yeah, but you still have to create an object of the correct type for
accum. And for summing small lists, that will actually increase the
runtime. (Worst case, a list of a single item: you have to create the
accum object based on the type of the start value, then do two += into
it, for the start value and the first item in the list, vs just doing
a single + which automatically creates an object.)

Personally, I think the most general approach, if sum were open to
enhancements, would be to automatically apply Alf's suggestion of "+"
on summing the first item to the start value, and "+=" on subsequent
items. Also, you *could* add a parameter to sum(), for example,
default "partialsums=False" that would allow the use of partial sums
in those cases where that might give better behavior (such as floats
and tuples).

From: Steve Howell on
On Mar 28, 11:16 am, Patrick Maupin <pmau...(a)gmail.com> wrote:
> On Mar 28, 12:34 pm, Steve Howell <showel...(a)yahoo.com> wrote:
>
> > FYI I later obtained similar results with the more general:
> >                   accum += sublist
>
> Yeah, but you still have to create an object of the correct type for
> accum.  

I think you overlooked the surrounding code.

Here is the code again:

def in_place(
start = [],
sublists = ([[None] * M]) * N
):
# only macro-optimized
i = 0
for sublist in sublists:
if i == 0:
accum = start + sublist
i += 1
else:
accum += sublist
if i == 0:
return 'whatever' # semantics here?
return accum

No need to infer the type.

> And for summing small lists, that will actually increase the
> runtime.  (Worst case, a list of a single item: you have to create the
> accum object based on the type of the start value, then do two += into
> it, for the start value and the first item in the list, vs just doing
> a single + which automatically creates an object.)
>

For small lists, the performance of any sane implementation would
probably be so fast as to be negligible, unless you are in a really
tight loop. If you are in a tight loop, then your use case probably
eventually sums large lists. Even if it doesn't, the slowdown is just
a constant.

For M=5, I get these results on my machine:

M N t1 t2 (t2/t1)
5 1 3.50475311279e-05 3.2901763916e-05 0.938775510204
5 2 2.00271606445e-05 1.59740447998e-05 0.797619047619
5 4 6.79492950439e-05 6.31809234619e-05 0.929824561404
5 8 0.000124931335449 0.000126123428345 1.00954198473
5 64 0.000530958175659 0.00226187705994 4.25999101931
5 1024 0.00262904167175 0.27246594429 103.636981953

t1 = time with add only first element
t2 = time with add all elements

Very small penalty for small lists, very large gain for large lists.

> Personally, I think the most general approach, if sum were open to
> enhancements, would be to automatically apply Alf's suggestion of "+"
> on summing the first item to the start value, and "+=" on subsequent
> items.  

See above. That's the approach I would use as well.

From: Paul Rubin on
Steve Howell <showell30(a)yahoo.com> writes:
> The documentation is pretty clear on the intention that sum() is
> intended for numbers: ...

I've never been big on the idea of duck-typing addition. Who would have
thought that (1,2,3)+(4.5.6) was something other than the vector sum?

I think itertools.chain more directly expresses what the OP was trying
to do, and is preferable for readability, independently of these
performance questions.