From: D'Arcy J.M. Cain on
On Fri, 23 Apr 2010 00:07:18 +1000
Xavier Ho <contact(a)xavierho.com> wrote:
> > print (sorted (l, reverse=True)[:k])
>
> You don't really need to reverse sort there:

True but...

> >>> numbers = [1, 4, 5, 3, 7, 8]
> >>> sorted(numbers)[3:]
> [5, 7, 8]

Now try returning the top two or four numbers.

--
D'Arcy J.M. Cain <darcy(a)druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
From: John Nagle on
Chris Rebert wrote:
> 2010/4/22 Jo Chan <csjcg2(a)gmail.com>:
>> Hi,friends.
>> I wanna ask if there is a function which is able to take a list as argument
>> and then return its top-k maximums?
>> I only know about max which is poorly a top-1 maximum function, now I want
>> more yet I am lazy enough that don't want to write one by myself.
>
> http://docs.python.org/library/heapq.html#heapq.nlargest
>
> Cheers,
> Chris
> --
> http://blog.rebertia.com

Is "nlargest" smart enough to decide when it's cheaper to track the
N largest entries on a linear pass through the list than to sort?

This is a typical optimization in SQL databases, by the way. When
you write

SELECT * FROM tab ORDER BY salary LIMIT 2;

you'll probably get a different algorithm than if you write

SELECT * FROM tab ORDER BY salary LIMIT 200;

even if there's no key on "salary".

John Nagle
From: Steven D'Aprano on
On Thu, 22 Apr 2010 10:49:29 -0700, John Nagle wrote:

> Is "nlargest" smart enough to decide when it's cheaper to track the
> N largest entries on a linear pass through the list than to sort?

Doesn't appear to do so. From Python 3.1:

def nlargest(n, iterable):
"""Find the n largest elements in a dataset.

Equivalent to: sorted(iterable, reverse=True)[:n]
"""
it = iter(iterable)
result = list(islice(it, n))
if not result:
return result
heapify(result)
_heappushpop = heappushpop
for elem in it:
_heappushpop(result, elem)
result.sort(reverse=True)
return result


although if there is a C implementation, Python will use that instead of
the above, and that may be different.

Interestingly, nsmallest does use two different algorithms, depending on
how many items you ask for. See the source code.



--
Steven