Prev: Street address parsing in Python, again.
Next: bz2 module doesn't work properly with all bz2 files
From: MRAB on 4 Jun 2010 14:50

kj wrote:
> Task: given a list, produce a tally of all the distinct items in
> the list (for some suitable notion of "distinct").
>
> Example: if the list is ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b',
> 'c', 'a'], then the desired tally would look something like this:
>
> [('a', 4), ('b', 3), ('c', 3)]
>
> I find myself needing this simple operation so often that I wonder:
>
> 1. is there a standard name for it?
> 2. is there already a function to do it somewhere in the Python
> standard library?
>
> Granted, as long as the list consists only of items that can be
> used as dictionary keys (and Python's equality test for hashkeys
> agrees with the desired notion of "distinctness" for the tallying),
> then the following does the job passably well:
>
> def tally(c):
>     t = dict()
>     for x in c:
>         t[x] = t.get(x, 0) + 1
>     return sorted(t.items(), key=lambda x: (-x[1], x[0]))
>
> But, of course, if a standard library solution exists it would be
> preferable. Otherwise I either cut-and-paste the above every time
> I need it, or I create a module just for it. (I don't like either
> of these, though I suppose that the latter is much better than the
> former.)
>
> So anyway, I thought I'd ask. :)
>

In Python 3 there's the 'Counter' class in the 'collections' module.
It'll also be in Python 2.7. For earlier versions there's this:

http://code.activestate.com/recipes/576611/
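For readers following along, here is a minimal sketch of how the 'Counter' class mentioned above could replace the hand-rolled dictionary in kj's tally(). The function name and the sort key are carried over from the quoted post; the parameter name `items` is only an illustrative choice, and this is one possible arrangement rather than the canonical one.

```python
from collections import Counter


def tally(items):
    """Return (item, count) pairs, highest count first, ties ordered by item."""
    counts = Counter(items)  # one pass over the input; items must be hashable
    # Same ordering as the original tally(): descending count, then ascending item.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


result = tally(['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a'])
print(result)  # [('a', 4), ('b', 3), ('c', 3)]
```

The explicit sort is kept because Counter.most_common() orders by count only, so it would not break ties between equal counts alphabetically the way the original function does.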
From: Lie Ryan on 4 Jun 2010 15:56

On 06/05/10 04:38, Magdoll wrote:
> On Jun 4, 11:33 am, Peter Otten <__pete...(a)web.de> wrote:
>> kj wrote:
>>
>>> Task: given a list, produce a tally of all the distinct items in
>>> the list (for some suitable notion of "distinct").
>>
>>> Example: if the list is ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b',
>>> 'c', 'a'], then the desired tally would look something like this:
>>
>>> [('a', 4), ('b', 3), ('c', 3)]
>>
>>> I find myself needing this simple operation so often that I wonder:
>>
>>> 1. is there a standard name for it?
>>> 2. is there already a function to do it somewhere in the Python
>>> standard library?
>>
>>> Granted, as long as the list consists only of items that can be
>>> used as dictionary keys (and Python's equality test for hashkeys
>>> agrees with the desired notion of "distinctness" for the tallying),
>>> then the following does the job passably well:
>>
>>> def tally(c):
>>>     t = dict()
>>>     for x in c:
>>>         t[x] = t.get(x, 0) + 1
>>>     return sorted(t.items(), key=lambda x: (-x[1], x[0]))
>>
>>> But, of course, if a standard library solution exists it would be
>>> preferable. Otherwise I either cut-and-paste the above every time
>>> I need it, or I create a module just for it. (I don't like either
>>> of these, though I suppose that the latter is much better than the
>>> former.)
>>
>>> So anyway, I thought I'd ask. :)
>>
>> Python 3.1 has, and 2.7 will have collections.Counter:
>>
>>>>> from collections import Counter
>>>>> c = Counter("abcabcabca")
>>>>> c.most_common()
>>
>> [('a', 4), ('c', 3), ('b', 3)]
>>
>> Peter
>
>
> Thanks Peter, I think you just answered my post :)

If you're using previous versions (2.4 and onwards) then:

[(o, len(list(g))) for o, g in itertools.groupby(sorted(myList))]
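The groupby one-liner above leaves out its import, and it only works because the list is sorted first: groupby merges adjacent equal items only. A self-contained sketch of the same idea follows; the function name `tally_sorted` is hypothetical, and the input is assumed to hold mutually comparable items so that sorted() succeeds.

```python
from itertools import groupby


def tally_sorted(items):
    """Count runs of equal items after sorting; result is ordered by item."""
    # sorted() places equal items next to each other, which groupby relies on.
    # sum(1 for _ in group) counts each run without building a temporary list.
    return [(key, sum(1 for _ in group)) for key, group in groupby(sorted(items))]


result = tally_sorted(['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a'])
print(result)  # [('a', 4), ('b', 3), ('c', 3)]
```

Unlike kj's tally(), this orders the pairs by item rather than by descending count, and it costs a sort (O(n log n)) instead of a single hashing pass, but it needs no hashable keys.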
From: kj on 4 Jun 2010 16:52

Thank you all!

~K
From: Sreenivas Reddy Thatiparthy on 5 Jun 2010 13:55

On Jun 4, 11:14 am, kj <no.em...(a)please.post> wrote:
> Task: given a list, produce a tally of all the distinct items in
> the list (for some suitable notion of "distinct").
>
> Example: if the list is ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b',
> 'c', 'a'], then the desired tally would look something like this:
>
> [('a', 4), ('b', 3), ('c', 3)]
>
> I find myself needing this simple operation so often that I wonder:
>
> 1. is there a standard name for it?
> 2. is there already a function to do it somewhere in the Python
> standard library?
>
> Granted, as long as the list consists only of items that can be
> used as dictionary keys (and Python's equality test for hashkeys
> agrees with the desired notion of "distinctness" for the tallying),
> then the following does the job passably well:
>
> def tally(c):
>     t = dict()
>     for x in c:
>         t[x] = t.get(x, 0) + 1
>     return sorted(t.items(), key=lambda x: (-x[1], x[0]))
>
> But, of course, if a standard library solution exists it would be
> preferable. Otherwise I either cut-and-paste the above every time
> I need it, or I create a module just for it. (I don't like either
> of these, though I suppose that the latter is much better than the
> former.)
>
> So anyway, I thought I'd ask. :)
>
> ~K

How about this one-liner, if you prefer them:

set([(k,yourList.count(k)) for k in yourList])
From: Paul Rubin on 5 Jun 2010 14:00

Sreenivas Reddy Thatiparthy <thatiparthysreenivas(a)gmail.com> writes:
> How about this one liner, if you prefer them;
> set([(k,yourList.count(k)) for k in yourList])

That has a rather bad efficiency problem if the list is large.
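To spell out the efficiency concern: list.count() scans the whole list on every call, and the one-liner calls it once per element, so the work grows roughly with the square of the list length. The sketch below puts that version next to a single-pass alternative that yields the same set of pairs. Both function names are hypothetical, chosen only for this comparison.

```python
def tally_quadratic(items):
    """The one-liner from the thread: one full scan of the list per element."""
    return set([(k, items.count(k)) for k in items])


def tally_linear(items):
    """Single pass with a dict; produces the same set of (item, count) pairs."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return set(counts.items())


sample = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a']
same = tally_quadratic(sample) == tally_linear(sample)
print(same)  # True
```

For a short list the difference is negligible; it starts to matter once the list has many thousands of elements.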