related lists mean value [Python]


From: dimitri pater - serpia on 8 Mar 2010 17:34

Hi,

I have two related lists:
x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c' ]

What I need is a list in which each item of x is replaced by the mean of
all values sharing its label in y ('a', 'b' or 'c'), so that the number
of items (len) stays the same:
w = [1.5, 1.5, 8, 4, 4, 4]

I have looked at iter(tools) and next(), but that did not help me. I'm
a bit stuck here, so your help is appreciated!

thanks!
Dimitri

From: MRAB on 8 Mar 2010 18:15

Try doing it in 2 passes.

First pass: for each string in 'y', count how many times it occurs and
sum the corresponding values from 'x' (zip/izip and defaultdict are
useful for these).

Second pass: create the result list containing the mean values.
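
The two passes described above could be sketched roughly as follows. This is only an illustration, not MRAB's own code: the function name group_means and the choice of two separate defaultdicts are assumptions made for the example.

```python
from collections import defaultdict

def group_means(values, keys):
    """Return a list as long as `values`; each entry is the mean of all
    values that share that position's key. (Hypothetical helper name.)"""
    totals = defaultdict(float)  # key -> running sum (float avoids integer division)
    counts = defaultdict(int)    # key -> number of occurrences

    # First pass: accumulate the total and the count for each key.
    for key, value in zip(keys, values):
        totals[key] += value
        counts[key] += 1

    # Second pass: emit the mean belonging to each position's key.
    return [totals[key] / counts[key] for key in keys]

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']
w = group_means(x, y)
```

Because the second pass walks 'y' itself, the result lines up with the input even if equal labels are not adjacent.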

From: Chris Rebert on 8 Mar 2010 18:22


from __future__ import division

def group(keys, values):
    #requires None not in keys
    groups = []
    cur_key = None
    cur_vals = None
    for key, val in zip(keys, values):
        if key != cur_key:
            if cur_key is not None:
                groups.append((cur_key, cur_vals))
            cur_vals = [val]
            cur_key = key
        else:
            cur_vals.append(val)
    groups.append((cur_key, cur_vals))
    return groups

def average(lst):
    return sum(lst) / len(lst)

def process(x, y):
    result = []
    for key, vals in group(y, x):
        avg = average(vals)
        for i in xrange(len(vals)):
            result.append(avg)
    return result

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c' ]

print process(x, y)
#=> [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]

It could be tweaked to use itertools.groupby(), but it would probably
be less efficient/clear.

Cheers,
Chris
--
http://blog.rebertia.com
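
For comparison, the itertools.groupby() variant mentioned above might look something like this. It is a sketch only, not code from the thread; the name process_groupby is made up for the example, and whether it is clearer than the hand-written loop is a matter of taste.

```python
from itertools import groupby
from operator import itemgetter

def process_groupby(x, y):
    """Hypothetical groupby-based counterpart of process(x, y)."""
    result = []
    # groupby() only merges *consecutive* equal keys, which matches the
    # behaviour of the hand-written group() function.
    for key, pairs in groupby(zip(y, x), key=itemgetter(0)):
        vals = [val for _, val in pairs]
        avg = sum(vals) / float(len(vals))
        result.extend([avg] * len(vals))
    return result

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']
w = process_groupby(x, y)
```

Like group(), this treats each run of equal labels as its own group, so a label that reappears later in 'y' is averaged separately for each run.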

From: dimitri pater - serpia on 8 Mar 2010 18:47

thanks Chris and MRAB!
Looks good, I'll try it out


--
---
You can't have everything. Where would you put it? -- Steven Wright
---
please visit www.serpia.org

From: John Posner on 8 Mar 2010 21:39


Nobody expects object-orientation (or the Spanish Inquisition):

#-------------------------
from collections import defaultdict

class Tally:
    def __init__(self, id=None):
        self.id = id
        self.total = 0
        self.count = 0

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']

# gather data
tally_dict = defaultdict(Tally)
for i in range(len(x)):
    obj = tally_dict[y[i]]
    obj.id = y[i]
    obj.total += x[i]
    obj.count += 1

# process data
result_list = []
for key in sorted(tally_dict):
    obj = tally_dict[key]
    mean = 1.0 * obj.total / obj.count
    result_list.extend([mean] * obj.count)
print result_list
#-------------------------

-John
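
One thing to keep in mind with the Tally approach: iterating over sorted(tally_dict) emits the means in key order, which matches the input only when 'y' is already grouped in sorted order, as it is in the example. A possible adjustment, assuming the result should stay aligned with the original positions, is to build the output by walking 'y' again. The add() and mean() methods and the function name below are hypothetical additions, not part of John's code.

```python
from collections import defaultdict

class Tally(object):
    """Running total and count for one label."""
    def __init__(self):
        self.total = 0
        self.count = 0

    def add(self, value):  # hypothetical helper method
        self.total += value
        self.count += 1

    def mean(self):        # hypothetical helper method
        return 1.0 * self.total / self.count

def means_in_input_order(x, y):
    # gather data: one Tally per distinct label
    tally_dict = defaultdict(Tally)
    for key, value in zip(y, x):
        tally_dict[key].add(value)
    # process data: one mean per input position, in the order of 'y'
    return [tally_dict[key].mean() for key in y]

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']
result_list = means_in_input_order(x, y)
```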
