From: dimitri pater - serpia on
Hi,

I have two related lists:
x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c' ]

what I need is a list representing the mean value of 'a', 'b' and 'c'
while maintaining the number of items (len):
w = [1.5, 1.5, 8, 4, 4, 4]

I have looked at iter(tools) and next(), but that did not help me. I'm
a bit stuck here, so your help is appreciated!

thanks!
Dimitri
From: MRAB on
dimitri pater - serpia wrote:
> Hi,
>
> I have two related lists:
> x = [1 ,2, 8, 5, 0, 7]
> y = ['a', 'a', 'b', 'c', 'c', 'c' ]
>
> what I need is a list representing the mean value of 'a', 'b' and 'c'
> while maintaining the number of items (len):
> w = [1.5, 1.5, 8, 4, 4, 4]
>
> I have looked at iter(tools) and next(), but that did not help me. I'm
> a bit stuck here, so your help is appreciated!
>
Try doing it in 2 passes.

First pass: count the number of times each string occurs in 'y' and the
total for each (zip/izip and defaultdict are useful for these).

Second pass: create the result list containing the mean values.
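
A minimal sketch of the two-pass idea, for illustration only. The function name group_means and the totals/counts dictionaries are made up here, not taken from the thread, and the code is written so that it also runs on Python 3. It assumes x and y have the same length.

```python
from collections import defaultdict

def group_means(values, keys):
    """Return a list the same length as `keys`, holding each key's mean."""
    totals = defaultdict(float)  # key -> running sum of its values
    counts = defaultdict(int)    # key -> number of occurrences

    # First pass: accumulate the total and the count for each key.
    for key, value in zip(keys, values):
        totals[key] += value
        counts[key] += 1

    # Second pass: emit the mean of each item's key, in the original order.
    return [totals[key] / counts[key] for key in keys]

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']
w = group_means(x, y)
print(w)  # [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]
```

Because the second pass walks the keys themselves, equal keys do not have to be adjacent for the result to line up with the input.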
From: Chris Rebert on
On Mon, Mar 8, 2010 at 2:34 PM, dimitri pater - serpia
<dimitri.pater(a)gmail.com> wrote:
> Hi,
>
> I have two related lists:
> x = [1 ,2, 8, 5, 0, 7]
> y = ['a', 'a', 'b', 'c', 'c', 'c' ]
>
> what I need is a list representing the mean value of 'a', 'b' and 'c'
> while maintaining the number of items (len):
> w = [1.5, 1.5, 8, 4, 4, 4]
>
> I have looked at iter(tools) and next(), but that did not help me. I'm
> a bit stuck here, so your help is appreciated!

from __future__ import division

def group(keys, values):
    #requires None not in keys
    groups = []
    cur_key = None
    cur_vals = None
    for key, val in zip(keys, values):
        if key != cur_key:
            if cur_key is not None:
                groups.append((cur_key, cur_vals))
            cur_vals = [val]
            cur_key = key
        else:
            cur_vals.append(val)
    groups.append((cur_key, cur_vals))
    return groups

def average(lst):
    return sum(lst) / len(lst)

def process(x, y):
    result = []
    for key, vals in group(y, x):
        avg = average(vals)
        for i in xrange(len(vals)):
            result.append(avg)
    return result

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c' ]

print process(x, y)
#=> [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]

It could be tweaked to use itertools.groupby(), but it would probably
be less efficient/clear.
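
For comparison, here is one way the groupby() variant might look. This is only a sketch: the name process_groupby is invented for illustration, and the code is written so that it also runs on Python 3. Like group() above, it treats each run of adjacent equal keys as one group.

```python
from itertools import groupby
from operator import itemgetter

def process_groupby(x, y):
    """Mean of each run of equal keys in `y`, repeated once per item in the run."""
    result = []
    # groupby() only merges *adjacent* equal keys, so y must already be grouped.
    for _key, pairs in groupby(zip(y, x), key=itemgetter(0)):
        vals = [val for _, val in pairs]
        avg = sum(vals) / float(len(vals))
        result.extend([avg] * len(vals))
    return result

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']
print(process_groupby(x, y))  # [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]
```

Unlike group(), this version has no restriction on None appearing among the keys, since it needs no sentinel value.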

Cheers,
Chris
--
http://blog.rebertia.com
From: dimitri pater - serpia on
thanks Chris and MRAB!
Looks good, I'll try it out

On Tue, Mar 9, 2010 at 12:22 AM, Chris Rebert <clp2(a)rebertia.com> wrote:
> [...]

--
---
You can't have everything. Where would you put it? -- Steven Wright
---
please visit www.serpia.org
From: John Posner on
On 3/8/2010 5:34 PM, dimitri pater - serpia wrote:
> Hi,
>
> I have two related lists:
> x = [1 ,2, 8, 5, 0, 7]
> y = ['a', 'a', 'b', 'c', 'c', 'c' ]
>
> what I need is a list representing the mean value of 'a', 'b' and 'c'
> while maintaining the number of items (len):
> w = [1.5, 1.5, 8, 4, 4, 4]
>
> I have looked at iter(tools) and next(), but that did not help me. I'm
> a bit stuck here, so your help is appreciated!

Nobody expects object-orientation (or the Spanish Inquisition):

#-------------------------
from collections import defaultdict

class Tally:
    def __init__(self, id=None):
        self.id = id
        self.total = 0
        self.count = 0

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']

# gather data
tally_dict = defaultdict(Tally)
for i in range(len(x)):
    obj = tally_dict[y[i]]
    obj.id = y[i]
    obj.total += x[i]
    obj.count += 1

# process data
result_list = []
for key in sorted(tally_dict):
    obj = tally_dict[key]
    mean = 1.0 * obj.total / obj.count
    result_list.extend([mean] * obj.count)
print result_list
#-------------------------
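
One thing to watch with the code above: the "process data" loop walks sorted(tally_dict), so the output lines up with the input only when the keys in y are already grouped and in sorted order, as they are in the example. Below is a possible variation, not the posted code, that builds the result by walking y instead. The mean() method and the function name means_in_input_order are hypothetical additions, and the code is written so that it also runs on Python 3.

```python
from collections import defaultdict

class Tally(object):
    """Running total and count for one key."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def mean(self):
        # hypothetical helper, not part of the class posted above
        return self.total / self.count

def means_in_input_order(x, y):
    tallies = defaultdict(Tally)
    # gather data
    for value, key in zip(x, y):
        tally = tallies[key]
        tally.total += value
        tally.count += 1
    # process data: walk y itself, so the output lines up with the input
    return [tallies[key].mean() for key in y]

print(means_in_input_order([5, 0, 7, 1, 2], ['c', 'c', 'c', 'a', 'a']))
# [4.0, 4.0, 4.0, 1.5, 1.5]
```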

-John