From: Douglas Garstang on
Anyone,

I have the two dictionaries below. How can I merge them, such that:

1. The cluster dictionary contains the additional elements from the
default dictionary.
2. Nothing is removed from the cluster dictionary.

The idea here is that the two dictionaries are read from different
files where, if the value isn't found in the cluster dictionary, it's
pulled from the default one, and I can have a new dictionary
reflecting this. The update() method on dictionaries doesn't seem to
work. The resulting dictionary always seems to be the one passed as a
parameter.

default = {
    'cluster': {
        'platform': {
            'elements': {
                'data_sources': {
                    'elements': {
                        'db_min_pool_size': 10
                    },
                },
            },
        },
    }
}

cluster = {
    'cluster': {
        'name': 'Customer 1',
        'description': 'Production',
        'environment': 'production',
        'platform': {
            'elements': {
                'data_source': {
                    'elements': {
                        'username': 'username',
                        'password': 'password'
                    },
                },
            },
        },
    }
}

The resulting dictionary would therefore look like this:

new_dict = {
    'cluster': {
        'name': 'Customer 1',
        'description': 'Production',
        'environment': 'production',
        'platform': {
            'elements': {
                'data_source': {
                    'elements': {
                        'username': 'username',
                        'password': 'password',
                        'db_min_pool_size': 10  # This was added from the default.
                    },
                },
            },
        },
    }
}


Thanks,
Doug.

--
Regards,

Douglas Garstang
http://www.linkedin.com/in/garstang
Email: doug.garstang(a)gmail.com
Cell: +1-805-340-5627
From: Gary Herron on
On 08/01/2010 11:11 PM, Douglas Garstang wrote:
> On Sun, Aug 1, 2010 at 10:58 PM, Gary Herron<gherron(a)islandtraining.com> wrote:
>
>> On 08/01/2010 10:09 PM, Douglas Garstang wrote:
>>
>>> [original message snipped]
>>
>> Your dictionaries are annoyingly complicated -- making it hard to see what's
>> going on. Here I've replaced all the distractions of your dictionary
>> nesting with a simple (string) value. Now when you try to update:
>>
>> >>> default = {'cluster': 'some_value'}
>> >>> cluster = {'cluster': 'another_value'}
>> >>> cluster.update(default)
>> >>> print cluster
>> {'cluster': 'some_value'}
>>
>> If you read up on what update is supposed to do, this is correct -- keys in
>> default are inserted into cluster -- replacing values if they already exist.
>>
>> I believe update is not what you want for two reasons:
>>
>> 1. It's doubtful that you want a default to replace an existing value, and
>> that's what update does.
>>
>> 2. I get the distinct impression that you are expecting the update to be
>> applied recursively down through the hierarchy. Such is not the case.
>>
>>
>>
>>
>> And I just have to ask: Of what use whatsoever is a dictionary (hierarchy)
>> that contains *one* single value which needs a sequence of 6 keys to access?
>>
>> >>> print default['cluster']['platform']['elements']['data_sources']['elements']['db_min_pool_size']
>> 10
>>
>> Seems absurd unless there is lots more going on here.
>>
> Thanks. Any particular reason you replied off-list?
>

Huh? Oh hell. My mistake. (This is now back on the list -- where it
should have been to start with.)


> Anyway, I'm trying to model a cluster of servers in a yaml file that
> gets edited by humans and a tree structure makes it easier to
> understand the context of each individual key. If it was arranged in a
> flat fashion, each key would have to be longer in order to make it
> unique and provide some context as to what the user was actually
> editing.
>
> I actually didn't paste the whole dictionary. I cut it down to make it
> easier to explain. When you see the full version, the multiple levels
> make more sense. Tried various approaches so far, and none work. I
> can't traverse the tree recursively because each time you recurse, you
> lose the absolute position of the key you're currently at, and then
> there's no way to update the values.
>
> Doug.
>

Ok. Thanks for simplifying things before sending the question out to
the list. You probably wouldn't have gotten a response otherwise.

I'm not sure I believe the reasoning for the inability to recurse. It
seems rather simple to recurse through the structures in tandem, adding
any key:value found in the default to the other if not already present.
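
On the worry about losing the absolute position while recursing, here is a minimal sketch (written for Python 3, so items() stands in for iteritems()). The walk() helper and the db_max_pool_size key are hypothetical names made up for this illustration, and the dictionary is a shortened version of the one in the question. It shows two things: the path can simply be passed down as an argument, and a nested dictionary fetched from its parent is the same object, so assigning through it changes the outer structure.

```python
def walk(tree, path=()):
    """Yield (path, value) for every non-dict leaf in a nested dict."""
    for key, value in tree.items():
        if isinstance(value, dict):
            # Extend the path on the way down, so the absolute position
            # is always known inside the recursion.
            yield from walk(value, path + (key,))
        else:
            yield path + (key,), value


# Shortened version of the default dictionary from the question.
default = {'cluster': {'platform': {'elements': {'db_min_pool_size': 10}}}}
leaves = list(walk(default))

# A nested dict fetched from its parent is the same object, not a copy,
# so assigning through it updates the outer structure too.
inner = default['cluster']['platform']['elements']
inner['db_max_pool_size'] = 20  # hypothetical key, for illustration only
```

Because of the second point, a recursive merge usually does not need the full path at all: handing the two nested dictionaries to the recursive call is enough.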

Gary Herron




From: Douglas Garstang on
On Sun, Aug 1, 2010 at 11:57 PM, Gary Herron <gherron(a)islandtraining.com> wrote:
> I'm not sure I believe the reasoning for the inability to recurse.  It seems
> rather simple to recurse through the structures in tandem, adding any
> key:value found in the default to the other if not already present.

Actually, I had issues with trying to recurse through the structures in
tandem too. This didn't work:

for a,b,c,d in ( cluster.iteritems(), default.iteritems() ):
    ... do something ...

It raises an unpack error.

Doug.
From: Chris Rebert on
On Mon, Aug 2, 2010 at 12:06 AM, Douglas Garstang
<doug.garstang(a)gmail.com> wrote:
> Actually, I had issues with trying to recurse through the structures in
> tandem too. This didn't work:
>
> for a,b,c,d in ( cluster.iteritems(), default.iteritems() ):
>    ... do something ...
>
> It returns an unpack error.

Well, yeah. That for-loop has several problems:
- You're iterating over the items of a 2-tuple. It's just like:
      for a,b,c,d in [1, 2]:
  It's not treated any differently just because the items happen to be
  iterators themselves. The iterators aren't automagically iterated
  through in parallel just by putting them in a tuple. That would
  require a zip().
- iteritems() returns a sequence of 2-tuples. Even when zipped, these
  tuples don't get magically unpacked and repacked into 4-tuples:
      for a, b, c, d in zip([(1,2), (3,4)], [(5,6), (7,8)]):
          # still fails; can't unpack 2 separate tuples (i.e. (1,2) (5,6))
          # directly into 4 variables; the nesting is wrong
- iteritems() returns the keys in an arbitrary order; the two
  iteritems() calls won't be in any way "synchronized" so the keys match
  up.
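
The three points above can be sketched as follows. This is an illustration only, written for Python 3 (items() instead of iteritems()), and the two flat dictionaries and their values are made up for the example rather than taken from the thread. It shows the nested unpacking that zip() needs, and then the more useful approach of driving the loop from the keys of one dictionary and looking them up in the other.

```python
# Hypothetical flat dictionaries, for illustration only.
cluster = {'name': 'Customer 1', 'platform': 'linux'}
default = {'platform': 'generic', 'db_min_pool_size': 10}

# zip() pairs the two item streams positionally; each pair is two
# 2-tuples, so the unpacking target needs matching nesting.
pairs = [(k1, v1, k2, v2)
         for (k1, v1), (k2, v2) in zip(cluster.items(), default.items())]

# Positional pairing does not line up equal keys. Driving the loop from
# the keys of one dictionary and looking them up in the other does.
shared = {key: (cluster[key], default[key])
          for key in default if key in cluster}
missing = [key for key in default if key not in cluster]
```

Here pairs matches 'name' with 'platform', which is rarely what is wanted, while shared and missing are computed by key and do not depend on iteration order.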

Cheers,
Chris
--
http://blog.rebertia.com
From: Paul Rubin on
Douglas Garstang <doug.garstang(a)gmail.com> writes:
> default = {...
> 'data_sources': { ...
> cluster = {...
> 'data_source': { ...

Did you want both of those to say the same thing instead of one
of them being 'data_source' and the other 'data_sources' ?

If yes, then the following works for me:

def merge(cluster, default):
    # destructively merge default into cluster
    for k,v in cluster.iteritems():
        if k in default and type(v)==dict:
            assert type(default[k])==dict
            merge(v,default[k])
    for k,v in default.iteritems():
        if k not in cluster:
            cluster[k] = v
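
For anyone who wants a new dictionary rather than a changed cluster, here is a non-destructive variant as a rough sketch. It is written for Python 3, merged_copy() is a hypothetical name, and it assumes (as asked above) that both dictionaries were meant to use the same 'data_source' key. It is not the function posted above, just the same idea applied to a deep copy.

```python
import copy


def merged_copy(cluster, default):
    """Return a new dict: cluster's values, plus anything only in default."""
    result = copy.deepcopy(cluster)
    for key, default_value in default.items():
        if key not in result:
            # Copy the default so the result shares nothing with it.
            result[key] = copy.deepcopy(default_value)
        elif isinstance(result[key], dict) and isinstance(default_value, dict):
            result[key] = merged_copy(result[key], default_value)
    return result


# Dictionaries from the question, with 'data_source' used in both
# (an assumption; the original post has 'data_sources' in default).
default = {'cluster': {'platform': {'elements': {'data_source': {
    'elements': {'db_min_pool_size': 10}}}}}}
cluster = {'cluster': {
    'name': 'Customer 1',
    'description': 'Production',
    'environment': 'production',
    'platform': {'elements': {'data_source': {
        'elements': {'username': 'username', 'password': 'password'}}}}}}

new_dict = merged_copy(cluster, default)
```

Existing values in cluster always win, and neither input is modified. The repeated deepcopy calls are wasteful on large trees, which should not matter for configuration-sized data.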