From: Peter Otten on
wheres pythonmonks wrote:

> Instead of defaultdict for hash of lists, I have seen something like:
>
>
> m={}; m.setdefault('key', []).append(1)
>
> Would this be preferred in some circumstances?

In some circumstances, sure. I just can't think of them at the moment.
Maybe if your code has to work in Python 2.4.
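
For comparison, here is a minimal sketch of the two idioms building the same "hash of lists". The sample data and variable names are made up for illustration.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]  # made-up data

# Plain dict with setdefault: needs nothing beyond the built-in dict.
groups_plain = {}
for kind, name in pairs:
    groups_plain.setdefault(kind, []).append(name)

# defaultdict: the list factory runs on the first lookup of a missing key.
groups_dd = defaultdict(list)
for kind, name in pairs:
    groups_dd[kind].append(name)

# Equality compares contents only, so the two results match.
print(groups_plain == groups_dd)
```

Both loops end up with the same mapping; the difference is only where the "create an empty list" step lives.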

> Also, is there a way to upcast a defaultdict into a dict?

dict(some_defaultdict)

> I have also
> heard some people use exceptions on dictionaries to catch key
> existence, so passing in a defaultdict (I guess) could be hazardous to
> health. Is this true?

A problem could arise when you swap a "key in dict" test with a
"try...except KeyError". This would be an implementation detail for a dict
but affect the contents of a defaultdict:

>>> from collections import defaultdict
>>> def update(d):
...     for c in "abc":
...         try: d[c]
...         except KeyError: d[c] = c
...
>>> d = defaultdict(lambda:"-")
>>> update(d)
>>> d
defaultdict(<function <lambda> at 0x7fd4ce32a320>, {'a': '-', 'c': '-', 'b':
'-'})
>>> def update2(d):
...     for c in "abc":
...         if c not in d:
...             d[c] = c
...
>>> d = defaultdict(lambda:"-")
>>> update2(d)
>>> d
defaultdict(<function <lambda> at 0x7fd4ce32a6e0>, {'a': 'a', 'c': 'c', 'b':
'b'})
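
The difference comes down to which operations call the default factory. A small sketch, as far as I know accurate for CPython's defaultdict: membership tests and .get() leave the dict alone, while a subscript lookup inserts the default.

```python
from collections import defaultdict

d = defaultdict(lambda: "-")

# Membership tests and .get() do not call the default factory.
present = "a" in d
fetched = d.get("a")
size_before = len(d)

# A subscript lookup does: it stores the default under the missing key.
value = d["a"]
size_after = len(d)

print(present, fetched, size_before, value, size_after)
```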

Peter

From: wheres pythonmonks on
Sorry, doesn't the following make a copy?

>>>> from collections import defaultdict as dd
>>>> x = dd(int)
>>>> x[1] = 'a'
>>>> x
> defaultdict(<type 'int'>, {1: 'a'})
>>>> dict(x)
> {1: 'a'}
>
>


I was hoping not to do that -- e.g., actually reuse the same
underlying data. Maybe dict(x), where x is a defaultdict is smart? I
agree that a defaultdict is safe to pass to most routines, but I guess
I could imagine that a try/except block is used in a bit of code where
on the key exception (when the value is absent) populates the value
with a random number. In that application, a defaultdict would have
no random values.


Besides a slightly different flavor, does the following have
applications not covered by defaultdict?

m.setdefault('key', []).append(1)

I think I am unclear on the difference between that and:

m['key'] = m.get('key',[]).append(1)

Except that the latter works for immutable values as well as containers.

On Fri, Jul 30, 2010 at 8:19 AM, Steven D'Aprano
<steve(a)remove-this-cybersource.com.au> wrote:
> On Fri, 30 Jul 2010 07:59:52 -0400, wheres pythonmonks wrote:
>
>> Instead of defaultdict for hash of lists, I have seen something like:
>>
>>
>> m={}; m.setdefault('key', []).append(1)
>>
>> Would this be preferred in some circumstances?
>
> Sure, why not? Whichever you prefer.
>
> setdefault() is a venerable old technique, dating back to Python 2.0, and
> not a newcomer like defaultdict.
>
>
>> Also, is there a way to upcast a defaultdict into a dict?
>
> "Upcast"? Surely it is downcasting. Or side-casting. Or type-casting.
> Whatever. *wink*
>
> Whatever it is, the answer is Yes:
>
>>>> from collections import defaultdict as dd
>>>> x = dd(int)
>>>> x[1] = 'a'
>>>> x
> defaultdict(<type 'int'>, {1: 'a'})
>>>> dict(x)
> {1: 'a'}
>
>
>
>> I have also heard some people use
>> exceptions on dictionaries to catch key existence, so passing in a
>> defaultdict (I guess) could be hazardous to health.  Is this true?
>
> Yes, it is true that some people use exceptions on dicts to catch key
> existence. The most common reason to do so is to catch the non-existence
> of a key so you can add it:
>
> try:
>    mydict[x] = mydict[x] + 1
> except KeyError:
>    mydict[x] = 1
>
>
> If mydict is a defaultdict with the appropriate factory, then the change
> is perfectly safe because mydict[x] will not raise an exception when x is
> missing, but merely return 0, so it will continue to work as expected and
> all is good.
>
> Of course, if you pass it a defaultdict with an *inappropriate* factory,
> you'll get an error. So don't do that :) Seriously, you can't expect to
> just randomly replace a variable with some arbitrarily different variable
> and expect it to work. You need to know what the code is expecting, and
> not break those expectations too badly.
>
> And now you have at least three ways of setting missing values in a dict.
> And those wacky Perl people say that Python's motto is "only one way to
> do it" :)
>
>
>
> --
> Steven
> --
> http://mail.python.org/mailman/listinfo/python-list
>
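
The three ways of setting missing values mentioned in the quoted reply (try/except, setdefault, defaultdict) can be sketched side by side as a simple counter. The function name and sample data here are made up for illustration; the sketch also passes a defaultdict(int) to the try/except version to show the "appropriate factory" case.

```python
from collections import defaultdict

words = ["a", "b", "a", "c", "a"]  # made-up sample data

def count_with_try(mydict, items):
    # The try/except idiom from the quoted reply.
    for x in items:
        try:
            mydict[x] = mydict[x] + 1
        except KeyError:
            mydict[x] = 1
    return mydict

counts_try = count_with_try({}, words)

# Same function, handed a defaultdict(int): KeyError never fires, but a
# missing value reads as 0, so the totals come out the same.
counts_try_dd = count_with_try(defaultdict(int), words)

# setdefault on a plain dict
counts_setdefault = {}
for x in words:
    counts_setdefault[x] = counts_setdefault.setdefault(x, 0) + 1

# defaultdict(int) used directly
counts_dd = defaultdict(int)
for x in words:
    counts_dd[x] += 1

print(counts_try == counts_try_dd == counts_setdefault == counts_dd)
```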
From: Steven D'Aprano on
On Fri, 30 Jul 2010 08:34:52 -0400, wheres pythonmonks wrote:

> Sorry, doesn't the following make a copy?
>
>>>>> from collections import defaultdict as dd
>>>>> x = dd(int)
>>>>> x[1] = 'a'
>>>>> x
>> defaultdict(<type 'int'>, {1: 'a'})
>>>>> dict(x)
>> {1: 'a'}
>>
>>
>>
>
> I was hoping not to do that -- e.g., actually reuse the same underlying
> data.


It does re-use the same underlying data.

>>> from collections import defaultdict as dd
>>> x = dd(list)
>>> x[1].append(1)
>>> x
defaultdict(<type 'list'>, {1: [1]})
>>> y = dict(x)
>>> x[1].append(42)
>>> y
{1: [1, 42]}

Both the defaultdict and the dict are referring to the same underlying
key:value pairs. The data itself isn't duplicated. If they are mutable
items, a change to one will affect the other (because they are the same
item). An analogy for C programmers would be that creating dict y from
defaultdict x merely copies the pointers to the keys and values; it doesn't
copy the data being pointed to.

(That's pretty much what the CPython implementation does. Other
implementations may do differently, so long as the visible behaviour
remains the same.)
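
To make the distinction concrete, here is a short sketch extending the session above. It shows that dict(x) builds a new, plain dict object (so later key insertions are not shared) while the value objects themselves are shared.

```python
from collections import defaultdict

x = defaultdict(list)
x[1].append(1)

y = dict(x)          # new, plain dict object; values are shared, not copied

x[1].append(42)      # mutating a shared value is visible through both
x[2].append("new")   # adding a key to x does not add it to y

print(y is x)        # False
print(y[1] is x[1])  # True
print(y)             # {1: [1, 42]}
print(2 in y)        # False
```

So the conversion is a shallow copy: the container is new, the contents are the same objects.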



> Maybe dict(x), where x is a defaultdict is smart? I agree that a
> defaultdict is safe to pass to most routines, but I guess I could
> imagine that a try/except block is used in a bit of code where on the
> key exception (when the value is absent) populates the value with a
> random number. In that application, a defaultdict would have no random
> values.

If you want a defaultdict with a random default value, it is easy to
provide:

>>> import random
>>> z = dd(random.random)
>>> z[2] += 0
>>> z
defaultdict(<built-in method random of Random object at 0xa01e4ac>, {2:
0.30707092626033605})


The point which I tried to make, but obviously failed, is that any piece
of code has certain expectations about the data it accepts. If you take a
function that expects an int between -2 and 99, and instead decide to
pass a Decimal between 100 and 150, then you'll have problems: if you're
lucky, you'll get an exception, if you're unlucky, it will silently give
the wrong results. Changing a dict to a defaultdict is no different.

If you have code that *relies* on getting a KeyError for missing keys:

def who_is_missing(adict):
    for person in ("Fred", "Barney", "Wilma", "Betty"):
        try:
            adict[person]
        except KeyError:
            print person, "is missing"

then changing adict to a defaultdict will cause the function to
misbehave. That's not unique to dicts and defaultdicts.
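
A sketch of that misbehaviour, using a variant of the function above that returns the missing names instead of printing them (the function names and the NAMES constant are made up for illustration). A membership test is shown alongside because it does not depend on KeyError.

```python
from collections import defaultdict

NAMES = ("Fred", "Barney", "Wilma", "Betty")

def missing_by_keyerror(adict):
    """Relies on KeyError, so a defaultdict defeats it (and gets filled in)."""
    missing = []
    for person in NAMES:
        try:
            adict[person]
        except KeyError:
            missing.append(person)
    return missing

def missing_by_membership(adict):
    """Uses 'in', which never triggers the default factory."""
    return [person for person in NAMES if person not in adict]

plain = {"Fred": 1}
print(missing_by_keyerror(plain))    # ['Barney', 'Wilma', 'Betty']

dd = defaultdict(int, {"Fred": 1})
print(missing_by_membership(dd))     # ['Barney', 'Wilma', 'Betty']
print(missing_by_keyerror(dd))       # []
print(len(dd))                       # 4: the lookups inserted defaults
```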



> Besides a slightly different flavor, does the following have applications
> not covered by defaultdict?
>
> m.setdefault('key', []).append(1)

defaultdict calls a function of no arguments to provide a default value.
That means, in practice, it almost always uses the same default value for
any specific dict.

setdefault takes an argument when you call the function. So you can
provide anything you like at runtime.
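
A small sketch of that difference; the key names are made up for illustration. With setdefault, each call can supply a different kind of default to the same dict.

```python
from collections import defaultdict

# defaultdict: one zero-argument factory, fixed when the dict is created.
dd = defaultdict(list)
dd["a"].append(1)

# setdefault: the default is an ordinary argument, chosen at each call.
m = {}
m.setdefault("items", []).append(1)    # list default for this key
m.setdefault("tags", set()).add("x")   # set default for that key
m.setdefault("count", 0)               # immutable default

print(m)
```

One cost worth knowing: the default passed to setdefault is evaluated on every call, even when the key already exists.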


> I think I am unclear on the difference between that and:
>
> m['key'] = m.get('key',[]).append(1)

Have you tried it? I guess you haven't, or you wouldn't have thought they
did the same thing.

Hint -- what does [].append(1) return?
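
For anyone following along without an interpreter handy, this sketch runs the line in question twice and records what happens; the variable names other than m are made up.

```python
m = {}

result = [].append(1)
print(result)    # None: list.append mutates in place and returns None

m['key'] = m.get('key', []).append(1)
print(m)         # {'key': None}

# A second attempt fails, because m.get('key', []) now returns the stored None.
failed = False
try:
    m['key'] = m.get('key', []).append(1)
except AttributeError:
    failed = True
print(failed)    # True
```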


--
Steven
From: wheres pythonmonks on
>
> Hint -- what does [].append(1) return?
>

Again, apologies from a Python beginner. It sure seems like one has
to do gymnastics to get good behavior out of the core-python:

Here's my proposed fix:

m['key'] = (lambda x: x.append(1) or x)(m.get('key',[]))

Yuck! So I guess I'll use defaultdict with upcasts to dict as needed.

On a side note: does up-casting always work that way with shared
(common) data from derived to base? (I mean if the data is part of
base's interface, will b = base(child) yield a new base object that
shares data with the child?)

Thanks again from a Perl-to-Python convert!

W


From: Steven D'Aprano on
On Sat, 31 Jul 2010 01:02:47 -0400, wheres pythonmonks wrote:


>> Hint -- what does [].append(1) return?
>>
>>
> Again, apologies from a Python beginner. It sure seems like one has to
> do gymnastics to get good behavior out of the core-python:
>
> Here's my proposed fix:
>
> m['key'] = (lambda x: x.append(1) or x)(m.get('key',[]))
>
> Yuck!

Yuk is right. What's wrong with the simple, straightforward solution?

L = m.get('key', [])
L.append(1)
m['key'] = L


Not everything needs to be a one-liner. But if you insist on making it a
one-liner, that's what setdefault and defaultdict are for.



> So I guess I'll use defaultdict with upcasts to dict as needed.

You keep using that term "upcast". I have no idea what you think it
means, so I have no idea whether or not Python does it. Perhaps you
should explain what you think "upcasting" is.


> On a side note: does up-casting always work that way with shared
> (common) data from derived to base? (I mean if the data is part of
> base's interface, will b = base(child) yield a new base object that
> shares data with the child?)

Of course not. It depends on the implementation of the class.
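
As an illustration only, here are two hypothetical base classes (none of these names come from a real library) whose constructors make opposite choices. Which behaviour you get from base(child) is decided entirely by how the class is written.

```python
class SharingBase(object):
    def __init__(self, other=None):
        # Hypothetical design: reuse the other object's list.
        self.items = other.items if other is not None else []

class CopyingBase(object):
    def __init__(self, other=None):
        # Hypothetical design: take a copy of the other object's list.
        self.items = list(other.items) if other is not None else []

class SharingChild(SharingBase):
    pass

class CopyingChild(CopyingBase):
    pass

sc = SharingChild()
sc.items.append(1)
sb = SharingBase(sc)     # shares sc's list
sc.items.append(2)
print(sb.items)          # [1, 2]

cc = CopyingChild()
cc.items.append(1)
cb = CopyingBase(cc)     # has its own list
cc.items.append(2)
print(cb.items)          # [1]
```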


--
Steven