From: John Posner on
On 7/31/2010 1:31 PM, John Posner wrote:
>
> Caveat -- there's another description of defaultdict here:
>
> http://docs.python.org/library/collections.html#collections.defaultdict
>
> ... and it's bogus. This other description claims that __missing__ is a
> method of defaultdict, not of dict.

Following is a possible replacement for the bogus description. Comments
welcome. I intend to submit a Python doc bug, and I'd like to have a
clean alternative to propose.

--------------

class collections.defaultdict([default_factory[, ...]])

defaultdict is a dict subclass that can guarantee success on key
lookups: if a key does not currently exist in a defaultdict object, a
"default value factory" is called to provide a value for that key. The
"default value factory" is a callable object (typically, a function)
that takes no arguments. You specify this callable as the first argument
to defaultdict(). Additional defaultdict() arguments are the same as for
dict().

The "default value factory" callable is stored as an attribute,
default_factory, of the newly created defaultdict object. If you call
defaultdict() with no arguments, or with None as the first argument, the
default_factory attribute is set to None. You can reassign the
default_factory attribute of an existing defaultdict object to another
callable, or to None.
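
For illustration, here is a minimal sketch of how the default_factory attribute behaves, assuming Python 3 and the standard collections module. The variable names are made up for this example.

```python
from collections import defaultdict

# No arguments, or None as the first argument: default_factory is None.
d1 = defaultdict()
d2 = defaultdict(None, {"a": 1})
print(d1.default_factory)  # None
print(d2.default_factory)  # None

# A callable passed first is stored as-is; the remaining arguments are
# handled the way dict() would handle them.
d3 = defaultdict(list, {"a": [1]}, b=[2])
print(d3.default_factory is list)  # True

# The attribute can be reassigned on an existing object.
d3.default_factory = int
print(d3["missing"])  # 0
d3.default_factory = None
```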

When a lookup of a non-existent key is performed in a defaultdict
object, its default_factory attribute is evaluated, and the resulting
object is called:

* If the call produces a value, that value is returned as the result of
the lookup. In addition, the key-value pair is inserted into the
defaultdict.

* If the call raises an exception, it is propagated unchanged.

* If the default_factory attribute evaluates to None, a KeyError
exception is raised, with the non-existent key as its argument. (The
defaultdict behaves exactly like a standard dict in this case.)
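
The three cases above can be sketched as follows. This is only an illustration, assuming Python 3; failing_factory is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Case 1: the factory returns a value; it is returned and also stored.
counts = defaultdict(int)
print(counts["x"])    # 0
print("x" in counts)  # True

# Case 2: the factory raises; the exception propagates unchanged and
# no key is inserted.
def failing_factory():  # hypothetical helper for this example
    raise RuntimeError("no default available")

d = defaultdict(failing_factory)
try:
    d["y"]
except RuntimeError as exc:
    print("propagated:", exc)
print("y" in d)       # False

# Case 3: default_factory is None; KeyError carries the missing key.
plain = defaultdict(None)
try:
    plain["z"]
except KeyError as exc:
    print(exc.args)   # ('z',)
```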

From: Ethan Furman on
John Posner wrote:
> On 7/31/2010 1:31 PM, John Posner wrote:
>>
>> Caveat -- there's another description of defaultdict here:
>>
>> http://docs.python.org/library/collections.html#collections.defaultdict
>>
>> ... and it's bogus. This other description claims that __missing__ is a
>> method of defaultdict, not of dict.
>
> Following is a possible replacement for the bogus description. Comments
> welcome. I intend to submit a Python doc bug, and I'd like to have a
> clean alternative to propose.
>
> --------------
>
> class collections.defaultdict([default_factory[, ...]])
>
> defaultdict is a dict subclass that can guarantee success on key
> lookups: if a key does not currently exist in a defaultdict object, a
> "default value factory" is called to provide a value for that key. The
> "default value factory" is a callable object (typically, a function)
> that takes no arguments. You specify this callable as the first argument
> to defaultdict(). Additional defaultdict() arguments are the same as for
> dict().
>
> The "default value factory" callable is stored as an attribute,
> default_factory, of the newly created defaultdict object. If you call
> defaultdict() with no arguments, or with None as the first argument, the
> default_factory attribute is set to None. You can reassign the
> default_factory attribute of an existing defaultdict object to another
> callable, or to None.
>
> When a lookup of a non-existent key is performed in a defaultdict
> object, its default_factory attribute is evaluated, and the resulting
> object is called:
>
> * If the call produces a value, that value is returned as the result of
> the lookup. In addition, the key-value pair is inserted into the
> defaultdict.
>
> * If the call raises an exception, it is propagated unchanged.
>
> * If the default_factory attribute evaluates to None, a KeyError
> exception is raised, with the non-existent key as its argument. (The
> defaultdict behaves exactly like a standard dict in this case.)

I think mentioning how __missing__ plays into all this would be helpful.
Perhaps in the first paragraph, after the colon:

if a key does not currently exist in a defaultdict object, __missing__
will be called with that key, which in turn will call a "default value
factory" to provide a value for that key.
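
A quick way to watch that chain, as a sketch assuming Python 3: only subscript lookup goes through __missing__, while get() and membership tests do not.

```python
from collections import defaultdict

d = defaultdict(list)

# get() and "in" do not call __missing__, so nothing is created.
print(d.get("a"))  # None
print("a" in d)    # False
print(len(d))      # 0

# Subscript lookup calls __missing__, which calls the factory and
# stores the result.
print(d["a"])      # []
print(len(d))      # 1

# __missing__ can also be called directly, with the same effect.
print(d.__missing__("b"))  # []
print(sorted(d))           # ['a', 'b']
```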

~Ethan~
From: John Posner on
On 8/3/2010 12:54 PM, Ethan Furman wrote:

<snip>

> I think mentioning how __missing__ plays into all this would be helpful.
> Perhaps in the first paragraph, after the colon:
>
> if a key does not currently exist in a defaultdict object, __missing__
> will be called with that key, which in turn will call a "default value
> factory" to provide a value for that key.

Thanks, Ethan. As I said (or at least implied) to Christian earlier in
this thread, I don't want to repeat the mistake of the current
description: confusing the functionality provided *by* the defaultdict
class with underlying functionality (the dict type's __missing__
protocol) that is used in the definition of the class.

So I'd rather not mention __missing__ in the first paragraph, which
describes the functionality provided *by* the defaultdict class. How
about adding this para at the end:

defaultdict is defined using functionality that is available to *any*
subclass of dict: a missing-key lookup automatically causes the
subclass's __missing__ method to be called, with the non-existent key
as its argument. The method's return value becomes the result of the
lookup.
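
As an illustration of that paragraph, here is a small sketch (Python 3 assumed). Echo is a hypothetical subclass invented for this example.

```python
from collections import defaultdict

class Echo(dict):
    """Hypothetical dict subclass: a missing key maps to itself."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when key is absent. The return
        # value becomes the result of the lookup; nothing is stored
        # unless this method stores it.
        return key

e = Echo(a=1)
print(e["a"])    # 1
print(e["b"])    # b
print("b" in e)  # False

# dict itself defines no __missing__; defaultdict supplies one.
print(hasattr(dict, "__missing__"))         # False
print(hasattr(defaultdict, "__missing__"))  # True
```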

BTW, I couldn't *find* the coding of defaultdict in the Python 2.6
library. File collections.py contains this code:

from _abcoll import *
import _abcoll
__all__ += _abcoll.__all__

from _collections import deque, defaultdict

... but I ran into a dead end after that. :-( I believe that the
following *could be* the definition of defaultdict:

class defaultdict(dict):
    def __init__(self, factory, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = factory

    def __missing__(self, key):
        """provide value for missing key"""
        value = self.default_factory()  # call factory with no args
        self[key] = value
        return value

-John
From: Christian Heimes on
> So I'd rather not mention __missing__ in the first paragraph, which
> describes the functionality provided *by* the defaultdict class. How
> about adding this para at the end:
>
> defaultdict is defined using functionality that is available to *any*
> subclass of dict: a missing-key lookup automatically causes the
> subclass's __missing__ method to be called, with the non-existent key
> as its argument. The method's return value becomes the result of the
> lookup.

Your proposal sounds like a good idea.

By the way, do you have a CS degree? Your wording sounds like you are
used to writing theses at a CS degree level. No offense. ;)

> BTW, I couldn't *find* the coding of defaultdict in the Python 2.6
> library. File collections.py contains this code:
>
> from _abcoll import *
> import _abcoll
> __all__ += _abcoll.__all__
>
> from _collections import deque, defaultdict

defaultdict is implemented in C. You can read the source code at
http://svn.python.org/view/python/trunk/Modules/_collectionsmodule.c?revision=81029&view=markup
(search for "defaultdict type"). The C code isn't complicated. You
should understand the concept even if you are not familiar with the C
API of Python.

> class defaultdict(dict):
>     def __init__(self, factory, *args, **kwargs):
>         dict.__init__(self, *args, **kwargs)
>         self.default_factory = factory
>
>     def __missing__(self, key):
>         """provide value for missing key"""
>         value = self.default_factory()  # call factory with no args
>         self[key] = value
>         return value

The type also implements __repr__(), copy() and __reduce__(). The latter
is used by the pickle protocol. Without a new __reduce__ method, the
default_factory would not survive a pickle/unpickle cycle. For a pure
Python implementation you'd also have to add
__slots__ = "default_factory". Otherwise every defaultdict instance
would gain an unnecessary __dict__ attribute.
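
Putting those pieces together, a pure-Python approximation might look like the sketch below. It is not the C implementation: the class name PyDefaultDict, the exact repr format and the error message are assumptions made for illustration, and it targets Python 3.

```python
class PyDefaultDict(dict):
    """Hypothetical pure-Python approximation of defaultdict."""

    # Avoid a per-instance __dict__; default_factory lives in a slot.
    __slots__ = ("default_factory",)

    def __init__(self, default_factory=None, *args, **kwargs):
        if default_factory is not None and not callable(default_factory):
            raise TypeError("first argument must be callable or None")
        dict.__init__(self, *args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)          # behave like a plain dict
        value = self.default_factory()   # call factory with no args
        self[key] = value
        return value

    def __repr__(self):
        return "%s(%r, %s)" % (type(self).__name__,
                               self.default_factory,
                               dict.__repr__(self))

    def copy(self):
        # Shallow copy that keeps the factory.
        return type(self)(self.default_factory, self)

    def __reduce__(self):
        # (callable, args, state, list items, dict items): the factory
        # travels in args, the contents travel as dict items.
        return (type(self), (self.default_factory,),
                None, None, iter(self.items()))


d = PyDefaultDict(list, a=[1])
d["b"].append(2)
print(d)                       # PyDefaultDict(<class 'list'>, {'a': [1], 'b': [2]})
print(hasattr(d, "__dict__"))  # False

# Rebuild by hand from __reduce__(), roughly the way pickle would.
func, args, _state, _listitems, dictitems = d.__reduce__()
rebuilt = func(*args)
for key, value in dictitems:
    rebuilt[key] = value
print(rebuilt.default_factory is list)  # True
```

The by-hand rebuild is used here only to show that the factory is carried along by __reduce__; pickle itself follows the same tuple protocol.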

Christian

From: John Posner on
On 8/3/2010 5:47 PM, Christian Heimes wrote:
>> So I'd rather not mention __missing__ in the first paragraph, which
>> describes the functionality provided *by* the defaultdict class. How
>> about adding this para at the end:
>>
>> defaultdict is defined using functionality that is available to *any*
>> subclass of dict: a missing-key lookup automatically causes the
>> subclass's __missing__ method to be called, with the non-existent key
>> as its argument. The method's return value becomes the result of the
>> lookup.
>
> Your proposal sounds like a good idea.

Tx.

> By the way, do you have a CS degree? Your wording sounds like you are
> used to writing theses at a CS degree level. No offense. ;)

No CS degree (coulda, woulda, shoulda). I think what you're hearing is
30+ years of tech writing for computer software companies.

-John