From: geremy condra on
On Thu, Dec 3, 2009 at 7:04 AM, M.-A. Lemburg <mal(a)egenix.com> wrote:
> Raymond Hettinger wrote:
>> [Joshua Bronson]
>>> Raymond, do you think there might be any future in including a built-
>>> in bidict data structure in Python?
>>
>> I don't think so.  There are several forces working against it:
>>
>> * the recipe is new, so it hasn't had a chance to mature
>>   or to gain a fan club.
>>
>> * there are many approaches to the solving the problem and
>>   there is no reason to assume this one is the best.
>>
>> * it extends the language with arcane syntax tricks instead of
>>   using the language as designed by Guido.  That makes it harder
>>   to learn and remember.
>>
>> * we've already got one (actually two).  The two dictionary approach
>>   uses plain python, requires no new learning, and is more flexible.
>>   Also, sqlite3 provides another way to use multiple lookups to a
>>   single record.  The database approach is much more general
>>   (extending to trijections, allowing multiple sort orders,
>>   providing persistence, etc).
>>
>> * the semantics of a bijection aren't obvious:
>>
>>      b['x'] = 'ex'      # first record:  ('x', 'ex')
>>      b['y'] = 'why'     # second record: ('y', 'why')
>>      b[:'why'] = 'x'    # do two records collapse into one? is there
>> an error?
>>
>> * the proposed syntax doesn't address the issue covered in my previous
>> post.
>>   Since bijections are symmetrical, they do not have an obvious
>> direction
>>   (which is the primary key, the husband or the wife?).  The syntax
>> needs to
>>   allow user names to make it clear which is being accessed:
>>
>>      marriages.h2w['john'] = 'amy'
>>      marriages.w2h['amy'] = 'john'
>>
>>   Contrast this with:
>>
>>      marriages['jordan'] = 'taylor'    # are you sure you got the
>> order correct?
>>      marriages[:'taylor'] = 'jordan'   # this is easy to get backwards
>
> I think the only major CS data type missing from Python is some
> form of (fast) directed graph implementation à la kjGraph:
>
>    http://gadfly.sourceforge.net/kjbuckets.html
>
> With these, you can easily build all sorts of relations between
> objects and apply fast operations on them. In fact, it should then
> be possible to build a complete relational database in Python
> (along the lines of Gadfly).

If you're in the market for a Python graph library, you may want
to check out Graphine- I'm obviously biased (I wrote most of it)
but it has a few more bells and whistles than kjbuckets, and is
pretty darned easy to use. It also supports undirected and
bridge graphs.

Geremy Condra
From: M.-A. Lemburg on
geremy condra wrote:
> On Thu, Dec 3, 2009 at 7:04 AM, M.-A. Lemburg <mal(a)egenix.com> wrote:
>> I think the only major CS data type missing from Python is some
>> form of (fast) directed graph implementation � la kjGraph:
>>
>> http://gadfly.sourceforge.net/kjbuckets.html
>>
>> With these, you can easily build all sorts of relations between
>> objects and apply fast operations on them. In fact, it should then
>> be possible to build a complete relational database in Python
>> (along the lines of Gadfly).
>
> If you're in the market for a Python graph library, you may want
> to check out Graphine- I'm obviously biased (I wrote most of it)
> but it has a few more bells and whistles than kjbuckets, and is
> pretty darned easy to use. It also supports undirected and
> bridge graphs.

Thanks for the hint :-)

The lib looks nice and would probably serve as a good prototype
for writing a new built-in type for Python.

This would have to be written in C, though, and come under
a Python compatible license. With the built-in feature moratorium
currently in place, there's about 1.5-2 years time to get this
done; perhaps a good GSoC project for next year :-)

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Dec 03 2009)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
From: Francis Carr on
[In re R. Hettinger's critiques]

> * it extends the language with arcane syntax tricks...
I think most of these in the current version of J. Bronson's "bidict"
can be left unused, or removed altogether. In almost all cases, a
bidict should be accessed as an ordinary python dict.


> * we've already got one (actually two).
> The two dictionary approach...
Solutions such as bidict just automate the two-dict approach.

>   ...sqlite3 provides another way...
In many many cases, using a dB (even a lightweight such as sqlite3) is
swatting the fly with a sledgehammer :-)


> Since bijections are symmetrical, they do not have an obvious
> direction (which is the primary key, the husband or the wife?).
I think this is easy to solve with a classmethod constructor that
produces a pair of "linked" dicts. For example,
husband2wife, wife2husband = bidict.makepair(('Fred', 'John'),
('Mary', 'Ruth'))
Obviously from the code this pair of bidicts are "linked", and the
direction of each mapping is obvious from its name. Metaprogramming
idioms like namedtuple are not required.


> * the semantics of a bijection aren't obvious:
>      b['x'] = 'ex'      # first record:  ('x', 'ex')
>      b['y'] = 'why'     # second record: ('y', 'why')
>      b[:'why'] = 'x'    # do two records collapse into one?
> # is there an error?
Among the various critiques, I think this is the most significant.

When I was fiddling with my implementation, I was disturbed that the
code
bidict[newKey] = oldValue
should have the subtle side-effect
del bidict.inv[oldValue]

And there is a much stranger case. Suppose bidict[key1]=val1 and
bidict[key2]=val2. Then the code
bidict[key1] = val2
should have the extremely subtle side-effects
del bidict[key2] # because val2 has been re-mapped
del bidict.inv[val1] # because key1 has been re-mapped
Blech!

I think there must be something better. It would be interesting to
hear more opinions on the matter.

I propose raising ValueError when operating on one key would also
silently re-map or delete a different (key,value) pair. This would
disallow both of the above cases. To get the effect of the first
case, one would simply operate on the inverse mapping:
bidict.inv[oldValue] = newKey
This should not be confusing: it's exactly how a python dict would
operate, except the "linked" mapping is altered to match, which is the
bookkeeping we want to automate in the first place. To get the effect
of the second case, one would have to explicitly demand the side-
effects:
del bidict[key2]
del bidict.inv[val1]
bidict[key1] = val2
Also not confusing.


-- FC
From: geremy condra on
On Thu, Dec 3, 2009 at 12:57 PM, M.-A. Lemburg <mal(a)egenix.com> wrote:
> geremy condra wrote:
>> On Thu, Dec 3, 2009 at 7:04 AM, M.-A. Lemburg <mal(a)egenix.com> wrote:
>>> I think the only major CS data type missing from Python is some
>>> form of (fast) directed graph implementation à la kjGraph:
>>>
>>>    http://gadfly.sourceforge.net/kjbuckets.html
>>>
>>> With these, you can easily build all sorts of relations between
>>> objects and apply fast operations on them. In fact, it should then
>>> be possible to build a complete relational database in Python
>>> (along the lines of Gadfly).
>>
>> If you're in the market for a Python graph library, you may want
>> to check out Graphine- I'm obviously biased (I wrote most of it)
>> but it has a few more bells and whistles than kjbuckets, and is
>> pretty darned easy to use. It also supports undirected and
>> bridge graphs.
>
> Thanks for the hint :-)
>
> The lib looks nice and would probably serve as a good prototype
> for writing a new built-in type for Python.

I suspect that it would have a better chance at getting into
collections than becoming a builtin, but who knows. I'd just
like to have something like it in the standard library.

> This would have to be written in C, though,

That's currently in the works, along with database backing.
We'd welcome any help though... hint, hint...

> and come under a Python compatible license.

I'm willing to dual license under the Python license if
there were a substantial interest in doing so, and I'm
confident that the other authors and maintainers
would feel the same way. The question in my mind is
whether such an interest exists.

> With the built-in feature moratorium
> currently in place, there's about 1.5-2 years time to get this
> done; perhaps a good GSoC project for next year :-)

I'd love to have Graphine be a GSoC project, although
if the target were to get it into collections the
moratorium wouldn't change the timeline AFAICS.

Geremy Condra
From: M.-A. Lemburg on
geremy condra wrote:
> On Thu, Dec 3, 2009 at 12:57 PM, M.-A. Lemburg <mal(a)egenix.com> wrote:
>> geremy condra wrote:
>>> On Thu, Dec 3, 2009 at 7:04 AM, M.-A. Lemburg <mal(a)egenix.com> wrote:
>>>> I think the only major CS data type missing from Python is some
>>>> form of (fast) directed graph implementation � la kjGraph:
>>>>
>>>> http://gadfly.sourceforge.net/kjbuckets.html
>>>>
>>>> With these, you can easily build all sorts of relations between
>>>> objects and apply fast operations on them. In fact, it should then
>>>> be possible to build a complete relational database in Python
>>>> (along the lines of Gadfly).
>>>
>>> If you're in the market for a Python graph library, you may want
>>> to check out Graphine- I'm obviously biased (I wrote most of it)
>>> but it has a few more bells and whistles than kjbuckets, and is
>>> pretty darned easy to use. It also supports undirected and
>>> bridge graphs.
>>
>> Thanks for the hint :-)
>>
>> The lib looks nice and would probably serve as a good prototype
>> for writing a new built-in type for Python.
>
> I suspect that it would have a better chance at getting into
> collections than becoming a builtin, but who knows. I'd just
> like to have something like it in the standard library.

Integrating an easy-to-use graph library into the collections
module (and it's C companion) is good idea.

>> This would have to be written in C, though,
>
> That's currently in the works, along with database backing.
> We'd welcome any help though... hint, hint...
>
>> and come under a Python compatible license.
>
> I'm willing to dual license under the Python license if
> there were a substantial interest in doing so, and I'm
> confident that the other authors and maintainers
> would feel the same way.

Great !

> The question in my mind is whether such an interest exists.

Since Python is being used more and more in CS classes,
such an addition would complete the tool-set and make Python
even more attractive for undergrad CS courses.

Finding out how much interest exists in advance is always
a bit difficult with new data-structures. People have to
get a feeling of how they can be put to good use first,
so it's a chicken-and-egg problem.

We've seen the same thing happen with sets. They were first
made available via a separate module and then became built-ins
after people realized how useful they are in practice.

With graphs, it's probably going to take a little longer
before people realize their usefulness - graph theory is
certainly a lot more complicated than set theory :-)

>> With the built-in feature moratorium
>> currently in place, there's about 1.5-2 years time to get this
>> done; perhaps a good GSoC project for next year :-)
>
> I'd love to have Graphine be a GSoC project, although
> if the target were to get it into collections the
> moratorium wouldn't change the timeline AFAICS.

True.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Dec 04 2009)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/