From: Joshua Bronson on
On Nov 19, 9:17 pm, Carl Banks <pavlovevide...(a)gmail.com> wrote:
> Apart from the GPL

what Ben said :)

> it seems perfectly fine to release, and looks like
> an interesting strategy. I've wanted one of those once in a while,
> never enough to bother looking for one or writing one myself.

glad to hear it! i'll release it to pypi if such feedback continues.

> But you should absolutely not inherit from dict if you're overriding
> all it's methods. It's useless and wasteful to do that, perhaps
> dangerous. You end up using bytes for a small hash table that's never
> used.
>
> Plus Python 3 has a notion of Abstract Base Classes: it will allow
> customization of isinstance to advertise that your class implements
> MutableMapping, which is the right way to do it.

Actually that's what I was originally thinking of doing but didn't go
through with it in my first pass out of concern that users might want
isinstance(bijection(), dict) to be True. Now that you bring it up, I
agree that it's the correct way to do it, and have reimplemented
bijection as a MutableMapping (ABCs are actually in Python 2.6). Take
a peek at the new and improved http://bitbucket.org/jab/toys/src/tip/bijection.py
if you get a chance and let me know how it looks!

Anyone have any other feedback? For instance, is offering the __call__
syntax for the inverse mapping wonderful or terrible, or maybe both?

Thanks,
Josh
From: Terry Reedy on
Joshua Bronson wrote:

> Anyone have any other feedback? For instance, is offering the __call__
> syntax for the inverse mapping wonderful or terrible, or maybe both?

Terrible ;-)

Use standard subscripting with slices, and only that, to both get and set.

Let m[4] == m[4:] == 'abc' # m[4:] is suggested alternative addition
Then m[:'abc'] == 4

m[4:] passes slice(4,None,None) to __getitem__
m[:'abc'] passes slice(None,'abc',None)

It just happens that dict items and slices use the same notation, but
they do, so take advantage of that. In fact, to emphasize the symmetry
of the bijective map, consider disallowing m[key] as ambiguous and
require m[key:], along with m[:key] to access and set.

Note that m[slice(1,2,3):] passes slice(slice(1, 2, 3), None, None), so
this approach does not even prevent using slices as keys/values.

In __setitem__, m[a:b] which passes slice(a,b,None) would have to be an
error. In __getitem__, it could either be a error or return True/False
depending on whether the pair is in the map. But this depends on whether
__contains__ only tests keys or is modified to test pairs.

Terry Jan Reedy


From: Joshua Bronson on
On Nov 20, 3:09 pm, Terry Reedy <tjreedy(a)udel.edu> wrote:
> Joshua Bronson wrote:
> > Anyone have any other feedback? For instance, is offering the __call__
> > syntax for the inverse mapping wonderful or terrible, or maybe both?
>
> Terrible ;-)
>
> Use standard subscripting with slices, and only that, to both get and set..
>
> Let m[4] == m[4:] == 'abc' # m[4:] is suggested alternative addition
> Then m[:'abc'] == 4
>
> m[4:] passes slice(4,None,None) to __getitem__
> m[:'abc'] passes slice(None,'abc',None)
>
> It just happens that dict items and slices use the same notation, but
> they do, so take advantage of that. In fact, to emphasize the symmetry
> of the bijective map, consider disallowing m[key] as ambiguous and
> require m[key:], along with m[:key] to access and set.
>
> Note that m[slice(1,2,3):] passes slice(slice(1, 2, 3), None, None), so
> this approach does not even prevent using slices as keys/values.
>
> In __setitem__, m[a:b] which passes slice(a,b,None) would have to be an
> error. In __getitem__, it could either be a error or return True/False
> depending on whether the pair is in the map. But this depends on whether
> __contains__ only tests keys or is modified to test pairs.
>
> Terry Jan Reedy

absolutely genius. implemented in the latest version:
http://bitbucket.org/jab/toys/src/tip/bijection.py

thank you for the terrific idea!
From: Joshua Bronson on
On Nov 20, 3:09 pm, Terry Reedy <tjreedy(a)udel.edu> wrote:
> Use standard subscripting with slices, and only that, to both get and set..

i did this for __delitem__ too, so you can do e.g. del m[:'abc'].

> In fact, to emphasize the symmetry of the bijective map, consider
> disallowing m[key] as ambiguous and require m[key:]

my initial feeling is that i'd prefer to be able to say m[key], the
defense being that it's not ambiguous if it's documented, and users
who don't like it don't have to use it, while users who do like it
won't be alienated. but i am definitely still open to this.

> Note that m[slice(1,2,3):] passes slice(slice(1, 2, 3), None, None), so
> this approach does not even prevent using slices as keys/values.

except slices aren't hashable.</nitpick> but props for even thinking
of that!

> In __setitem__, m[a:b] which passes slice(a,b,None) would have to be an
> error. In __getitem__, it could either be a error or return True/False
> depending on whether the pair is in the map.

i went with raising an error for __getitem__(slice(a,b,None)).
returning True/False for this usage based on whether a -> b is in the
bijection is an interesting idea, but i feel like it complicates
things for no good reason: if that's what you wanted to know you'd
just ask whether m[a] == b.

> But this depends on whether __contains__ only tests keys or is modified to test pairs.

i have __contains__ only testing keys, and similarly [i for i in
bijection] only gives you the keys, again on the theory that deviating
too much from the dict api increases the learning (adoption) curve, so
i think we should only do it if it buys us a tremendous usability win.

thanks again for your insightful input, i'm pretty psyched about how
this is coming along!

any further feedback is always welcome.

josh
From: Raymond Hettinger on
On Nov 19, 3:24 pm, Joshua Bronson <jabron...(a)gmail.com> wrote:
> I couldn't find a library providing a bijective map data structure
> (allowing for constant-time lookups by value) in the few minutes I
> looked, so I took a few more minutes to code one up:http://bitbucket.org/jab/toys/src/tip/bijection.py
>
> Is this at all worth releasing? Comments and suggestions welcome.
>
> Josh

Hello Joshua,

I have a few design ideas and comments for you.

* The idea of using __call__ for looking-up inverse values was
inspired. That is useable, clean, and easy to remember; however, as
discussed below, there are issues though with its actual use in real
code.

* Am not excited by the inverse iterators. With just a regular
mapping you can write:

for a, b in m.items() ... # consider either a or b be the
key and the other to be the value

That meets all of the needs that would have been served by
iter_inverse_keys() or iter_inverse_values() or whatnot. The mirrored
API doesn't really provide much in the way of value added.

* After exercising the API on a couple of samples, I'm worried that
almost any inverse-method based API makes it difficult to review code
while keeping straight the intended meaning of the forward and inverse
relationships. Am thinking that it is simpler, faster, and clearer to
just use two dictionaries -- that approach lets the variable names
communicate the important info. For example, the following code helps
keep the coder and code reviewer from conflating the forward and
inverse directions:

myurl = ip2url[myip]
myip = url2ip[myurl]

Contrast that with:

myurl = site_bijection[myip]
myip = site_bijection(myurl)

With the latter, it is darned difficult to detect accidental
conflation of brackets with parentheses.

* So, I'm thinking that code needing a bijection would be better-off
with two ordinary dicts, perhaps augmented by a couple of convenience
functions:

biject_add(site_bijection, ip=myip, url=myurl) # Create a
new pairing, raise ValueError if either key
# maps to
more than one value (in violation of the
# bijection
invariant: one-to-one and onto)

biject_remove(ip=myip) # Clear an
entry from both dicts
or
biject_remove(url=myurl)

Alternatively, another possible approach is to used use the class
generation approach (such as that used by named_tuple()) to generate a
custom bijection class with either attribute based or keyworded
accessors:

Attribute based accessors:

site = Bijection('ip', 'url')
site.url[myip] = myurl

for ip, url in site.items() ...
print site.ip[myurl]
myurl = site.url.pop(myip)

Keyword accessors:

site = Bijection('ip', 'url')
site.set(ip=myip, url=myurl)
myurl = site.get(ip=myip)
myip = set.get(url=myurl)
myurl = site.pop(ip=myip)
site.del(ip=myip)
site.del(url=myurl)


Hope these ideas help. The ultimate success of the Bijection code
will depend on its clarity, simplicity, and speed. Experiment with
various approaches to find-out which looks the best in real code. It
cannot be error-prone or it is doomed. Also, it should not introduce
much overhead processing or else people will avoid it. The API should
be trivially simple so that people remember how to use it months after
seeing it for the first time.

Good luck and happy hunting,



Raymond