From: Benjamin Kaplan on
On Wed, Mar 24, 2010 at 12:17 PM, <python(a)bdurham.com> wrote:
> Is there a way to programmatically discover the encoding types supported by
> the codecs module?
>
> For example, the following link shows a table with Codec, Aliases, and
> Language columns.
> http://docs.python.org/library/codecs.html#standard-encodings
>
> I'm looking for a way to programmatically generate this table through some
> form of module introspection.
>
> Ideas?
>
> Malcolm
> --

According to my brief messing around with the REPL,
encodings.aliases.aliases is a good place to start. I don't know of
any way to get the Language column, but at the very least that will
give you most of the supported encodings and any aliases they have.
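
For anyone following along, here is a minimal sketch of that idea, assuming a current Python 3 interpreter. The helper name `aliases_by_codec` is made up for illustration; only `encodings.aliases.aliases` comes from the standard library. It inverts the alias dictionary so each codec is listed with its aliases, which roughly reproduces the Codec and Aliases columns (not the Language column).

```python
from collections import defaultdict
import encodings.aliases


def aliases_by_codec():
    """Group alias names under the codec module name they map to."""
    grouped = defaultdict(list)
    # encodings.aliases.aliases maps alias name -> codec module name.
    for alias, codec in encodings.aliases.aliases.items():
        grouped[codec].append(alias)
    return {codec: sorted(names) for codec, names in grouped.items()}


if __name__ == "__main__":
    # Print one row per codec: module name, then its known aliases.
    for codec, names in sorted(aliases_by_codec().items()):
        print("%-20s %s" % (codec, ", ".join(names)))
```

Codecs that have no alias at all will not show up in this table, so it is a starting point rather than a complete list.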
From: python on
Gabriel,

> After looking at how things are done in codecs.c and encodings/__init__.py I think you should enumerate all modules in the encodings package that define a getregentry function. Aliases come from encodings.aliases.aliases.

Thanks for looking into this for me. Benjamin Kaplan made a similar
observation. My reply to him included the snippet of code we're using to
generate the actual list of encodings that our software will support
(thanks to Python's codecs and encodings modules).
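
The snippet mentioned above is not reproduced in this thread. As a rough stand-in only, one way to build such a list is to keep just the candidate names that `codecs.lookup()` can resolve on the running interpreter. The function name `usable_encodings` and the choice of candidates are assumptions made for illustration.

```python
import codecs
import encodings.aliases


def usable_encodings(candidates):
    """Keep only the names that codecs.lookup() can resolve here."""
    usable = []
    for name in sorted(set(candidates)):
        try:
            codecs.lookup(name)
        except LookupError:
            # Unknown name, or a codec not available on this platform.
            continue
        usable.append(name)
    return usable


# Assumption: use the alias targets as the list of candidate codec names.
candidates = set(encodings.aliases.aliases.values())
supported = usable_encodings(candidates)
```

Filtering through `codecs.lookup()` has the advantage that alias targets without a working codec on the current platform drop out of the list.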

Your help is always appreciated :)

Regards,
Malcolm


----- Original message -----
From: "Gabriel Genellina" <gagsl-py2(a)yahoo.com.ar>
To: python-list(a)python.org
Date: Wed, 24 Mar 2010 14:39:20 -0300
Subject: Re: Programmatically discovering encoding types supported by
codecs module

En Wed, 24 Mar 2010 13:17:16 -0300, <python(a)bdurham.com> escribió:

> Is there a way to programmatically discover the encoding types
> supported by the codecs module?
>
> For example, the following link shows a table with Codec,
> Aliases, and Language columns.
> http://docs.python.org/library/codecs.html#standard-encodings
>
> I'm looking for a way to programmatically generate this table
> through some form of module introspection.


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list

From: python on
Gabriel,

Thank you for your analysis - very interesting. Enjoyed your fromlist
choice of names. I'm still in my honeymoon phase with Python so I only
know the first part :)

Regards,
Malcolm


----- Original message -----
From: "Gabriel Genellina" <gagsl-py2(a)yahoo.com.ar>
To: python-list(a)python.org
Date: Wed, 24 Mar 2010 19:50:11 -0300
Subject: Re: Programmatically discovering encoding types supported by
codecs module

En Wed, 24 Mar 2010 14:58:47 -0300, <python(a)bdurham.com> escribió:

>> After looking at how things are done in codecs.c and
>> encodings/__init__.py I think you should enumerate all modules in the
>> encodings package that define a getregentry function. Aliases come from
>> encodings.aliases.aliases.
>
> Thanks for looking into this for me. Benjamin Kaplan made a similar
> observation. My reply to him included the snippet of code we're using to
> generate the actual list of encodings that our software will support
> (thanks to Python's codecs and encodings modules).

I was curious as to whether both methods would give the same results:

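py> import glob, os, encodings, encodings.aliases
py> modules = set()
py> for name in glob.glob(os.path.join(encodings.__path__[0], "*.py")):
...     # strip the directory and the ".py" suffix to get the module name
...     name = os.path.basename(name)[:-3]
...     try:
...         mod = __import__("encodings." + name,
...                          fromlist=['ilovepythonbutsometimesihateit'])
...     except ImportError:
...         continue
...     # only real codec modules define getregentry()
...     if hasattr(mod, 'getregentry'):
...         modules.add(name)
...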
py> fromalias = set(encodings.aliases.aliases.values())
py> fromalias - modules
set(['tactis'])
py> modules - fromalias
set(['charmap',
'cp1006',
'cp737',
'cp856',
'cp874',
'cp875',
'idna',
'iso8859_1',
'koi8_u',
'mac_arabic',
'mac_centeuro',
'mac_croatian',
'mac_farsi',
'mac_romanian',
'palmos',
'punycode',
'raw_unicode_escape',
'string_escape',
'undefined',
'unicode_escape',
'unicode_internal',
'utf_8_sig'])

So the alias table points to a 'tactis' codec that has no module (?), and
about twenty codec modules have no alias at all.
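
The same comparison can be sketched for Python 3 with `pkgutil` and `importlib` instead of `glob` and `__import__`. This is an illustrative variant, not the session above: the function name `codec_modules` is invented here, and the resulting sets will differ between Python versions and platforms.

```python
import importlib
import pkgutil
import encodings
import encodings.aliases


def codec_modules():
    """Return names of encodings submodules that define getregentry()."""
    found = set()
    for info in pkgutil.iter_modules(encodings.__path__):
        try:
            mod = importlib.import_module("encodings." + info.name)
        except ImportError:
            # Some codecs are platform-specific and fail to import elsewhere.
            continue
        if hasattr(mod, "getregentry"):
            found.add(info.name)
    return found


modules = codec_modules()
from_alias = set(encodings.aliases.aliases.values())
print("alias targets without a module:", sorted(from_alias - modules))
print("modules without an alias:", sorted(modules - from_alias))
```

Taking the union of both sets is probably the safest basis for a full table, since neither source is complete on its own.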

--
Gabriel Genellina
