From: Wells on
I get this exception when decoding a certain JSON string:

'ascii' codec can't encode character u'\u2019' in position 8: ordinal
not in range(128)

The JSON data in question:

http://mlb.com/lookup/json/named.player_info.bam?sport_code=%27mlb%27&player_id=%27489002%27

It's in the 'high_school' key. Is there some string function I can run
on the information before I decode it to avoid this?

Thanks!
From: Chris Rebert on
On Tue, Dec 15, 2009 at 2:03 PM, Wells <thewellsoliver(a)gmail.com> wrote:
> I get this exception when decoding a certain JSON string:
>
> 'ascii' codec can't encode character u'\u2019' in position 8: ordinal
> not in range(128)
>
> The JSON data in question:
>
> http://mlb.com/lookup/json/named.player_info.bam?sport_code=%27mlb%27&player_id=%27489002%27
>
> It's in the 'high_school' key. Is there some string function I can run
> on the information before I decode it to avoid this?

From what I can guess (you didn't include any code), you're printing
the result of loading the JSON (which probably loaded correctly) to
the terminal without specifying the exact encoding to use. In such
cases, Python defaults to ASCII. However, your data obviously includes
non-ASCII characters, thus resulting in the error you're encountering.
Instead of `print the_high_school`, try `print
the_high_school.encode('utf8')`.
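
A minimal sketch of the suggestion above, assuming the value was loaded with `json.loads`. The helper name `to_output_bytes` and the shortened JSON payload are hypothetical, introduced only for illustration. It is written to run on both Python 2.6+ and Python 3.

```python
import json
import sys

# Shortened stand-in for the real payload; only the relevant key is kept.
raw = '{"high_school": "St. Paul\\u2019s School For Boys (MN) HS"}'
the_high_school = json.loads(raw)["high_school"]


def to_output_bytes(text, encoding=None):
    """Encode text explicitly instead of relying on an implicit default.

    Hypothetical helper: uses the stream's encoding when it is known and
    falls back to UTF-8 otherwise.
    """
    if encoding is None:
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding)


# The right single quotation mark (U+2019) becomes three bytes in UTF-8.
encoded = to_output_bytes(the_high_school, "utf8")
```

On Python 2, `print encoded` then writes those bytes as-is, so no implicit ASCII encoding step is involved.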

Note that the `json` library returns Unicode strings of the type
`unicode` and not byte strings of type `str` (unless you're using
Python 3.0, in which case `unicode` got renamed to `str` and `str` got
renamed to `bytes`). When outputting Unicode, it needs to be encoded
to bytes. The built-in type() function* can help determine when you
have Unicode data.
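
As a rough illustration of the type check mentioned above (a sketch, not the only way to do it), the snippet below compares the decoded value with its encoded form. The names `text_type`, `is_text` and `is_bytes` are made up for this example; `type(u"")` is used so the same code runs on Python 2.6+ and Python 3.

```python
import json

value = json.loads('{"high_school": "St. Paul\\u2019s"}')["high_school"]

# `unicode` on Python 2, `str` on Python 3.
text_type = type(u"")

# The decoded JSON string is Unicode text ...
is_text = isinstance(value, text_type)

# ... and encoding it yields a byte string (`str` on Python 2,
# `bytes` on Python 3; `bytes` is an alias of `str` on 2.6+).
encoded = value.encode("utf8")
is_bytes = isinstance(encoded, bytes)
```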

Cheers,
Chris
--
http://blog.rebertia.com

*Yes, it's not /really truly/ a function, but the distinction is not
relevant here.
From: Intchanter / Daniel Fackrell on
On Dec 15, 3:03 pm, Wells <thewellsoli...(a)gmail.com> wrote:
> I get this exception when decoding a certain JSON string:
>
> 'ascii' codec can't encode character u'\u2019' in position 8: ordinal
> not in range(128)
>
> The JSON data in question:
>
> http://mlb.com/lookup/json/named.player_info.bam?sport_code=%27mlb%27....
>
> It's in the 'high_school' key. Is there some string function I can run
> on the information before I decode it to avoid this?

In my test using this same data, I did not get such an error. Here's
my code:

data = (
    '{"player_info": {"queryResults": {"row": {"active_sw": "Y", '
    '"bats": "R", "birth_city": "Baltimore", "birth_country": "USA", '
    '"birth_date": "1987-08-31T00:00:00", "birth_state": "MD", '
    '"college": "", "death_city": "", "death_country": "", '
    '"death_date": "", "death_state": "", "end_date": "", '
    '"file_code": "sf", "gender": "M", "height_feet": "6", '
    '"height_inches": "1", '
    '"high_school": "St. Paul\u2019s School For Boys (MN) HS", '
    '"jersey_number": "", "name_display_first_last": "Steve Johnson", '
    '"name_display_first_last_html": "Steve Johnson", '
    '"name_display_last_first": "Johnson, Steve", '
    '"name_display_last_first_html": "Johnson, Steve", '
    '"name_display_roster": "Johnson, S", '
    '"name_display_roster_html": "Johnson, S", "name_first": "Steven", '
    '"name_full": "Johnson, Steve", "name_last": "Johnson", '
    '"name_matrilineal": "", "name_middle": "David", "name_nick": "", '
    '"name_prefix": "", "name_title": "", "name_use": "Steve", '
    '"player_id": "489002", "primary_position": "1", '
    '"primary_position_txt": "P", "primary_sport_code": "", '
    '"pro_debut_date": "", "start_date": "2009-12-10T00:00:00", '
    '"status": "Active", "status_code": "A", '
    '"status_date": "2009-12-10T00:00:00", "team_abbrev": "SF", '
    '"team_code": "sfn", "team_id": "137", '
    '"team_name": "San Francisco Giants", "throws": "R", "weight": "200"}, '
    '"totalSize": "1"}}}'
)

import json

print json.loads(data)

(I'm running 2.6.4 on Mac OS X)
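
A possible explanation for why this test shows no error, sketched under the assumption that only the 'high_school' value matters: printing the whole dict prints its repr, and on Python 2 the repr of a Unicode string spells non-ASCII characters as ASCII escape sequences, so nothing has to be encoded. The shortened `data` below is a stand-in for the full payload, and the variable names are illustrative only.

```python
import json

# Shortened stand-in for the full payload (only the relevant key).
data = '{"high_school": "St. Paul\\u2019s School For Boys (MN) HS"}'

parsed = json.loads(data)
value = parsed["high_school"]

# Printing the dict shows repr(parsed). On Python 2 that text contains the
# escape sequence \u2019 written out in ASCII; Python 3 keeps the character.
dict_text = repr(parsed)

# The curly apostrophe sits at index 8, the position named in the exception.
apostrophe_index = value.index(u"\u2019")
```

The error only appears once the value itself is converted or printed with an ASCII default, which matches the position 8 reported in the original message.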

Intchanter
Daniel Fackrell
From: Wells on
Sorry- more detail- the actual problem is an exception thrown when
running str() on the value, like so:

>>> a = u'St. Paul\u2019s School For Boys (MN) HS'
>>> print str(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 8: ordinal not in range(128)

Is there some way to run str() against a unicode object?
From: Chris Rebert on
On Tue, Dec 15, 2009 at 3:04 PM, Wells <thewellsoliver(a)gmail.com> wrote:
> Sorry- more detail- the actual problem is an exception thrown when
> running str() on the value, like so:
>
> >>> a = u'St. Paul\u2019s School For Boys (MN) HS'
> >>> print str(a)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
> position 8: ordinal not in range(128)
>
> Is there some way to run str() against a unicode object?

To repeat what I said earlier, you use the .encode() method instead:

print a.encode('utf8')
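
To make the difference concrete, here is a small sketch that runs on both Python 2.6+ and Python 3. It assumes that `str(a)` on Python 2 behaves like an implicit `a.encode('ascii')`, which is what the traceback above suggests. The variable names are illustrative, and the `'replace'` variant is an extra option not mentioned earlier in the thread.

```python
a = u'St. Paul\u2019s School For Boys (MN) HS'

# Roughly what str(a) attempts on Python 2: a strict ASCII encode.
failed_at = None
try:
    a.encode('ascii')
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that cannot be encoded

# Explicit UTF-8 encoding succeeds and is reversible.
utf8_bytes = a.encode('utf8')
round_trip = utf8_bytes.decode('utf8')

# If the output really must be ASCII, an error handler trades fidelity
# for safety: unencodable characters become '?'.
lossy = a.encode('ascii', 'replace')
```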

Might I recommend reading:
http://www.joelonsoftware.com/articles/Unicode.html

Regards,
Chris
--
http://blog.rebertia.com