From: gb345 on 18 Apr 2010 21:46

I'm getting a UnicodeEncodeError during a call to repr:

Traceback (most recent call last):
  File "bug.py", line 142, in <module>
    element = parser.parse(INPUT)
  File "bug.py", line 136, in parse
    ps = Parser.Parse(open(filename,'r').read(), 1)
  File "bug.py", line 97, in end_item
    r = repr(CURRENT_ENTRY)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u3003' in position 0: ordinal not in range(128)

This is what CURRENT_ENTRY.__repr__ looks like:

    def __repr__(self):
        k = SEP.join(self.k)
        r = SEP.join(self.r)
        s = SEP.join(self.s)
        ret = u'\t'.join((k, r, s))
        print type(ret)  # prints "<type 'unicode'>", as expected
        return ret

If I "inline" this CURRENT_ENTRY.__repr__ code so that the call to
repr(CURRENT_ENTRY) can be bypassed altogether, then the error
disappears.

Therefore, it is clear from the above that the problem, whatever
it is, occurs during the execution of the repr() built-in *after*
it gets the value returned by CURRENT_ENTRY.__repr__. It is also
clear that repr is trying to encode something using the ascii
codec, but I don't understand why it needs to encode anything.

Do I need to do something especial to get repr to work strictly
with unicode?

Or should __repr__ *always* return bytes rather than unicode?
What about __str__ ? If both of these are supposed to return bytes,
then what method should I use to define the unicode representation
for instances of a class?

Thanks!

Gabe

From: Martin v. Loewis on 19 Apr 2010 02:52

> Do I need to do something especial to get repr to work strictly
> with unicode?

Yes, you need to switch to Python 3 :-)

> Or should __repr__ *always* return bytes rather than unicode?

In Python 2.x: yes.

> What about __str__ ?

Likewise.

> If both of these are supposed to return bytes,
> then what method should I use to define the unicode representation
> for instances of a class?

__unicode__.

HTH,
Martin
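
As a rough illustration of the answer above: in Python 2, repr() has to hand back a byte string, so when __repr__ returns unicode the interpreter encodes it with the default codec, which is normally ASCII. The sketch below only mimics that implicit step with an explicit encode call; the sample text is made up for illustration, and the snippet is written so that it should also run under Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text; U+3003 is the character named in the traceback.
text = u'\u3003\tditto'

# Mimic the implicit conversion: encode the unicode result as ASCII.
try:
    text.encode('ascii')
    failed = False
    message = ''
except UnicodeEncodeError as exc:
    failed = True
    message = str(exc)

# Encoding explicitly with a codec that covers the character succeeds.
utf8_bytes = text.encode('utf-8')

print(failed)
print(message)
```

Under these assumptions the ASCII step fails in the same way as the call to repr(CURRENT_ENTRY), while the explicit UTF-8 encode does not.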

From: gb345 on 19 Apr 2010 13:08

In <hqguja$tt$1(a)online.de> "Martin v. Loewis" <martin(a)v.loewis.de> writes:

>> Do I need to do something especial to get repr to work strictly
>> with unicode?

>Yes, you need to switch to Python 3 :-)

>> Or should __repr__ *always* return bytes rather than unicode?

>In Python 2.x: yes.

>> What about __str__ ?

>Likewise.

>> If both of these are supposed to return bytes,
>> then what method should I use to define the unicode representation
>> for instances of a class?

>__unicode__.

Thanks!

From: Dave Angel on 19 Apr 2010 22:41

gb345 wrote:
> In <hqguja$tt$1(a)online.de> "Martin v. Loewis" <martin(a)v.loewis.de> writes:
>
>>> Do I need to do something especial to get repr to work strictly
>>> with unicode?
>
>> Yes, you need to switch to Python 3 :-)
>
>>> Or should __repr__ *always* return bytes rather than unicode?
>
>> In Python 2.x: yes.
>
>>> What about __str__ ?
>
>> Likewise.
>
>>> If both of these are supposed to return bytes,
>>> then what method should I use to define the unicode representation
>>> for instances of a class?
>
>> __unicode__.
>
> Thanks!

More precisely, __str__() and __repr__() return characters. Those
characters are 8 bits on Python 2.x, and Unicode on 3.x. If you need
unicode on 2.x, use __unicode__().

DaveA
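
One way the advice in this thread might be put together is sketched below. The class name Entry, the value of SEP, and the sample data are all assumptions, since the original class and separator are not shown. The idea is to keep the unicode text in __unicode__ and have __str__ and __repr__ return byte strings on Python 2 through an explicitly chosen encoding; the version check is only there so the same sketch also runs on Python 3.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2
SEP = u','  # assumed separator; the real value of SEP is not shown


class Entry(object):
    """Hypothetical stand-in for the class behind CURRENT_ENTRY."""

    def __init__(self, k, r, s):
        self.k, self.r, self.s = k, r, s

    def __unicode__(self):
        # The unicode representation; unicode(entry) calls this on 2.x.
        return u'\t'.join(SEP.join(part) for part in (self.k, self.r, self.s))

    def __str__(self):
        text = self.__unicode__()
        # Python 2 expects bytes here, so choose the encoding explicitly.
        return text.encode('utf-8') if PY2 else text

    def __repr__(self):
        # Escaped, ASCII-only form: safe to return as bytes on Python 2.
        escaped = self.__unicode__().encode('unicode_escape')
        return escaped if PY2 else escaped.decode('ascii')


entry = Entry([u'\u3003'], [u'a', u'b'], [u'c'])
print(repr(entry))
```

Using unicode_escape for __repr__ is a design choice made for this sketch, not something the thread prescribes; it keeps the result pure ASCII so no implicit encoding step can fail.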