From: magnus.lycka on
It seems that Python treats non-breaking space (\xa0) as a normal
whitespace character, e.g. when splitting a string. See below:

>>> s='hello\xa0there'
>>> s.split()
['hello', 'there']

Surely this is not intended behaviour?
From: Steven D'Aprano on
On Sat, 05 Jun 2010 01:30:40 -0700, magnus.lycka(a)gmail.com wrote:

> It seems that Python treats non-breaking space (\xa0) as a normal
> whitespace character, e.g. when splitting a string. See below:
>
> >>> s='hello\xa0there'
> >>> s.split()
> ['hello', 'there']
>
> Surely this is not intended behaviour?


Yes it is.

str.split() with no arguments breaks on whitespace, and \xa0 is whitespace
according to the Unicode standard. To put it another way, str.split() is
not a word-wrapping split. This has been reported before, and rejected as
a won't-fix.

http://mail.python.org/pipermail/python-bugs-list/2006-January/031531.html
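
If you want non-breaking spaces to stay inside the words, you have to say
which separators you mean. A minimal sketch, assuming Python 3 (where str is
Unicode); the sample string and the variable names are made up for
illustration, and the regex is just one way to spell "ASCII whitespace only":

```python
import re

# Hypothetical sample: U+00A0 between "hello" and "there", a plain space after.
s = 'hello\xa0there world'

# No-argument split(): U+00A0 counts as whitespace, so it separates words.
default_parts = s.split()

# Explicit separator: only the ASCII space splits. Note that runs of
# spaces are NOT collapsed in this form.
space_parts = s.split(' ')

# Runs of ASCII whitespace only; U+00A0 is left inside the word.
ascii_ws = ' \t\n\r\f\v'
ascii_parts = re.split('[' + ascii_ws + ']+', s.strip(ascii_ws))

print(default_parts)
print(space_parts)
print(ascii_parts)
```

The explicit-separator form is the simplest if your data only ever uses
single spaces; the regex form is closer to what split() does, minus the
non-ASCII whitespace.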



--
Steven