From: John Machin on
dirknbr <dirknbr <at> gmail.com> writes:

> I have kind of developed this but obviously it's not nice, any better
> ideas?
>
> try:
>     text=texts[i]
>     text=text.encode('latin-1')
>     text=text.encode('utf-8')
> except:
>     text=' '

As Steven has pointed out, if the .encode('latin-1') works, the result is thrown
away. This would be very fortunate.

It appears that your goal was to encode the text in latin1 if possible,
otherwise in UTF-8, with no indication of which encoding was used. Your second
posting confirmed that you were doing this in a loop, ending up with the
possibility that your output file would have records with mixed encodings.

Did you consider what a programmer writing code to READ your output file would
need to do, e.g. attempt to decode each record as UTF-8 with a fall-back to
latin1??? Did you consider what would be the result of sending a stream of
mixed-encoding text to a display device?
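To make the reader's burden concrete, here is a minimal sketch of the fallback decoder such a programmer would have to write. The function name decode_record is hypothetical, and the sketch uses Python 3 bytes/str semantics rather than the Python 2 of the original code. It also shows why the guess cannot always be right: some Latin-1 byte sequences happen to be valid UTF-8.

```python
def decode_record(raw):
    """Guess the encoding: try UTF-8 first, then fall back to Latin-1.

    Latin-1 maps every byte to a character, so the fallback never fails,
    but it can silently produce the wrong text.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")

# A Latin-1 record that is not valid UTF-8: the fallback recovers it.
print(decode_record("caf\u00e9".encode("latin-1")))   # café

# A Latin-1 record whose bytes are also valid UTF-8: the guess is wrong.
original = "\u00c3\u00a9"                              # the two characters Ã©
guessed = decode_record(original.encode("latin-1"))
print(guessed == original)                             # False
```

The second case is the real problem with mixed-encoding output: nothing fails, the text is simply wrong.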

As already advised, the short answer to avoid all of that hassle is: just
encode in UTF-8.
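A small sketch of what that looks like, in Python 3 syntax and with made-up sample strings (the names texts and can_encode are illustrative, not from the original code). UTF-8 can encode every record, so no try/except and no second encoding is needed, whereas Latin-1 cannot represent a character such as U+2019.

```python
texts = ["plain ascii", "caf\u00e9", "it\u2019s"]   # sample data, assumed

def can_encode(text, encoding):
    """Return True if text can be represented in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

# Every record gets the same encoding, so a reader needs no guesswork.
records = [t.encode("utf-8") for t in texts]
round_tripped = [r.decode("utf-8") for r in records]

print(round_tripped == texts)                        # True
print([can_encode(t, "latin-1") for t in texts])     # [True, True, False]
```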



From: Nobody on
On Fri, 23 Jul 2010 18:27:50 -0400, Terry Reedy wrote:

> But in the
> meanwhile, once you get an error, you know what it is. You can
> intentionally feed code bad data and see what you get. And then maybe
> add a test to make sure your code traps such errors.

That doesn't really help with exceptions which are triggered by external
factors rather than explicit inputs.

Also, if you're writing libraries (rather than self-contained programs),
you have no control over the arguments. Coupled with the fact that
duck typing is quite widely advocated in Python circles, you're stuck with
the possibility that any method call on any argument can raise any
exception. This is even true for calls to standard library functions or
methods of standard classes if you're passing caller-supplied objects as
arguments.

From: Steven D'Aprano on
On Sun, 25 Jul 2010 13:52:33 +0100, Nobody wrote:

> On Fri, 23 Jul 2010 18:27:50 -0400, Terry Reedy wrote:
>
>> But in the
>> meanwhile, once you get an error, you know what it is. You can
>> intentionally feed code bad data and see what you get. And then maybe
>> add a test to make sure your code traps such errors.
>
> That doesn't really help with exceptions which are triggered by external
> factors rather than explicit inputs.

Huh? What do you mean by "external factors"? Do you mean like power
supply fluctuations, cosmic rays flipping bits in memory, bad hardware?
You can't defend against that, not without specialist fault-tolerant
hardware, so just don't worry about it.

If you mean external factors like "the network goes down" or "the disk is
full", you can still test for those with appropriate test doubles (think
"stunt doubles", only for testing) such as stubs or mocks. It's a little
bit more work (sometimes a lot more work), but it can be done.
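As a rough illustration of the test-double idea, here is a sketch that simulates "the disk is full" with the standard library's unittest.mock (Python 3). The function save_report and the file name are hypothetical; the point is only that the failure can be triggered on demand without actually filling a disk.

```python
import errno
from unittest import mock

def save_report(path, data):
    """Write data to path; return False instead of crashing if the disk is full."""
    try:
        with open(path, "w") as fh:
            fh.write(data)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False
        raise          # any other OS error is not ours to handle
    return True

def simulate_full_disk():
    """Run save_report against a fake open() whose write() reports a full disk."""
    fake_open = mock.mock_open()
    fake_open.return_value.write.side_effect = OSError(
        errno.ENOSPC, "No space left on device")
    with mock.patch("builtins.open", fake_open):
        return save_report("report.txt", "some data")

print(simulate_full_disk())   # False
```

No file is created here: the patched open() never touches the file system.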

Or don't worry about it. Release early, release often, and take lots of
logs. You'll soon learn what exceptions can happen and what can't. Your
software is still useful even when it's not perfect, and there's always
time for another bug fix release.


> Also, if you're writing libraries (rather than self-contained programs),
> you have no control over the arguments.

You can't control what the caller passes to you, but once you have it,
you have total control over it. You can reject it with an exception,
stick it inside a wrapper object, convert it to something else, deal with
it as best you can, or just ignore it.


> Coupled with the fact that duck
> typing is quite widely advocated in Python circles, you're stuck with
> the possibility that any method call on any argument can raise any
> exception. This is even true for calls to standard library functions or
> methods of standard classes if you're passing caller-supplied objects as
> arguments.

That's a gross exaggeration. It's true that some methods could in theory
raise any exception, but in practice most exceptions are vanishingly
rare. And it isn't even remotely correct that "any" method could raise
anything. If you can get something other than NameError, ValueError or
TypeError by calling "spam".index(arg), I'd like to see it.
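For reference, a quick sketch of the cases in question (the helper name index_outcome is made up for illustration). A NameError can only arise if the name passed as the argument is unbound, which happens before the method is even called, so it is not shown.

```python
def index_outcome(arg):
    """Return "spam".index(arg), or the name of the exception it raises."""
    try:
        return "spam".index(arg)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__

print(index_outcome("a"))    # 2
print(index_outcome("x"))    # ValueError
print(index_outcome(42))     # TypeError
```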

Frankly, it sounds to me that you're over-analysing all the things that
"could" go wrong rather than focusing on the things that actually do go
wrong. That's your prerogative, of course, but I don't think you'll get
much support for it here.



--
Steven

From: Nobody on
On Sun, 25 Jul 2010 14:47:11 +0000, Steven D'Aprano wrote:

>>> But in the
>>> meanwhile, once you get an error, you know what it is. You can
>>> intentionally feed code bad data and see what you get. And then maybe
>>> add a test to make sure your code traps such errors.
>>
>> That doesn't really help with exceptions which are triggered by external
>> factors rather than explicit inputs.
>
> Huh? What do you mean by "external factors"?

I mean this:

> If you mean external factors like "the network goes down" or "the disk is
> full",

> you can still test for those with appropriate test doubles (think
> "stunt doubles", only for testing) such as stubs or mocks. It's a little
> bit more work (sometimes a lot more work), but it can be done.

I'd say "a lot" is more often the case.

>> Also, if you're writing libraries (rather than self-contained programs),
>> you have no control over the arguments.
>
> You can't control what the caller passes to you, but once you have it,
> you have total control over it.

Total control insofar as you can wrap all method calls in semi-bare
excepts (i.e. catch any Exception but not KeyboardInterrupt).
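A minimal sketch of what such a "semi-bare" except looks like, with hypothetical helper names. Catching Exception traps whatever a caller-supplied object raises, while KeyboardInterrupt (which derives from BaseException, not Exception) still propagates.

```python
def call_guarded(func, *args):
    """Call func(*args); return (result, None) or (None, exception)."""
    try:
        return func(*args), None
    except Exception as exc:     # deliberately not a bare "except:"
        return None, exc

def misbehave():
    raise KeyError("anything a caller-supplied object might raise")

def interrupt():
    raise KeyboardInterrupt

result, error = call_guarded(misbehave)
print(type(error).__name__)      # KeyError

try:
    call_guarded(interrupt)
except KeyboardInterrupt:
    print("KeyboardInterrupt passed through")
```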

>> Coupled with the fact that duck
>> typing is quite widely advocated in Python circles, you're stuck with
>> the possibility that any method call on any argument can raise any
>> exception. This is even true for calls to standard library functions or
>> methods of standard classes if you're passing caller-supplied objects as
>> arguments.
>
> That's a gross exaggeration. It's true that some methods could in theory
> raise any exception, but in practice most exceptions are vanishingly
> rare.

Now *that* is a gross exaggeration. Exceptions are by their nature
exceptional, in some sense of the word. But a substantial part of Python
development is playing whac-a-mole with exceptions. Write code, run
code, get traceback, either fix the cause (LBYL) or handle the exception
(EAFP), wash, rinse, repeat.

> And it isn't even remotely correct that "any" method could raise
> anything. If you can get something other than NameError, ValueError or
> TypeError by calling "spam".index(arg), I'd like to see it.

How common is it to call methods on a string literal in real-world code?

It's far, far more common to call methods on an argument or expression
whose value could be any "string-like object" (e.g. UserString or a str
subclass).

IOW, it's "almost" correct that any method can raise any exception. The
fact that the number of counter-examples is non-zero doesn't really
change this. Even an isinstance() check won't help, as nothing prohibits a
subclass from raising exceptions which the original doesn't. Even using
"type(x) == sometype" doesn't help if x's methods involve calling methods
of user-supplied values (unless those methods are wrapped in catch-all
excepts).
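A small sketch of the isinstance() point, using an invented str subclass (OddString and find_offset are illustrative names, not from any real library). The library function checks the type and handles the documented ValueError, yet a different exception still escapes.

```python
class OddString(str):
    """A str subclass whose index() raises something str.index() never does."""
    def index(self, sub, *args):
        raise KeyError(sub)

def find_offset(s, sub):
    """Library-style helper: return the offset of sub in s, or -1 if absent."""
    if not isinstance(s, str):
        raise TypeError("s must be a string")
    try:
        return s.index(sub)
    except ValueError:           # the only failure a plain str would produce
        return -1

print(find_offset("spam", "a"))              # 2
print(find_offset("spam", "x"))              # -1

try:
    find_offset(OddString("spam"), "a")      # passes the isinstance() check
except KeyError as exc:
    print("unexpected", type(exc).__name__)  # unexpected KeyError
```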

Java's checked exception mechanism was based on real-world experience of
the pitfalls of abstract types. And that experience was gained in
environments where interface specifications were far more detailed than is
the norm in the Python world.

> Frankly, it sounds to me that you're over-analysing all the things that
> "could" go wrong rather than focusing on the things that actually do go
> wrong.

See Murphy's Law.

> That's your prerogative, of course, but I don't think you'll get
> much support for it here.

Alas, I suspect that you're correct. Which is why I don't advocate using
Python for "serious" software. Neither the language nor its "culture" is
amenable to robustness.

From: Aahz on
In article <pan.2010.07.26.04.27.47.437000(a)nowhere.com>,
Nobody <nobody(a)nowhere.com> wrote:
>
>Java's checked exception mechanism was based on real-world experience of
>the pitfalls of abstract types. And that experience was gained in
>environments where interface specifications were far more detailed than is
>the norm in the Python world.

There are a number of people who claim that checked exceptions are the
wrong answer:

http://www.mindview.net/Etc/Discussions/CheckedExceptions
--
Aahz (aahz(a)pythoncraft.com) <*> http://www.pythoncraft.com/

"....Normal is what cuts off your sixth finger and your tail..." --Siobhan