From: Lawrence D'Oliveiro on
In message
<14e44c9c-04d9-452d-b544-498adfaf7d40(a)d8g2000yqf.googlegroups.com>, Carl
Banks wrote:

> Seriously, almost every other kind of library uses a binary API. What
> makes databases so special that they need a string-command based API?

HTML is also effectively a string-based API. And what about regular
expressions? And all the functionality available through the subprocess
module and its predecessors?

The reality is, embedding one language within another is a fact of life. I
think it's important for programmers to be able to deal correctly with it.
From: Peter H. Coffin on
On Mon, 28 Jun 2010 03:07:29 -0700, Dennis Lee Bieber wrote:
> Coding for something like a DBTG network database did not allow for
> easy changes in queries... What would be a simple join in SQL was
> traversing a circular linked list in the DBTG database my college
> taught. EG: loop get next "master" record; loop get next sub-record
> [etc. until all needed data retrieved] until back to master; until back
> to top of database.

Note also that with most of these you had to map out by hand where each
field in a record was, any time you wanted to open the file. Often
several times over, because there would be multiple record layouts per
file.

--
67. No matter how many shorts we have in the system, my guards will be
instructed to treat every surveillance camera malfunction as a
full-scale emergency.
--Peter Anspach's list of things to do as an Evil Overlord
From: Nobody on
On Tue, 29 Jun 2010 12:30:36 +1200, Lawrence D'Oliveiro wrote:

>> Seriously, almost every other kind of library uses a binary API. What
>> makes databases so special that they need a string-command based API?
>
> HTML is also effectively a string-based API.

HTML is a data format. The sane way to construct or manipulate HTML is via
the DOM, not string operations.

> And what about regular expressions?

What about them? As the saying goes:

Some people, when confronted with a problem, think
"I know, I'll use regular expressions."
Now they have two problems.

They have some uses, e.g. defining tokens[1]. Using them to match more
complex constructs is error-prone and should generally be avoided unless
you're going to manually verify the result. Oh, and you should never
generate regexps dynamically; that way madness lies.

[1] Assuming that the language's tokens can be described by a regular
grammar. This isn't always the case, e.g. you can't tokenise PostScript
using regexps, as string literals can contain nested parentheses.
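
To illustrate the footnote, here is a minimal sketch of reading one
PostScript-style string literal with an explicit depth counter, which is
the part a regular grammar cannot express. The function name
read_ps_string is hypothetical, and backslash handling is simplified to
"skip the next character"; a real scanner would handle more escapes.

```python
def read_ps_string(text, start):
    """Return (contents, end_index) for the string literal at text[start]."""
    if text[start] != "(":
        raise ValueError("expected '(' at start of string literal")
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2          # simplified: skip the escaped character
            continue
        if ch == "(":
            depth += 1      # nested parentheses must balance
        elif ch == ")":
            depth -= 1
            if depth == 0:  # the outermost literal is closed
                return text[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated string literal")


source = "(a (nested (string)) here) show"
contents, end = read_ps_string(source, 0)
print(contents)
print(source[end:])
```

The counter is what lets the scanner tell the closing parenthesis of the
literal apart from the ones nested inside it.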

> And all the functionality available through the subprocess
> module and its predecessors?

The main reason why everyone recommends subprocess over its predecessors
is that it allows you to bypass the shell, which is one of the most
common sources of the type of error being discussed in this thread.

IOW, rather than having to construct a shell command which (hopefully)
will pass the desired arguments to the child, you just pass the desired
arguments to the child directly, without involving the shell.
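
A minimal sketch of that, assuming Python 3.7 or later for
subprocess.run() with capture_output. The hostile "filename" and the use
of sys.executable as a stand-in child program are illustrative
assumptions only.

```python
import subprocess
import sys

# Hypothetical hostile input: a "filename" with shell metacharacters in it.
filename = "notes.txt; echo INJECTED"

# The child just echoes back its first argument. sys.executable stands in
# for whatever program you would really be running.
child_code = "import sys; print(sys.argv[1])"

# Argument-list form: no shell is involved, so the whole string arrives
# in the child as a single argv entry, metacharacters and all.
result = subprocess.run(
    [sys.executable, "-c", child_code, filename],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)
```

Had the same string been pasted into a command line and run through a
shell, the ";" would have ended the first command and started a second.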

> The reality is, embedding one language within another is a fact of life. I
> think it's important for programmers to be able to deal correctly with it.

That depends upon what you mean by "embedding". The correct way to use
code written in one language from code written in another is to make the
first accept parameters and make the second pass them, not to have the
second (try to) generate the former dynamically.
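
For the database case this thread started with, that looks roughly like
the sketch below. It uses the standard sqlite3 module with an in-memory
database; the table and the hostile value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Hypothetical hostile input that would break naive string substitution.
hostile = "Robert'); DROP TABLE users;--"

# The SQL text is fixed; the value travels separately as a parameter,
# so it is never parsed as SQL.
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))
rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()
print(rows)
```

The "?" placeholder style is the one sqlite3 uses; other DB-API drivers
may use a different paramstyle.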

Sometimes dynamic code generation is inevitable (e.g. if you're writing a
compiler, you probably need to generate assembler or C code), but it's not
to be done lightly, and it's unwise to take shortcuts (e.g. ad-hoc string
substitutions).

From: Roy Smith on
Nobody <nobody(a)nowhere.com> wrote:

> > And what about regular expressions?
>
> What about them? As the saying goes:
>
> Some people, when confronted with a problem, think
> "I know, I'll use regular expressions."
> Now they have two problems.

That's silly. RE is a good tool. Like all good tools, it is the right
tool for some jobs and the wrong tool for others.

I've noticed over the years a significant anti-RE sentiment in the
Python community. One reason, I suppose, is because Python gives you
some good string manipulation tools, i.e. split(), startswith(),
endswith(), and the 'in' operator, which cover many of the common RE use
cases. But there are still plenty of times when a RE is the best tool
and it's worth investing the effort to learn how to use them effectively.
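
As a rough illustration of that overlap, here is the same small check
done both ways. The header line is borrowed from this thread's archive
format, and the variable names are arbitrary.

```python
import re

line = "From: Roy Smith on"

# String-method version: test the fixed prefix and suffix, then slice.
is_header = line.startswith("From: ") and line.endswith(" on")
author = line[len("From: "):-len(" on")] if is_header else None

# Regular-expression version of the same check.
m = re.fullmatch(r"From: (.+) on", line)
author_re = m.group(1) if m else None

print(author, author_re)
```

For a fixed prefix and suffix the string methods are arguably clearer;
the RE starts to pay off once the pattern has alternatives or repeats.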

One tool that Python gives you which makes RE a pleasure is raw strings.
Getting rid of all those extra backslashes really helps improve
readability.
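
A small sketch of the difference, using a made-up pattern that matches
a LaTeX-style \section{...} command:

```python
import re

# The same pattern twice: as an ordinary string literal, then as a raw one.
plain = "\\\\section\\{(\\w+)\\}"
raw = r"\\section\{(\w+)\}"
same = plain == raw  # both literals spell the identical pattern

match = re.search(raw, r"\section{intro}")
name = match.group(1)
print(same, name)
```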

Another great feature is VERBOSE. I've written some truly complicated
REs using that, and still been able to figure out what they meant the
next day :-)
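
For anyone who hasn't used it, a minimal sketch of re.VERBOSE follows.
The date pattern is only a hypothetical example; whitespace in the
pattern is ignored and "#" starts a comment.

```python
import re

# Hypothetical pattern: an ISO-style date such as 2010-06-29.
DATE = re.compile(r"""
    (?P<year>  \d{4} )   # four-digit year
    -
    (?P<month> \d{2} )   # two-digit month
    -
    (?P<day>   \d{2} )   # two-digit day
""", re.VERBOSE)

m = DATE.search("posted on 2010-06-29 at 05:41")
parts = m.groupdict()
print(parts)
```

Named groups plus comments do most of the work of keeping such a
pattern readable.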
From: Stephen Hansen on
On 6/29/10 5:41 AM, Roy Smith wrote:
> Nobody<nobody(a)nowhere.com> wrote:
>
>>> And what about regular expressions?
>>
>> What about them? As the saying goes:
>>
>> Some people, when confronted with a problem, think
>> "I know, I'll use regular expressions."
>> Now they have two problems.
>
> That's silly. RE is a good tool. Like all good tools, it is the right
> tool for some jobs and the wrong tool for others.

There's nothing silly about it.

It is an exaggeration, though it does represent a good thing to keep
in mind.

Yes, re is a tool -- and a useful one at that. But it's also a tool which
/seems/ like an omnitool capable of tackling everything.

Regular expressions are a complicated mini-language well suited to
extensive use in a Unix-type environment where you want to embed certain
logic of 'what to operate on' into many different commands that aren't
languages at all -- and Perl embraced it to make it Perl's answer to
text problems. Which is fine.

In Python, certainly it has its uses: many of them in fact, and in many
it really is the best solution.

It's not just that it's the right tool for some jobs and the wrong tool
for others, or that -- as you also said -- Python provides a rather
rich string type which can do many common tasks natively and better, but
that regular expressions live in the front of the mind for so many
people coming to the language that it's the first thing they even think
of, and what should be simple becomes difficult.

So people quote that proverb. It's a good proverb. Like all proverbs, it's
not perfectly applicable to all situations. But it does have an important
lesson to it: you should generally not consider re to be the solution
you're looking for until you are quite sure there's nothing else to
solve the same task.

It obviously applies less to the gurus who know all about regular
expressions and their subtleties including potential pathological behavior.

--

... Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/