weird environ [Unix Programming]

From: Rainer Weikusat on 9 Oct 2009 10:42

Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
> Rainer Weikusat wrote:
>> Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
>>> John Kelly wrote:
>>>
>>>> What's the preferred way to clear the environment, in the absence of
>>>> clearenv()?
>>> Obtain the name from environ[0] (the part before the '=').
>>> Call unsetenv() with that name.
>>> Repeat until environ[0] is NULL.
>>
>> Is there a particular reason for this complicated procedure? It is
>> supposed to be possible to manipulate the environment via environ both
>> for the current process and for other programs invoked by an
>> exec-routine without an env-pointer.
>
> POSIX says "If the application modifies environ or the pointers to
> which it points, the behavior of getenv() is undefined." It has
> similar statements for setenv() and unsetenv(). I assume the same
> would apply to other functions that use environment variables (i.e.
> which effectively use getenv() internally), although that doesn't
> appear to be stated anywhere in the standard.

Consequently, either these 'other functions' must not use getenv or
getenv must be able to deal with a modified environ. And there is, of
course, putenv, whose 'application usage' section contains this little
gem:

The putenv() function manipulates the environment pointed to
by environ, and can be used in conjunction with getenv().

[...]

> This restriction is what makes the unsetenv() loop the "preferred way"
> the OP asked for.

IMO certainly not for executing another program with a 'clean'
environment, possibly including a well-defined set of initial
'environment variables'. Possibly if the application wants to first
clear the environment and then make use of it itself via
getenv/setenv/unsetenv for some weird reason. 'Weird' because nothing
about the actual implementation can be relied upon beyond that it may
do linear searches of 'the environ table', which makes it a really bad
choice for a 'dictionary' data structure.

From: Geoff Clare on 9 Oct 2009 16:23

Rainer Weikusat wrote:
> Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
>> Rainer Weikusat wrote:
>>> Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
>>>> John Kelly wrote:
>>>>
>>>>> What's the preferred way to clear the environment, in the absence of
>>>>> clearenv()?
>>>> Obtain the name from environ[0] (the part before the '=').
>>>> Call unsetenv() with that name.
>>>> Repeat until environ[0] is NULL.
>>> Is there a particular reason for this complicated procedure? It is
>>> supposed to be possible to manipulate the environment via environ both
>>> for the current process and for other programs invoked by an
>>> exec-routine without an env-pointer.
>> POSIX says "If the application modifies environ or the pointers to
>> which it points, the behavior of getenv() is undefined." It has
>> similar statements for setenv() and unsetenv(). I assume the same
>> would apply to other functions that use environment variables (i.e.
>> which effectively use getenv() internally), although that doesn't
>> appear to be stated anywhere in the standard.
>
> Consequently, either these 'other functions' must not use getenv or
> getenv must be able to deal with a modified environ. And there is, of
> course, putenv, whose 'application usage' section contains this little
> gem:
>
> The putenv() function manipulates the environment pointed to
> by environ, and can be used in conjunction with getenv().

putenv() is an XSI function - it came into the standard via
the old X/Open Portability Guide specs, whereas getenv(), setenv()
and unsetenv() came in via IEEE/POSIX. That would explain why
it is treated differently from them.
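
To make the quoted 'application usage' text concrete, here is a minimal
sketch of putenv() used together with getenv(). The function name
putenv_visible_to_getenv() and the variable GREETING are made up for
illustration. The string is kept in static storage because putenv()
makes the string itself part of the environment rather than copying it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* putenv() is an XSI function */
#endif
#include <stdlib.h>
#include <string.h>

/* Static storage: putenv() stores this pointer, it does not copy. */
static char greeting[] = "GREETING=hello";

/* Hypothetical helper: returns 1 if getenv() sees the value that was
 * placed into the environment with putenv(), 0 otherwise. */
int putenv_visible_to_getenv(void)
{
    const char *value;

    if (putenv(greeting) != 0)
        return 0;
    value = getenv("GREETING");
    return value != NULL && strcmp(value, "hello") == 0;
}
```

This only shows the putenv()/getenv() pairing the standard describes;
it says nothing about what happens after writing to environ directly.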

>> This restriction is what makes the unsetenv() loop the "preferred way"
>> the OP asked for.
>
> IMO certainly not for executing another program with a 'clean'
> environment, possibly including a well-defined set of initial
> 'environment variables'.

The OP specifically mentioned clearenv(), which modifies the
current environment; he didn't mention passing an environment to
another program. Obviously, constructing a separate environment
list for use with execle() or execve() is the better way of
achieving the latter.
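
A rough sketch of that approach, building a separate environment list
and passing it to execve(). The name run_with_env() is made up, and
the fork()/waitpid() wrapper is an assumption added only so that the
example returns to its caller instead of replacing it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with argument vector `argv` and
 * exactly the environment listed in `envp`. The caller's own
 * environment is neither read nor modified.
 * Returns the child's exit status, or -1 on error. */
int run_with_env(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);
        _exit(127);             /* only reached if execve() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass something like { "GREETING=hello", NULL } as envp;
the new program then sees only those variables, whatever the parent's
environment contains.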

> Possibly if the application wants to first
> clear the environment and then make use of it itself via
> getenv/setenv/unsetenv for some weird reason

That's what I assumed the OP wanted.
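
For that case, the loop quoted at the top (take the name from
environ[0], unsetenv() it, repeat until environ[0] is NULL) could be
sketched as below. The name clear_environment() is made up, and
bailing out on an entry without a usable name is my own assumption
about how to avoid looping forever on a malformed environment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for unsetenv() */
#endif
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Hypothetical helper: empty the current environment using only
 * unsetenv(), never writing to environ or its pointers.
 * Returns 0 on success, -1 on failure. */
int clear_environment(void)
{
    while (environ != NULL && environ[0] != NULL) {
        const char *entry = environ[0];
        const char *eq = strchr(entry, '=');
        size_t len;
        char *name;

        /* No '=' or an empty name: unsetenv() cannot remove this. */
        if (eq == NULL || eq == entry)
            return -1;

        /* Copy the part before '=' so it survives the removal. */
        len = (size_t)(eq - entry);
        name = malloc(len + 1);
        if (name == NULL)
            return -1;
        memcpy(name, entry, len);
        name[len] = '\0';

        if (unsetenv(name) != 0) {
            free(name);
            return -1;
        }
        free(name);
    }
    return 0;
}
```

After a successful call, getenv() returns NULL for every name, and
setenv() can be used to build the environment up again.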

--
Geoff Clare <netnews(a)gclare.org.uk>

From: Rainer Weikusat on 10 Oct 2009 10:07

Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
> Rainer Weikusat wrote:
>> Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
>>> Rainer Weikusat wrote:
>>>> Geoff Clare <geoff(a)clare.See-My-Signature.invalid> writes:
>>>>> John Kelly wrote:
>>>>>
>>>>>> What's the preferred way to clear the environment, in the absence of
>>>>>> clearenv()?
>>>>> Obtain the name from environ[0] (the part before the '=').
>>>>> Call unsetenv() with that name.
>>>>> Repeat until environ[0] is NULL.
>>>> Is there a particular reason for this complicated procedure? It is
>>>> supposed to be possible to manipulate the environment via environ both
>>>> for the current process and for other programs invoked by an
>>>> exec-routine without an env-pointer.
>>> POSIX says "If the application modifies environ or the pointers to
>>> which it points, the behavior of getenv() is undefined." It has
>>> similar statements for setenv() and unsetenv(). I assume the same
>>> would apply to other functions that use environment variables (i.e.
>>> which effectively use getenv() internally), although that doesn't
>>> appear to be stated anywhere in the standard.
>>
>> Consequently, either these 'other functions' must not use getenv or
>> getenv must be able to deal with a modified environ. And there is, of
>> course, putenv, whose 'application usage' section contains this little
>> gem:
>>
>> The putenv() function manipulates the environment pointed to
>> by environ, and can be used in conjunction with getenv().
>
> putenv() is an XSI function - it came into the standard via
> the old X/Open Portability Guide specs, whereas getenv(), setenv()
> and unsetenv() came in via IEEE/POSIX. That would explain why
> it is treated differently from them.

A forewarning: Some people may consider the text after the page break
'inflammatory' and Mr Rullgard will doubtlessly be convinced that
this would be the only conceivable intent behind writing it. But it
isn't. It is, however, intended to be politically critical.

And IEEE/POSIX is where the people sit who are busy with 'embrace,
extend and extinguish Unix', right? There is the pthread group, which
has no problems writing blatant untruths into the standard if this
only helps them with eliminating fork: the statement that there would
be no conceivable way in which a new process starting to execute the
same program would still be useful in a multi-threaded process (two
short answers: 'handle' a possible SIGSEGV portably; COW-share a large
in-memory database, parts of which are in the process of being updated
by the original process). There are the 'realtime people' who,
ironically, want to get rid of real-time facilities useful for
distributed applications, which they have no idea of anyway (viewed
from the perspective of an EE, the internet is probably a ghastly
unregulated place, and that it works just means it needs to be undone
before someone notices). And there are all the other people who are
busy introducing gratuitous incompatibilities, such as slowly
eliminating itimers in favor of moving 'timers' completely into the
kernel, doubtlessly for a richer application developer experience ...

[...]

>> IMO certainly not for executing another program with a 'clean'
>> environment, possibly including a well-defined set of initial
>> 'environment variables'.
>
> The OP specifically mentioned clearenv(), which modifies the
> current environment; he didn't mention passing an environment to
> another program. Obviously, constructing a separate environment
> list for use with execle() or execve() is the better way of
> achieving the latter.

That's not the least bit 'obvious' to me. As with goto, which isn't
the bogey-man used to scare children away but a useful concept in
certain situations, shared state isn't inherently bad, either. It just
needs to be used with some common sense, and the usual counterargument
that this is just 'too hard for the average programmer' hints at a
glaring educational deficit. After all, calculus is something the
'average programmer' is supposed to master for some reason, which is
certainly a reasonable one and not just a historical accident stemming
from the original use computers were put to.
