From: Nick Maclaren on

In article <4o4ed5Fd0ghrU1(a)individual.net>,
Jan Vorbrüggen <jvorbrueggen(a)not-mediasec.de> writes:
|> > Since the source code of the language in question is case sensitive,
|>
|> ...which is already a massive design failure, of course, ...
|>
|> > two of those three are just wrong,
|>
|> Really? Is there anything in the C ISO standard that says that the
|> statement "include stdio.h" must lead to a file named "stdio.h", and not
|> "STDIO.H", or even an entry "STDIO" in a text-library file?

Er, do you mean '#include <stdio.h>' or '#include "stdio.h"'?

In the first case, no. In the second, yes, though the specification
is a typical piece of ISO C ambiguity. What it says is:

[#3] A preprocessing directive of the form

# include "q-char-sequence" new-line

causes the replacement of that directive by the entire
contents of the source file identified by the specified
sequence between the " delimiters. The named source file is
searched for in an implementation-defined manner. ...

The C standard does not actually SAY that it is undefined behaviour if
'#include "stdio.h"' matches a user file, but it assuredly is. What
is unclear is whether it is ALSO undefined behaviour if it matches
the same file matched by '#include <stdio.h>' - I have used systems
where it was.

But this arcana is off-group ....


Regards,
Nick Maclaren.
From: Jan Vorbrüggen on
> |> Really? Is there anything in the C ISO standard that says that the
> |> statement "include stdio.h" must lead to a file named "stdio.h", and not
> |> "STDIO.H", or even an entry "STDIO" in a text-library file?
>
> Er, do you mean '#include <stdio.h>' or '#include "stdio.h"'?
>
> In the first case, no. In the second, yes, though the specification
> is a typical piece of ISO C ambiguity. What it says is:
>
> [#3] A preprocessing directive of the form
>
> # include "q-char-sequence" new-line
>
> causes the replacement of that directive by the entire
> contents of the source file identified by the specified
> sequence between the " delimiters. The named source file is
> searched for in an implementation-defined manner. ...

So the answer is "no" even in the second case. Notice the careful wording,
"the..._contents_ of the source file _identified_ by the specified sequence".

Is there anything that says, "if you write '#include STDIO.H', you will get
something different from 'include stdio.h'"?

Jan
From: Nick Maclaren on

In article <4o4h89Fd1dnkU1(a)individual.net>,
Jan Vorbrüggen <jvorbrueggen(a)not-mediasec.de> writes:
|>
|> So the answer is "no" even in the second case. Notice the careful wording,
|> "the..._contents_ of the source file _identified_ by the specified sequence".

Well, yes, the contents could be anything.

|> Is there anything that says, "if you write '#include STDIO.H', you will get
|> something different from 'include stdio.h'"?

No. You MUST get the same - a syntax error.

If you mean '#include "STDIO.H"' and '#include "stdio.h"', then no.


Regards,
Nick Maclaren.
From: Bill Todd on
Benny Amorsen wrote:
>>>>>> "BT" == Bill Todd <billtodd(a)metrocast.net> writes:
>
> BT> Does this suggest that the file system should collate
> BT> case-insensitive even while it addresses case-sensitive, so that
> BT> such potential collisions can be easily found?
>
> What do you mean by collate here?

Precisely what I said, as I usually do.

> If you say that the results of
> readdir() or equivalent should be returned in proper alphabetical
> order, you really have to make that order per-user. My girlfriend and
> I expect different collations.

I am not interested in your collating preferences, or in your
girlfriend's. I was asking whether a file system should collate
insensitively to case in order to facilitate detecting unintended
logical collisions created by case-sensitive names (with the implicit
assumption that these might be sufficiently frequent - as contrasted
with other conceivable forms of character-related collisions - to be
worth addressing).
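
As a rough illustration of the question above, the sketch below groups names by a case-folded key so that names differing only in case become visible as potential collisions. It is a user-level sketch, not a file-system design; the function names `case_collisions` and `case_insensitive_order` are hypothetical, and Python's `str.casefold()` is assumed to be an acceptable notion of "insensitive to case".

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that differ only in case.

    Keys are the case-folded form; values are the sorted original spellings.
    Only groups with more than one spelling are reported.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


def case_insensitive_order(names):
    """Sort so that names differing only in case end up adjacent.

    The original spelling is the tie-breaker, so the order is deterministic.
    """
    return sorted(names, key=lambda name: (name.casefold(), name))


if __name__ == "__main__":
    listing = ["stdio.h", "STDIO.H", "Makefile", "makefile", "README"]
    print(case_collisions(listing))
    print(case_insensitive_order(listing))
```

Lookups would still be case-sensitive under this scheme; only the ordering (and hence the adjacency of near-duplicates) ignores case.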

- bill
From: Anne & Lynn Wheeler on

dgay(a)barnowl.research.intel-research.net writes:
> I think you missed the point that the output of readdir is (and should
> be) unrelated to the order presented to the user. Why is the file system
> collating anyway? Now I can see the value of a library that collates
> file names according to some system-wide convention...

one of the results of changes originally made by (i think ?)
perkin/elmer to cms mfd in the early 70s was to sort the
filenames. then when an application was looking for a specific filename
... lookup could do much better than linear search (and searches
better than linear were dependent on being matched to collating/sort
sequences).
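
A minimal sketch of why a sorted directory helps: once the names are kept in a known order, a binary search can replace the linear scan. This is not the cms code; `lookup` is a hypothetical helper, and the directory is assumed to be an in-memory list sorted in Python's default string order.

```python
from bisect import bisect_left


def lookup(sorted_names, target):
    """Return the index of target in sorted_names, or None if absent.

    Correct only if sorted_names was sorted with the same ordering that
    the comparisons here use (Python's default code-point order).
    """
    i = bisect_left(sorted_names, target)
    if i < len(sorted_names) and sorted_names[i] == target:
        return i
    return None


if __name__ == "__main__":
    directory = sorted(["profile exec", "all notebook", "sample script", "lastfile"])
    print(lookup(directory, "lastfile"))
    print(lookup(directory, "missing"))
```

For a couple thousand entries this takes about a dozen comparisons instead of up to a couple thousand.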

it really was a significant change for directories that happened to have
a couple thousand filenames (as some number of high-use systems did).

i recently ran into something similar using sort on filenames and
doing something other than linear search ... where the sort command's
default collating sequence changed and it changed how period was handled
(showed up between capital H and capital I). i had to explicitly set
"LC_ALL=C" to get sort back to working the way i was used to.
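
The failure mode described above can be reproduced without touching real locales. In the sketch below, plain `sorted()` stands in for the "C" ordering, and a hypothetical key `ignore_punct` stands in for a collation that skips punctuation; it is an assumption for illustration and not the actual rule of any particular locale.

```python
from bisect import bisect_left


def found(sorted_names, target):
    """Binary search that compares in plain code-point ("C"-like) order."""
    i = bisect_left(sorted_names, target)
    return i < len(sorted_names) and sorted_names[i] == target


def ignore_punct(name):
    """Hypothetical collation key: compare only letters and digits."""
    return "".join(ch for ch in name if ch.isalnum())


names = ["HA", "H.B", "HZ", "I"]

# Code-point order: '.' sorts before every letter.
c_order = sorted(names)

# Punctuation-ignoring order: "H.B" is compared as "HB".
loose_order = sorted(names, key=ignore_punct)

if __name__ == "__main__":
    print(c_order, found(c_order, "H.B"))
    print(loose_order, found(loose_order, "H.B"))
```

The second search misses a name that is present, because the list was sorted under one ordering and searched under another.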

a similar, but different problem we did long ago and far away ... when
we did an online telephone book for several thousand corporate employees.
for lots of reasons ... the names/numbers were kept in a linear flat file
... but sorted. the search was radix ... based on measured first
letter frequency, by taking the size of the file and probing part way
into the file based on the first letters of the search argument and the
related letter frequencies for names (originally compiled into the
search program). it could frequently get to the appropriate physical
record within a probe or two (w/o requiring a separate index or other
infrastructure).
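
A rough sketch of the probing idea, under several assumptions: the file is modelled as an in-memory sorted list of lowercase names, the frequency table is built from a sample rather than compiled into the program, and only the first letter is used for the initial guess. The names `build_cumulative` and `probe_search` are hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from string import ascii_lowercase


def build_cumulative(sample):
    """For each letter, count sample names whose first letter sorts before it."""
    counts = Counter(name[0] for name in sample)
    cumulative, running = {}, 0
    for letter in ascii_lowercase:
        cumulative[letter] = running
        running += counts.get(letter, 0)
    return cumulative, len(sample)


def probe_search(sorted_names, target, cumulative, sample_size):
    """Return the indexes of all entries equal to target (duplicates allowed)."""
    n = len(sorted_names)
    if n == 0 or not target:
        return []
    # First probe: scale the letter's cumulative frequency to the file size.
    guess = min(n - 1, cumulative.get(target[0], 0) * n // sample_size)
    # Widen a bracket around the guess with doubling steps.
    lo = hi = guess
    step = 1
    while lo > 0 and sorted_names[lo] >= target:
        lo = max(0, lo - step)
        step *= 2
    step = 1
    while hi < n - 1 and sorted_names[hi] < target:
        hi = min(n - 1, hi + step)
        step *= 2
    # Finish with a binary search inside the bracket.
    i = bisect_left(sorted_names, target, lo, hi + 1)
    matches = []
    while i < n and sorted_names[i] == target:
        matches.append(i)
        i += 1
    return matches


if __name__ == "__main__":
    sample = ["adams", "baker", "clark", "smith", "young"]
    book = sorted(["adams", "baker", "baker", "clark", "davis",
                   "evans", "smith", "smith", "young"])
    table, size = build_cumulative(sample)
    print(probe_search(book, "smith", table, size))
```

When the frequency table matches the data well, the first guess lands close to the target and the bracket stays small, which corresponds to the "probe or two" behaviour described above.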

we had a special collating/sort order assuming that names (and search
arguments) had no blanks (even tho any names with embedded blanks were
carried in the actual data; the ignore-blanks was a special sort
characteristic/option). in the name scenario ... name
collisions/duplicates were allowed ... so a search result might present
multiple matches.
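
The blank-ignoring order can be sketched as a sort key that strips blanks, with the same key applied to the search argument. This is an illustration only; `blank_free`, `build_index`, and `search` are hypothetical names, and the data is assumed to fit in memory rather than living in a flat file.

```python
from bisect import bisect_left, bisect_right


def blank_free(name):
    """Collation key: the name with all blanks removed."""
    return name.replace(" ", "")


def build_index(names):
    """Sort the names by blank-free key and keep the keys alongside."""
    ordered = sorted(names, key=blank_free)
    keys = [blank_free(name) for name in ordered]
    return ordered, keys


def search(ordered, keys, argument):
    """Return every stored name whose blank-free form matches the argument."""
    key = blank_free(argument)
    return ordered[bisect_left(keys, key):bisect_right(keys, key)]


if __name__ == "__main__":
    ordered, keys = build_index(["van dyke", "delacroix", "vandyke", "de la cruz"])
    print(ordered)
    print(search(ordered, keys, "vandyke"))
```

The stored names keep their embedded blanks; only the comparison ignores them, so one search argument can return several entries.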