From: Charles Crayne on
On 19 Nov 2007 20:00:49 +0200
Phil Carmody <thefatphil_demunged(a)yahoo.co.uk> wrote:

> Frank, for his sins, is right. Using a Ctrl-Z (26, 0x1A) to
> indicate the end of file is a hangover from CP/M and thence
> MS-DOS 1.0.

The issue was that those OSes kept track of the number of sectors in a
file, but not the number of bytes in the last sector. So, a special
character was needed to signal end of file.
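
To make the point above concrete, here is a minimal Python sketch of the idea, not the actual CP/M code. It assumes the 128-byte CP/M record size; the names `pad_to_record` and `find_text_end` are made up for illustration. The file system only remembers how many whole records exist, so the writer pads the last record with Ctrl-Z and the reader looks for the first Ctrl-Z to find where the text stops.

```python
RECORD_SIZE = 128   # assumed CP/M record size in bytes
CTRL_Z = b"\x1a"    # the end-of-text marker discussed in this thread


def pad_to_record(data: bytes, record_size: int = RECORD_SIZE) -> bytes:
    """Pad data with Ctrl-Z bytes up to the next record boundary."""
    remainder = len(data) % record_size
    if remainder == 0:
        return data  # already fills its last record exactly
    return data + CTRL_Z * (record_size - remainder)


def find_text_end(stored: bytes) -> int:
    """Return the offset of the first Ctrl-Z, or the full length if none."""
    end = stored.find(CTRL_Z)
    return len(stored) if end == -1 else end


stored = pad_to_record(b"HELLO\r\n")
print(len(stored) // RECORD_SIZE)  # number of records the OS would track: 1
print(len(stored))                 # physical size in bytes: 128
print(find_text_end(stored))       # logical end of the text: 7
```

The only length the operating system could report here is 128; the 7 has to be recovered from the data itself.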

-- Chuck
From: Terence on
Ctrl-Z is the correct way, for ASCII entry from a keyboard, to
indicate the end of manual entry.

The ASCII character #1A (1Ah in assembler) is the standard MS-DOS code in
ASCII files to indicate the position AFTER the last data character in
the file.

The reason for the uncertainty in some programmers' minds is that the
data file length was usually rounded up to the next segment size for
the device being used, and the space between the last data character
and the end of the file buffer was filled with #1A symbols. In later
MS-DOS versions, the file length stored in the directory was the
actual data file length and not the length of the physical file
assignment.

So some computer programs are able to determine the real file length
and count bytes. Fortran compilers will work with both systems,
depending on whether the file is opened as Formatted (in which case a
#1A will trigger end of file) or not, since binary files can be read
to the physical or logical end, depending on how the directory has
treated the file size for the operating system being used.
From: H. Peter Anvin on
Terence wrote:
> Ctrl-Z is the correct way, for ASCII entry from a keyboard, to
> indicate the end of manual entry.
>
> The ASCII character #1A (1Ah in assembler) is the standard MS-DOS code in
> ASCII files to indicate the position AFTER the last data character in
> the file.
>
> The reason for the uncertainty in some programmers' minds is that the
> data file length was usually rounded up to the next segment size for
> the device being used, and the space between the last data character
> and the end of the file buffer was filled with #1A symbols. In later
> MS-DOS versions, the file length stored in the directory was the
> actual data file length and not the length of the physical file
> assignment.
>
> So some computer programs are able to determine the real file length
> and count bytes. Fortran compilers will work with both systems,
> depending on whether the file is opened as Formatted (in which case a
> #1A will trigger end of file) or not, since binary files can be read
> to the physical or logical end, depending on how the directory has
> treated the file size for the operating system being used.

MS-DOS always had true byte-granularity binary EOF. However, QDOS was
designed for quick porting from CP/M-80, which didn't, and it needed the
0x1A hack.

-hpa
From: Terence on
No! The byte granularity is the physical space assignment resolution on
disc.

The closing of an open file stores the length used by the program,
which may or may not be padded, depending on how the program managed
disc space.

Programs like my actual (better word than "current") Fortran F77,
using formatted files, and editors like WordStar, both of which I use,
pad the resulting ASCII file with #1A characters to the next segment
multiple of 256, and MS-DOS duly stores a multiple of 256 bytes.

EDIT, which I also use (actually to remove padding from WordStar
files), appends a crazy extra CR-LF and then stores the resulting length;
so I trick it by deleting the last CR-LF from the last line and let it
store what I need, which is the multiple of the fixed-length records
for the files I usually use for table-driven programming.

So I can see the before-and-after file sizes in the directory after
using EDIT on a WS file.
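
The same before-and-after comparison can be made without opening an editor. The following is a rough Python sketch, assuming the 256-byte padding multiple mentioned above; the function name `padding_report` is hypothetical. It counts the #1A bytes at the very end of a file's contents and reports whether the stored size is a multiple of the block size.

```python
def padding_report(data: bytes, block: int = 256):
    """Return (stored_size, trailing_1a_count, size_is_multiple_of_block)."""
    # rstrip removes only the run of 0x1A bytes at the very end of the data
    trailing = len(data) - len(data.rstrip(b"\x1a"))
    return len(data), trailing, len(data) % block == 0


# A padded file as described above: 5 data bytes, then #1A up to 256
padded = b"ABC\r\n" + b"\x1a" * 251
print(padding_report(padded))    # (256, 251, True)

# The same data after the padding has been removed
print(padding_report(b"ABC\r\n"))  # (5, 0, False)
```

Subtracting the second number from the first gives the size the file would have once the padding is stripped.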

From: [Jongware] on
"Sarunas Kazlauskas" <referas(a)gmail.com> wrote in message
news:a95a6122-3ebc-42a6-bbdf-0cb7ebfb5fc9(a)l22g2000hsc.googlegroups.com...
> ... or how do you know you've reached the end of txt file?

Taking all of the above answers into account, you should

- read the entire file into memory. Your read routine will tell you when there
are no more bytes -- don't worry, you can't read beyond a file end even if you
wanted to.
- if you treat this as a text file, check if it happens to end with a Ctrl+Z.
Most modern (> 1985!) programs don't terminate files this way, but if you
expect legacy files, you'll encounter them occasionally.
- If the last byte is a Ctrl+Z, there might be more. Some legacy files are
filled up to a sector size with Ctrl+Zs.
- If you *do* clip the Ctrl+Z at the end of a file, remember to check the rest!
There might be one right after the first line, and the rest of the file should
be discarded.
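
The steps above can be sketched in a few lines of Python. This is only an illustration under the assumption that the file really is text (a binary file may legitimately contain 0x1A bytes); the names `split_at_ctrl_z` and `read_legacy_text` are invented for this example. Cutting at the first Ctrl+Z covers all the cases in the list, and the second return value says which case was found.

```python
from pathlib import Path


def split_at_ctrl_z(raw: bytes):
    """Return (text, kind), where kind is 'none', 'padding' or 'embedded'."""
    cut = raw.find(b"\x1a")
    if cut == -1:
        return raw, "none"          # no Ctrl+Z anywhere: keep everything
    tail = raw[cut:]
    # If nothing but Ctrl+Z bytes follow, it is just a terminator or padding;
    # otherwise other bytes sit after the marker and get discarded.
    kind = "padding" if tail.strip(b"\x1a") == b"" else "embedded"
    return raw[:cut], kind


def read_legacy_text(path) -> bytes:
    """Read the whole file into memory, then drop the first Ctrl+Z onward."""
    text, _kind = split_at_ctrl_z(Path(path).read_bytes())
    return text


print(split_at_ctrl_z(b"one\r\ntwo\r\n"))       # (b'one\r\ntwo\r\n', 'none')
print(split_at_ctrl_z(b"one\r\n\x1a\x1a\x1a"))  # (b'one\r\n', 'padding')
print(split_at_ctrl_z(b"one\r\n\x1ajunk"))      # (b'one\r\n', 'embedded')
```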

In short, don't bother at all if you aren't bothered about old files.

[Jongware]