From: Jim Langston on
Gerhard Wolf wrote:
> Hi,
>
> How can I get the last n lines of a text file without looping from the
> front of the file (bad performance on huge files)?

Open the file for reading and seek to the end. Read backwards one
character at a time until you reach a newline. At that point you either
already have the line (if you bothered to build a string as you went) or
you can read the line forward from that position. For the last n lines,
keep going until you have passed n newlines. This requires opening the
file in binary mode.
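
A minimal sketch of the backward scan described above, not a definitive
implementation. The function name last_lines_backward is made up for
illustration, and the sketch assumes lines end in '\n' and that a single
trailing newline at the end of the file terminates the last line rather
than starting an empty one.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the text of the last n lines of `path`,
// found by stepping backwards from the end one character at a time.
std::string last_lines_backward(const std::string& path, std::size_t n) {
    std::ifstream in(path, std::ios::binary);  // binary mode: offsets are plain byte counts
    if (!in || n == 0) return {};

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return {};

    std::streamoff start = 0;   // offset where the wanted text begins
    std::size_t newlines = 0;

    // Begin at size - 2 so the final byte (usually the newline that
    // terminates the last line) is never counted as a separator.
    for (std::streamoff pos = size - 2; pos >= 0; --pos) {
        in.seekg(pos);
        char c;
        if (!in.get(c)) break;
        if (c == '\n' && ++newlines == n) {
            start = pos + 1;    // the n-th line from the end starts here
            break;
        }
    }
    assert(start < size);

    // Read forward from the position we found to the end of the file.
    in.clear();
    in.seekg(start);
    std::string out(static_cast<std::size_t>(size - start), '\0');
    in.read(&out[0], static_cast<std::streamsize>(out.size()));
    return out;
}
```

If the file has fewer than n lines, the loop runs off the front and the
whole file is returned.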

--
Jim Langston
tazmaster(a)rocketmail.com


From: Francis Glassborow on
Jim Langston wrote:
> Gerhard Wolf wrote:
>> Hi,
>>
>> How can I get the last n lines of a text file without looping from the
>> front of the file (bad performance on huge files)?
>
> Open the file for reading and seek to the end. Read backwards one
> character at a time until you reach a newline. At that point you either
> already have the line (if you bothered to build a string as you went) or
> you can read the line forward from that position. For the last n lines,
> keep going until you have passed n newlines. This requires opening the
> file in binary mode.
>

And I can imagine that for n (the number of lines) greater than 1 this
could be much slower than reading the file once to create an index table
of line offsets and then using that table to access the lines you want.
Note that if you have a text file that you read frequently but write
infrequently, creating an index file for use between writes might be
worth the effort.
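
A rough sketch of the index-table idea, under stated assumptions: the
names build_line_index and last_lines_indexed are hypothetical, the index
is kept in memory as a vector of byte offsets, and saving it to an index
file between writes is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: one pass over the file, recording the byte
// offset at which each line starts.
std::vector<std::streamoff> build_line_index(const std::string& path) {
    std::vector<std::streamoff> starts;
    std::ifstream in(path, std::ios::binary);
    std::streamoff pos = 0;
    bool at_line_start = true;
    char c;
    while (in.get(c)) {
        if (at_line_start) {
            starts.push_back(pos);
            at_line_start = false;
        }
        if (c == '\n') at_line_start = true;
        ++pos;
    }
    return starts;
}

// Hypothetical helper: uses the index to jump straight to the first of
// the last n lines and returns everything from there to end of file.
std::string last_lines_indexed(const std::string& path,
                               const std::vector<std::streamoff>& starts,
                               std::size_t n) {
    if (starts.empty() || n == 0) return {};
    const std::size_t first = starts.size() > n ? starts.size() - n : 0;
    assert(first < starts.size());

    std::ifstream in(path, std::ios::binary);
    in.seekg(starts[first]);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The index only pays off if it is built once and reused for many reads;
it has to be rebuilt whenever the file is written.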


From: Jerry Coffin on
In article <69vb1oF34g1etU1(a)mid.individual.net>, quisquiliae(a)gmx.de
says...
> Hi,
>
> How can I get the last n lines of a text file without looping from the
> front of the file (bad performance on huge files)?

Estimate the maximum possible length of a line. Multiply that by N to
work out how much data at the end of the file you need to read. If you
can live with a minute bit of technically non-portable code, get the file
size, work out the absolute position that corresponds to in the file, and
round it down to a multiple of something like 2K or 4K (to land on a
likely sector boundary). Seek to that spot and start reading lines into a
circular buffer that holds N lines.

Unless your estimate of the maximum line length was substantially low,
when you reach the end of the file you have the last N lines in your
circular buffer. If you haven't thrown away at least one line, the
estimate was too low: extrapolate a new size based on how many lines you
DID read, seek to the new spot, and read more data (keeping in mind that
the _first_ line you read in the previous attempt will normally be only a
partial line, so the last line in the new batch needs to have the first
line from the previous batch appended to it).

The version of tail I've used for years has used 1024 as the estimated
maximum line length. This is usually _extremely_ high, but in a typical
case still only reads about 10K from the end of a file, so there would
be little perceptible gain in performance by using a lower estimate. At
the same time, it nearly always avoids having to do a second (or
subsequent) round of reading.
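
A simplified sketch of this estimate-and-seek approach, not the code of
any particular tail. The name last_lines_estimated and its parameters are
hypothetical; the defaults (1024 and 4096) follow the figures mentioned
above. Two simplifications are assumed: when the estimate turns out too
low, the sketch doubles the amount and re-reads from the new position
instead of extrapolating and stitching the partial line onto the previous
batch, and a std::deque capped at n entries stands in for the circular
buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the last n lines of `path` (without
// their '\n'), reading only an estimated tail of the file when possible.
std::vector<std::string> last_lines_estimated(const std::string& path,
                                              std::size_t n,
                                              std::size_t max_line = 1024,
                                              std::streamoff block = 4096) {
    assert(block > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in || n == 0) return {};

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return {};

    std::streamoff want = static_cast<std::streamoff>(n * max_line);
    for (;;) {
        // Start `want` bytes before the end, rounded down to a block multiple.
        const std::streamoff start =
            size > want ? (size - want) / block * block : 0;
        in.clear();                 // reset EOF/fail state from a previous pass
        in.seekg(start);

        std::deque<std::string> ring;   // keeps at most the n newest lines
        std::size_t dropped = 0;
        std::string line;
        while (std::getline(in, line)) {
            ring.push_back(line);
            if (ring.size() > n) {
                ring.pop_front();
                ++dropped;
            }
        }

        // The first line read may be partial unless we started at offset 0.
        // If at least one line was dropped, that partial line is gone.
        if (start == 0 || dropped > 0)
            return std::vector<std::string>(ring.begin(), ring.end());

        want *= 2;                  // estimate was too low: look further back
    }
}
```

With the defaults and n = 10 the first pass covers roughly the last 10K
of the file, which matches the figure quoted above.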

--
Later,
Jerry.

The universe is a figment of its own imagination.