From: Peter J. Holzer on
On 2010-05-17 00:26, sln(a)netherlands.com <sln(a)netherlands.com> wrote:
> On Sun, 16 May 2010 04:00:02 GMT, PerlFAQ Server <brian(a)theperlreview.com> wrote:
>>5.4: How do I delete the last N lines from a file?
>>
>> (contributed by brian d foy)
>>
>> The easiest conceptual solution is to count the lines in the file then
>> start at the beginning and print the number of lines (minus the last N)
>> to a new file.

This takes two passes: First to get the number of lines, second to copy.
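For concreteness, that two-pass version might look roughly like this in Perl (a sketch; the file name, the demo data and N are made up here so it runs standalone):

```perl
use strict;
use warnings;

# Hypothetical demo data so the sketch is self-contained.
my ($file, $n) = ('demo.txt', 2);
open my $fh, '>', $file or die "open: $!";
print {$fh} "line $_\n" for 1 .. 5;
close $fh;

# First pass: count the lines.
open my $in, '<', $file or die "open $file: $!";
my $total = 0;
$total++ while <$in>;

# Second pass: copy all but the last N lines to a new file.
seek $in, 0, 0;
open my $out, '>', "$file.new" or die "open $file.new: $!";
print {$out} scalar <$in> for 1 .. $total - $n;
close $_ for $in, $out;
rename "$file.new", $file or die "rename: $!";
```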


>> Most often, the real question is how you can delete the last N lines
>> without making more than one pass over the file, or how to do it with a
>> lot of copying. The easy concept is the hard reality when you might have
>> millions of lines in your file.
>
> I believe, "or how to do it with a lot of copying." was meant to be
> "or how to do it without a lot of copying."

Probably.


> And I'm not so sure you're not conflating "making more than one pass over the file"
> with reading/writing the file more than once.

See above. As you showed, you can do it in one pass at the expense of
using more memory.


>> One trick is to use "File::ReadBackwards", which starts at the end of
>
> Is this really a trick?
>
> I can't remember if there is a truncate at file position primitive.

There is, at least on unix-like systems. See truncate(2) and ftruncate(2).

> If I take a guess one way, I would say this approach would work as fast
> as any:
>
> create a line stack, the size of N
> read each line, store line in stack, increment a counter
> when the counter equals N, drop the oldest line into a new file, newest line to stack.
> repeat until end of old file
> close new file
> delete old file
> rename new file to old

That still means you have to read the whole file and write all but the
last N lines, and you have to keep N lines in memory.
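The buffered one-pass approach quoted above might be sketched like this (names and demo data are invented; note the buffer behaves as a queue, oldest line out first, rather than a stack):

```perl
use strict;
use warnings;

# Hypothetical demo data so the sketch is self-contained.
my ($file, $n) = ('demo.txt', 2);
open my $fh, '>', $file or die "open: $!";
print {$fh} "line $_\n" for 1 .. 5;
close $fh;

open my $in,  '<', $file       or die "open: $!";
open my $out, '>', "$file.new" or die "open: $!";
my @buf;    # holds the most recent N lines seen so far
while (my $line = <$in>) {
    push @buf, $line;
    # Once the buffer holds more than N lines, its oldest line cannot
    # be among the last N of the file, so it is safe to write out.
    print {$out} shift @buf if @buf > $n;
}
close $_ for $in, $out;
rename "$file.new", $file or die "rename: $!";
```

This still reads the whole file and writes all but the last N lines, as noted above; it just avoids a separate counting pass.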

With File::ReadBackwards and truncate you only have to read N lines and
write none (truncate has to update metadata, of course, but if you cut
off only a small part of the file that's cheap), and you don't have to
keep lines in memory. (On the other hand, reading backwards is usually a
lot slower than reading forwards.)
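A sketch of that approach, close to what perlfaq5 itself suggests (it assumes the File::ReadBackwards module from CPAN is installed; file name, demo data and N are made up):

```perl
use strict;
use warnings;
use File::ReadBackwards;    # CPAN module, not core Perl

# Hypothetical demo data so the sketch is self-contained.
my ($file, $n) = ('big.log', 10);
open my $fh, '>', $file or die "open: $!";
print {$fh} "line $_\n" for 1 .. 12;
close $fh;

my $bw = File::ReadBackwards->new($file)
    or die "can't read $file: $!";

# Read N lines from the end; tell() then reports the byte offset
# at which those last N lines begin.
$bw->readline for 1 .. $n;
truncate $file, $bw->tell or die "truncate $file: $!";
```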

hp

From: Peter J. Holzer on
On 2010-05-17 17:54, Ralph Malph <ralph(a)happydays.com> wrote:
> On 5/17/2010 12:43 PM, Ralph Malph wrote:
> [snip]
>> $ time cat puke | wc -l | xargs echo -10000 + | bc \
>> | xargs echo head puke -n | sh > top_n-10000
[...]
> lines = `wc -l puke`
> let num_lines=$(($lines-10000))
> head puke -n $num_lines

That's still needlessly complicated.

head -n -10000 puke

hp
From: sln on
On Mon, 17 May 2010 21:19:16 +0200, "Peter J. Holzer" <hjp-usenet2(a)hjp.at> wrote:

>On 2010-05-17 00:26, sln(a)netherlands.com <sln(a)netherlands.com> wrote:
>>
>> I can't remember if there is a truncate at file position primitive.
>
>There is at least on unix-like system. See truncate(2) and ftruncate(2).

I guess there is the Win32 _chsize(), which takes a file descriptor
and a size parameter. This either expands or truncates the file.
I'm not so sure this doesn't just rewrite the file anyway instead of
altering the file table entry. I've never used it.

>
>> If I take a guess one way, I would say this approach would work as fast
>> as any:
>>
>> create a line stack, the size of N
>> read each line, store line in stack, increment a counter
>> when the counter equals N, drop the oldest line into a new file, newest line to stack.
>> repeat until end of old file
>> close new file
>> delete old file
>> rename new file to old
>
>That still means you have to read the whole file and write all but the
>last N lines, and you have to keep N lines in memory.
>
>With File::ReadBackwards and truncate you only have to read N lines and
>write none (truncate has to update metadata, of course, but if you cut
>off only a small part of the file that's cheap), and you don't have to
>keep lines in memory. (On the other hand, reading backwards is usually a
>lot slower than reading forwards.)
>

Yes, I guess reading backwards looking for newlines would be the logically
shorter solution, but what's the need for the truncate at the end, and in
translated text mode, anyway?

-sln
From: sln on
On Mon, 17 May 2010 13:47:53 -0700, sln(a)netherlands.com wrote:

>I guess there is the Win32 _chsize(), which takes a file descriptor
>and a size parameter. This either expands or truncates the file.
>I'm not so sure this doesn't just rewrite the file anyway instead of
>altering the file table entry. I've never used it.
>

I have no idea what Perl uses (if at all) for truncation
on Windows.

There is also SetEndOfFile() if you are working with HANDLEs.

Some docs:
"The SetEndOfFile function moves the end-of-file (EOF) position
for the specified file to the current position of the file pointer.
This function sets the physical end of a file (as indicated by
allocated clusters). To set the logical end of a file,
use the SetFileValidData function."

HANDLEs in Windows allow all sorts of weird things, like
sync/async I/O. That model is the basis of, and has its roots in, the kernel:
every read/write/open/close of all files and all devices, absolutely
everything, is based on a single paradigm of reading, writing, opening
or closing a file, where a "file" is a device and a "device" is a file.

All the stubs behind Win32 funnel down to a relatively small core
of kernel-mode functions.

-sln