From: Jim Langston on 26 May 2008 07:18

Gerhard Wolf wrote:
> Hi,
>
> how can I get the last n lines of a text file without looping from
> the front of the file (bad performance on huge files)?

Open the file for reading in binary mode, since seeking to arbitrary byte offsets is not reliable in text mode. Go to the end of the file and read backwards one character at a time until you reach a newline. At that point you either already have the line (if you bothered to build a string as you went) or you can read the line forward from that position. Repeat until you have n lines.

--
Jim Langston
tazmaster(a)rocketmail.com
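A minimal sketch of the backward scan described in the post above, in case it helps to see it spelled out. The function name `last_lines_backward` is made up for illustration, and the handling of a trailing newline and of `\r` characters are assumptions that the post does not address.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns up to the last n lines of the file at `path`,
// found by stepping backwards from the end one byte at a time.
std::vector<std::string> last_lines_backward(const std::string& path, std::size_t n)
{
    std::vector<std::string> lines;
    std::ifstream in(path, std::ios::binary);  // binary mode so offsets are byte offsets
    if (!in || n == 0) return lines;

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return lines;

    std::streamoff pos = size;
    std::string current;    // characters of the line being collected, in reverse order
    bool last_byte = true;  // true only while looking at the file's final byte

    while (pos > 0 && lines.size() < n) {
        --pos;
        in.seekg(pos, std::ios::beg);
        char c;
        in.get(c);
        if (c == '\n') {
            // Assumption: a newline as the very last byte ends the final line
            // rather than starting an extra empty one.
            if (!last_byte) {
                std::reverse(current.begin(), current.end());
                lines.push_back(current);
                current.clear();
            }
        } else if (c != '\r') {  // assumption: drop CRs so CRLF files also work
            current.push_back(c);
        }
        last_byte = false;
    }

    // Reached the start of the file while still short of n lines:
    // whatever was collected is the file's first line.
    if (pos == 0 && lines.size() < n) {
        std::reverse(current.begin(), current.end());
        lines.push_back(current);
    }

    std::reverse(lines.begin(), lines.end());  // restore file order
    return lines;
}
```

Seeking before every single byte is slow in practice; reading a block at a time and scanning it backwards in memory would be the obvious refinement.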
From: Francis Glassborow on 26 May 2008 08:53

Jim Langston wrote:
> Gerhard Wolf wrote:
>> Hi,
>>
>> how can I get the last n lines of a text file without looping from
>> the front of the file (bad performance on huge files)?
>
> Open the file for reading. Go to the end of the file. Start reading one
> character at a time backwards until you reach a newline. Then you either
> have the line (if you bothered to build a string) or you can read the
> line from that position. This would require you to open the file in binary
> mode.
>

I can imagine that for n (the number of lines) greater than 1 this could be much slower than reading the file once to create an index table of the lines and then using that table to access the lines you want.

Note that if you have a text file that you read frequently but write infrequently, creating an index file for use between writes might be worth the effort.
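The index-table idea in the post above could look roughly like this. It is only a sketch: `build_line_index` and `last_lines_indexed` are hypothetical names, the index is kept in memory rather than written to an index file, and the caller is assumed to rebuild it whenever the text file changes.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: one pass over the file, recording the byte offset
// at which each line begins.
std::vector<std::streamoff> build_line_index(const std::string& path)
{
    std::vector<std::streamoff> starts;
    std::ifstream in(path, std::ios::binary);
    if (!in) return starts;

    std::streamoff pos = 0;
    bool at_line_start = true;
    char c;
    while (in.get(c)) {
        if (at_line_start) {
            starts.push_back(pos);
            at_line_start = false;
        }
        if (c == '\n') at_line_start = true;
        ++pos;
    }
    return starts;
}

// Hypothetical helper: uses a previously built index to jump straight to
// the first of the last n lines and read from there to the end.
std::vector<std::string> last_lines_indexed(const std::string& path,
                                            const std::vector<std::streamoff>& starts,
                                            std::size_t n)
{
    std::vector<std::string> lines;
    if (n == 0 || starts.empty()) return lines;

    std::ifstream in(path, std::ios::binary);
    if (!in) return lines;

    const std::size_t first = starts.size() > n ? starts.size() - n : 0;
    in.seekg(starts[first], std::ios::beg);

    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        lines.push_back(line);
    }
    return lines;
}
```

Building the index costs a full read of the file, so this only pays off when the same index serves many lookups, which is the read-often, write-rarely case the post mentions.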
From: Jerry Coffin on 26 May 2008 12:21

In article <69vb1oF34g1etU1(a)mid.individual.net>, quisquiliae(a)gmx.de says...
> Hi,
>
> how can I get the last n lines of a text file without looping from
> the front of the file (bad performance on huge files)?

Estimate the maximum possible length of a line and multiply it by N to work out how much data you need to read from the end of the file. If you can live with a tiny bit of technically non-portable code, get the file size, compute the absolute position in the file that corresponds to, and round it down to a multiple of something like 2K or 4K (to land on a likely sector boundary). Seek to that position and start reading into a circular buffer. Unless your estimate of the maximum line length was substantially low, you will have N lines in your circular buffer when you reach the end of the file.

If you haven't thrown away at least one line, extrapolate a new size based on how many lines you DID read, seek to the new spot, and read more data (keeping in mind that the _first_ line you read in the previous attempt will normally be only a partial line, so the last line of the new batch needs to have the first line of the previous batch appended to it).

The version of tail I've used for years has used 1024 as the estimated maximum line length. This is usually _extremely_ high, but in a typical case it still reads only about 10K from the end of the file, so there would be little perceptible gain in performance from using a lower estimate. At the same time, it nearly always avoids having to do a second (or subsequent) round of reading.

--
Later,
Jerry.

The universe is a figment of its own imagination.
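A rough sketch of the estimate-and-seek approach from the post above. Several things here are assumptions rather than anything the post specifies: the name `tail_estimated`, the use of `std::deque` as a stand-in for the circular buffer, a 4096-byte rounding unit, and a simplified retry that doubles the estimate and re-reads from the new position instead of extrapolating a size and stitching the partial line onto the new batch.

```cpp
#include <cstddef>
#include <deque>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns up to the last n lines of `path`, reading only
// an estimated window at the end of the file. est_line_len defaults to 1024,
// the figure quoted in the post.
std::vector<std::string> tail_estimated(const std::string& path, std::size_t n,
                                        std::streamoff est_line_len = 1024)
{
    std::vector<std::string> result;
    std::ifstream in(path, std::ios::binary);
    if (!in || n == 0) return result;
    if (est_line_len < 1) est_line_len = 1;  // keep the doubling below meaningful

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    const std::streamoff block = 4096;  // assumed sector-friendly rounding unit

    for (;;) {
        const std::streamoff want = est_line_len * static_cast<std::streamoff>(n);
        std::streamoff start = size > want ? size - want : 0;
        start -= start % block;  // round down to a block boundary

        in.clear();  // reset eof/fail state left over from a previous pass
        in.seekg(start, std::ios::beg);

        std::deque<std::string> ring;  // holds at most the n most recent lines
        std::size_t seen = 0;
        std::string line;
        while (std::getline(in, line)) {
            if (!line.empty() && line.back() == '\r') line.pop_back();
            ++seen;
            ring.push_back(line);
            if (ring.size() > n) ring.pop_front();
        }

        // The first line read after a mid-file seek is probably partial, so
        // the window is trusted only if that line was discarded (seen > n)
        // or the read started at the very beginning of the file.
        if (start == 0 || seen > n) {
            result.assign(ring.begin(), ring.end());
            return result;
        }
        est_line_len *= 2;  // simplified retry: widen the window and re-read
    }
}
```

With the default estimate and n = 10, the first pass reads roughly 10 x 1024 bytes plus whatever the rounding adds, which matches the "about 10K" figure in the post. Re-reading on retry costs more I/O than the stitching the post describes, but retries should be rare when the estimate is generous.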