From: Kvetch Kvetch on
I have a script that parses a bunch of data off the wire and pipes it
into a FasterCSV array. I want to read in a text file and check certain
columns for matches against the data in the text file. There seem to be
tons of different ways to read in a file and "grep" it for something. I
would like to do this in the fastest way possible. Reading in the file
each time could potentially take too long, so I assume reading it into
memory would be my best bet, but perhaps not. I can arrange the data in
the file in any manner: one entry per line, delimited somehow, or
whatever.

Does anyone have any suggestions on my best bet to do this?

thanks
--
Posted via http://www.ruby-forum.com/.

From: Robert Klemme on
2010/2/11 Kvetch Kvetch <kvetch(a)gmail.com>:
> I have a script that parses a bunch of data off the wire and pipes it
> into a FasterCSV array.  I want to read in a text file and check certain
> columns for matches against the data in the text file.  There seem to be
> tons of different ways to read in a file and "grep" it for something.  I
> would like to do this in the fastest way possible.  Reading in the file
> each time could potentially take too long, so I assume reading it into
> memory would be my best bet, but perhaps not.  I can arrange the data in
> the file in any manner: one entry per line, delimited somehow, or
> whatever.
>
> Does anyone have any suggestions on my best bet to do this?

I opt for:

File.foreach file_name do |line|
  if /pattern/ =~ line
    puts "found!"
  end
end
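
For the original column-matching use case, it can pay to read the match
file once into a Set, which gives constant-time lookups afterwards. A
minimal sketch, assuming one entry per line in the match file; the file
path and column index are hypothetical placeholders:

```ruby
require "set"

# Read the match file once and build a Set of its lines
# (one entry per line, newline stripped).
def build_lookup(path)
  Set.new(File.foreach(path).map(&:chomp))
end

# Test whether a parsed CSV row matches: checks one column
# (index 2 here is just a placeholder) against the Set.
def match?(row, lookup, column = 2)
  lookup.include?(row[column])
end
```

That way the file is only read once, no matter how many rows come off
the wire.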

Reading is buffered, so you do not end up doing a syscall for every
line, and you do not need much memory, which you do if you use
File.read to slurp in the whole file. Still, for small to medium
sized files File.read may be faster. You need to measure the
different approaches to get figures you can rely on.
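
To get those figures, Ruby's standard Benchmark library is enough. A
sketch comparing the two approaches; the file name and pattern are
placeholders for your own data:

```ruby
require "benchmark"

# Time slurping the whole file vs scanning it line by line.
# Both variants count lines/occurrences so the work is comparable.
def compare_scans(file_name, pattern)
  Benchmark.bm(10) do |bm|
    bm.report("read:")    { File.read(file_name).scan(pattern).size }
    bm.report("foreach:") { File.foreach(file_name).count { |l| pattern =~ l } }
  end
end
```

Run it against a file of realistic size; results for tiny test files
will not tell you much.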

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/