From: Rob Biedenharn on
On Jul 2, 2010, at 1:15 PM, Christian Smith wrote:
> Rob Biedenharn wrote:
>> On Jul 2, 2010, at 7:11 AM, Brian Candler wrote:
>>>>
>>>> csv4 96 lines 10 cols
>>>>
>>>> Thanks!
>>>>
>>>> Seed
>>>
>>> Why use fastercsv?
>>> cat csv1 csv2 csv3 >csv4
>>> would meet your requirement.
>>
>> except that you'd have headers from csv2 and csv3 (but perhaps your
>> line counts imply no headers?)
>>
>>>
>>> But if you want to use fastercsv, then open each file in turn, read it
>>> a line at a time, and output the line you just read.
>>> --
>>
>> If the files are small-ish, you can avoid a chicken-and-egg problem of
>> the headers by reading all the input files (saving the headers from
>> the first), then writing it all out from memory.
>>
>> -Rob
>>
>> Rob Biedenharn
>> Rob(a)AgileConsultingLLC.com http://AgileConsultingLLC.com/
>> rab(a)GaslightSoftware.com http://GaslightSoftware.com/
>
> The files aren't smallish, but memory isn't an issue. I would love to be
> able to do this. I am able to read the 3 files into an array, but it's
> writing them back out as 1 csv that I am having trouble with. I would
> assume this would be a lot faster than a line-by-line read/write approach.


OK, let's read them all in and then write out one file...

headers = nil
all_rows = []
input_files.each do |input_file|
  csv = FasterCSV.table(input_file, :headers => true)
  in_headers, *in_rows = csv.to_a
  headers ||= in_headers
  all_rows.concat(in_rows)
end
FasterCSV.open(output_file, 'w') do |csv|
  csv << headers
  all_rows.each {|row| csv << row }
end

The full example is at:
http://gist.github.com/461784

The details may have to change a bit depending on your circumstances,
but the general idea is sound.
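
For anyone on Ruby 1.9 or later, where the standard library's CSV is based on FasterCSV, the same read-everything-then-write idea can be sketched as below. This is only a sketch under some assumptions: the method name merge_csv_in_memory is made up for illustration, every input file is assumed to start with the same header row, and plain CSV.read is swapped in for the table call so that header text is written back exactly as it was read (as far as I recall, table applies symbol header converters, which would normalize the header names).

```ruby
require 'csv'

# Hypothetical helper (not from the original post): merge CSV files that
# share one header row, keeping only the headers of the first file.
# Returns the number of data rows written.
def merge_csv_in_memory(input_files, output_file)
  headers = nil
  all_rows = []
  input_files.each do |input_file|
    # Without :headers, CSV.read returns a plain array of arrays, so the
    # first element is the untouched header row.
    in_headers, *in_rows = CSV.read(input_file)
    headers ||= in_headers
    all_rows.concat(in_rows)
  end
  CSV.open(output_file, 'w') do |csv|
    csv << headers if headers          # skip if every input was empty
    all_rows.each { |row| csv << row }
  end
  all_rows.size
end
```

Called as merge_csv_in_memory(%w[csv1 csv2 csv3], 'csv4'), it holds every row in memory at once, which matches the "memory isn't an issue" case above.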

-Rob

Rob Biedenharn
Rob(a)AgileConsultingLLC.com http://AgileConsultingLLC.com/
rab(a)GaslightSoftware.com http://GaslightSoftware.com/


From: brabuhr on
On Fri, Jul 2, 2010 at 3:27 PM, Rob Biedenharn
<Rob(a)agileconsultingllc.com> wrote:
> OK, let's read them all in and then write out one file...
>
> headers = nil
> all_rows = []
> input_files.each do |input_file|
>  csv = FasterCSV.table(input_file, :headers => true)
>  in_headers, *in_rows = csv.to_a
>  headers ||= in_headers
>  all_rows.concat(in_rows)
> end
> FasterCSV.open(output_file, 'w') do |csv|
>  csv << headers
>  all_rows.each {|row| csv << row }
> end

FasterCSV.open(output_file, 'w') do |ocsv|
  input_files.each_with_index do |input_file, i|
    FasterCSV.foreach(input_file, :headers => true, :return_headers => true) do |row|
      next if i > 0 and row.header_row?
      ocsv << row
    end
  end
end
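
The streaming version above carries over to the standard library's CSV (Ruby 1.9 and later) almost unchanged. A minimal sketch, assuming every input file begins with the same header row; the method name merge_csv_streaming is hypothetical and only there to make the snippet callable:

```ruby
require 'csv'

# Hypothetical helper (not from the original post): copy rows from each
# input file straight to the output, one row at a time, writing the header
# row only for the first file.
def merge_csv_streaming(input_files, output_file)
  CSV.open(output_file, 'w') do |out|
    input_files.each_with_index do |input_file, i|
      # :return_headers makes foreach yield the header row too, so it can
      # be passed through once and skipped for every later file.
      CSV.foreach(input_file, headers: true, return_headers: true) do |row|
        next if i > 0 && row.header_row?
        out << row
      end
    end
  end
end
```

Only one row is held in memory at a time, so this variant should also cope with inputs that are too large to read in full.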