|
From: Marek Stepanek on 25 Jul 2005 07:14

On 25.07.2005 12:41, in article dc2fi5$9le$1(a)newsg4.svr.pol.co.uk, "John L" <jl(a)lammtarra.notthisbit.fslife.co.uk> wrote:

> <oraustin(a)hotmail.com> wrote in message
> news:1122282459.844933.97360(a)g49g2000cwa.googlegroups.com...
>> Firstly - is there a sed newsgroup for me to be posting to?
>> Quick question now :)
>> I have a CSV file which I created by concatenation of multiple files.
>> In some of the files the fields are also delimited with double quotes.
>> 123,ouahfds,12341
>> "123,"fsdfsd,ewfdw",14324
>>
>> I'd like to double quote all the fields. Not sure how to achieve this
>> - please help

And why not simply strip out all the " characters? Are there some quotation marks in your CSV file which you want to keep? Or does this help?

% sed 's/\"//g' < test.txt > test02.txt

Greetings from Munich

--
PODIUM INTERNATIONAL // the embassy for talented young musicians
Marek Stepanek   mstep_[at]_PodiumInternational_[dot]_org
http://www.PodiumInternational.org
From: oraustin on 25 Jul 2005 07:17

The second line of my corrected example shows why I can't just strip the " - some of the text contains commas.

Just out of interest - why do you put a backslash before the " ? I don't seem to need that with GNU sed on Windows. Also, what is the < redirection arrow for? I don't have to use that either.
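A small sketch of the two points raised above, assuming a POSIX-style shell and a common sed (GNU or BSD); the sample file is hypothetical and created only for this demonstration. Inside single quotes, `\"` and `"` are treated as the same pattern by these sed implementations, and `< file` only makes the shell open the file on sed's standard input instead of passing its name as an argument. The Windows command prompt has different quoting rules, so behaviour there may differ.

```shell
# Hypothetical sample data, written to a temporary file for the demo.
sample=$(mktemp)
printf '%s\n' '123,houlf,134' '"132","housdfgd,sdfd","14324"' > "$sample"

# Form used in the earlier post: escaped quote, input via shell redirection.
via_redirect=$(sed 's/\"//g' < "$sample")

# Same substitution: unescaped quote, file name passed as an argument.
via_argument=$(sed 's/"//g' "$sample")

# Both forms produce the same text.
[ "$via_redirect" = "$via_argument" ] && echo "same output"

# The second record now looks like four fields instead of three,
# which is why stripping every quote is not enough for this data.
printf '%s\n' "$via_argument"

rm -f "$sample"
```

The last command prints `123,houlf,134` and `132,housdfgd,sdfd,14324`: the comma inside the formerly quoted field can no longer be told apart from a separator.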
From: Michael Tosch on 25 Jul 2005 12:13

oraustin(a)hotmail.com wrote:
> Firstly - is there a sed newsgroup for me to be posting to?
> Quick question now :)
> I have a CSV file which I created by concatenation of multiple files.
> In some of the files the fields are also delimited with double quotes.
> 123,ouahfds,12341
> "123,"fsdfsd,ewfdw",14324
>
> I'd like to double quote all the fields. Not sure how to achieve this
> - please help
> Thanks
> Oliver
>

This is easier with awk:

awk -F, '{for(i=1;i<=NF;++i){if($i!~/"*"/){$i="\""$i"\""}};print}' OFS=,

--
Michael Tosch @ hp : com
From: Michael Tosch on 25 Jul 2005 12:19

Michael Tosch wrote:
> oraustin(a)hotmail.com wrote:
>
>> Firstly - is there a sed newsgroup for me to be posting to?
>> Quick question now :)
>> I have a CSV file which I created by concatenation of multiple files.
>> In some of the files the fields are also delimited with double quotes.
>> 123,ouahfds,12341
>> "123,"fsdfsd,ewfdw",14324
>>
>> I'd like to double quote all the fields. Not sure how to achieve this
>> - please help
>> Thanks
>> Oliver
>>
>
> This is easier with awk:
>
> awk -F, '{for(i=1;i<=NF;++i){if($i!~/"*"/){$i="\""$i"\""}};print}' OFS=,
>

To satisfy the purists and Ed's: one can omit some curly brackets:

awk -F, '{for(i=1;i<=NF;++i)if($i!~/"*"/)$i="\""$i"\"";print}' OFS=,

--
Michael Tosch @ hp : com
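A hedged sketch of how the awk one-liner above behaves on sample records. The wrapper name `quote_fields` is hypothetical, and `-v OFS=,` is used here in place of the trailing `OFS=,` operand from the post; the program text is otherwise the same. Because `-F,` splits on every comma, a quoted field with embedded commas is seen as several pieces, and a piece that happens to contain no quote gets quoted on its own.

```shell
# Hypothetical wrapper around the one-liner from the post.
quote_fields() {
    awk -F, -v OFS=, '{
        # Quote every comma-separated piece that contains no double quote.
        for (i = 1; i <= NF; ++i)
            if ($i !~ /"*"/)
                $i = "\"" $i "\""
        print
    }'
}

# Unquoted record: every field gets quoted.
printf '%s\n' '123,houlf,134' | quote_fields
# prints "123","houlf","134"

# Already quoted record: each piece contains a quote, so nothing changes.
printf '%s\n' '"132","housdfgd,sdfd","14324"' | quote_fields
# prints "132","housdfgd,sdfd","14324"

# Limitation: a quoted field with two embedded commas is damaged,
# because the middle piece (b) contains no quote.
printf '%s\n' '"a,b,c",1' | quote_fields
# prints "a,"b",c","1"
```

For the data shown in the thread the one-liner is sufficient; the third case only matters if a quoted field can contain more than one comma.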
From: Chris F.A. Johnson on 25 Jul 2005 12:41
On 2005-07-25, oraustin(a)hotmail.com wrote:
> Firstly - Alan Connor - Why bother to post?

First, ignore AC; most of us have him killfiled. Wading through his rants to find the occasional nugget of helpful information is not worth the time and effort.

> I have no choice than to use Google Groups.

Second, Google Groups does let you quote properly, though not by default:

"If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers."

Third, you do have a choice; it's just a matter of finding it.

> I am not knowledgeable about sed but in the past have
> been active in the Eiffel Newsgroup and to a much lesser extent C.
> Ok back to the topic
> sorry my mistake about the unbalanced quotes.
> 123,houlf,134
> "132","housdfgd,sdfd","14324"
>
> I could do the concatenation better - how about before concatenation if
> the first character of the file is not a " then substitute , with ","
> in the file - I could do with some help with the command for this.
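The substitution proposed in the quoted text can be sketched with sed alone. This is a variation, not the poster's exact plan: it tests the first character of each line rather than the first character of the file, so it can also be run on the already concatenated data. The function name `quote_unquoted_lines` is hypothetical, and the sketch assumes that unquoted lines never contain a comma inside a field.

```shell
# Hypothetical helper: add quotes only to lines that do not start with ".
quote_unquoted_lines() {
    sed '/^"/!{
        s/,/","/g
        s/^/"/
        s/$/"/
    }'
}

printf '%s\n' '123,houlf,134' '"132","housdfgd,sdfd","14324"' | quote_unquoted_lines
# prints:
# "123","houlf","134"
# "132","housdfgd,sdfd","14324"
```

Lines that already begin with a quotation mark are passed through untouched, so running the filter twice does not double the quotes.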
Pipe the file through this script to add quotes to all fields:

CR=$(printf '\r')    ## carriage return (not defined in the script as posted)

csv_split() {
    csv_vnum=0                   ## field number
    csv_record=${1%"${CR}"}      ## remove carriage return, if any
    csv_last=$csv_record         ## start fresh for each record, so a value left
                                 ## over from the previous record cannot end the
                                 ## loop early (added; not in the script as posted)
    unset record_vals            ## we need a pristine (global) array

    ## remove each field from the record and store in record_vals[]
    ## when all the fields are stored, $csv_record will be empty
    while [ -n "$csv_record" ]
    do
        case $csv_record in
            ## if $csv_record starts with a quotation mark,
            ## extract up to '",' or end of record
            \"*)
                csv_right=${csv_record#*\",}
                csv_value=${csv_record%%\",*}
                record_vals[$csv_vnum]=${csv_value#\"}
                ;;
            ## otherwise extract to the next comma
            *)
                record_vals[$csv_vnum]=${csv_record%%,*}
                csv_right=${csv_record#*,}
                ;;
        esac
        csv_record=${csv_right}  ## the remains of the record

        ## If what remains is the same as the record before the previous
        ## field was extracted, it is the last field, so store it and exit
        ## the loop
        if [ "$csv_record" = "$csv_last" ]
        then
            csv_record=${csv_record#\"}
            record_vals[$csv_vnum]=${csv_record%\"}
            break
        fi
        csv_last=$csv_record
        csv_vnum=$(( $csv_vnum + 1 ))
    done
}

put_csv() {
    _PUT_CSV=
    for field in "$@"            ## loop through the fields (on command line)
    do
        _PUT_CSV=${_PUT_CSV:+$_PUT_CSV,}\"$field\"
    done
    _PUT_CSV=${_PUT_CSV%,}       ## remove trailing comma
    printf "%s\n" "$_PUT_CSV"
}

while IFS= read -r line
do
    csv_split "$line"
    put_csv "${record_vals[@]}"
done

--
Chris F.A. Johnson <http://cfaj.freeshell.org>
==================================================================
Shell Scripting Recipes: A Problem-Solution Approach, 2005, Apress
<http://www.torfree.net/~chris/books/cfaj/ssr.html>
|