From: Ed Morton on 26 Jul 2005 09:35

oraustin(a)hotmail.com wrote:
>> In this case, the question is about turning malformed "CSV" files
>> into a standardized format. Without a statement of how they are to
>> be interpreted, all attempts are mere conjecture.
>
> Thanks for all your input on this - it seems to have gone off at a
> little tangent (which is good I think).
> The CSV I need to convert are not badly formed - that was a big typo in
> my first example. There are only two types of lines
> 123,4556,efwref,134
> or
> "123","123412","sdfhuk,aqfds","1432"

Sigh:

sed '/^[^"]/s/\([^,]*\)\(,*\)/"\1"\2/g' file

Regards,

Ed.

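A minimal sketch for trying the command above on the two line types from the question. The wrapper name `quote_unquoted` and the sample input are illustrative and not from the thread; the sed expression is the one posted, reading standard input instead of `file`.

```shell
#!/bin/sh
# quote_unquoted is a hypothetical wrapper name; the sed expression is the posted one.
quote_unquoted() {
    # Only lines whose first character is not a double quote are edited:
    # each run of non-comma characters (a field) is wrapped in double quotes,
    # and the comma that follows it is kept as the separator.
    sed '/^[^"]/s/\([^,]*\)\(,*\)/"\1"\2/g'
}

# Sample input: one unquoted line and one already-quoted line.
printf '%s\n' '123,4556,efwref,134' '"123","123412","sdfhuk,aqfds","1432"' | quote_unquoted
```

With GNU sed the first sample line should come out as `"123","4556","efwref","134"`, and the second should pass through unchanged.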
From: William Park on 26 Jul 2005 10:12

oraustin(a)hotmail.com wrote:
> > In this case, the question is about turning malformed "CSV" files
> > into a standardized format. Without a statement of how they are to
> > be interpreted, all attempts are mere conjecture.
>
> Thanks for all your input on this - it seems to have gone off at a
> little tangent (which is good I think).
> The CSV I need to convert are not badly formed - that was a big typo in
> my first example. There are only two types of lines
> 123,4556,efwref,134
> or
> "123","123412","sdfhuk,aqfds","1432"
> I'm very grateful for all the scripts but firstly I'd like just to use
> one unix command - probably SED, other people have to use and modify
> what I write and they and I don't have time right now to become
> proficient in a number of commands.
>
> surely this is simple in SED.
> check if the first character of a line is not " and if so then
> substitute all , with "," in the rest of the line. Except I don't know
> how to write that :)
> Thanks Oliver

sed '/^"/! { s/,/","/g; s/^/"/; s/$/"/ }'

> On an aside maybe you can advise - we get data from many companies it
> always requires some manipulation - the example here is fairly simple
> - is PERL the best thing for me to focus on for file
> alteration/manipulation? Can this link in with Microsoft BizTalk
> which is the framework I believe we will use?

Any time you have more than 2 liners, Bash shell is a better way,
especially if you have relative newbies maintaining the code.

--
William Park <opengeometry(a)yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
           http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
          http://freshmeat.net/projects/bashdiff/

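The three substitutions in the command above can be read one step at a time. The following is a sketch only: the function name `quote_line` and the sample input are assumptions made for illustration, and the expression is split across lines purely so each step can carry a comment.

```shell
#!/bin/sh
# quote_line is a hypothetical name; the three substitutions are the posted ones.
quote_line() {
    sed '/^"/! {
        # 1. turn every field separator , into ","
        s/,/","/g
        # 2. add the opening quote of the first field
        s/^/"/
        # 3. add the closing quote of the last field
        s/$/"/
    }'
}

printf '%s\n' '123,4556,efwref,134' '"123","123412","sdfhuk,aqfds","1432"' | quote_line
```

The address `/^"/!` restricts the block to lines that do not already start with a double quote, so the second sample line is left alone.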
From: oraustin on 26 Jul 2005 10:50

> sed '/^"/! { s/,/","/g; s/^/"/; s/$/"/ }'

Thanks William - that's perfect except I had to reverse the order of
operations - no idea why.
As you supplied the full command, the first comma is replaced by 3
double quotes.
If I reverse the 2nd and 3rd sub-commands it works fine....strange.

Any other comments on whether I should investigate Perl as our data
manipulation language - I'm suggesting we buy visual studio.net perl
plugin. Can't afford (time or money) to go down the wrong path. Can't
afford not to bite the bullet and plump for a technology and start
learning.

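The reordering described above can be sketched next to the posted order for comparison. The function names and sample line are hypothetical. Because `s/^/"/` and `s/$/"/` touch opposite ends of the line, a POSIX-style sed run from a Bourne-type shell would be expected to give the same result either way; the thread does not say which sed, shell, or platform produced the three double quotes.

```shell
#!/bin/sh
# Both function names are hypothetical; the sed expressions come from the thread.

# Order as originally posted: commas, then leading quote, then trailing quote.
quote_posted() {
    sed '/^"/! { s/,/","/g; s/^/"/; s/$/"/ }'
}

# Order with the 2nd and 3rd substitutions swapped.
quote_swapped() {
    sed '/^"/! { s/,/","/g; s/$/"/; s/^/"/ }'
}

sample='123,4556,efwref,134'
posted=$(printf '%s\n' "$sample" | quote_posted)
swapped=$(printf '%s\n' "$sample" | quote_swapped)

printf 'posted:  %s\nswapped: %s\n' "$posted" "$swapped"
if [ "$posted" = "$swapped" ]; then
    echo 'same output'
else
    echo 'outputs differ'
fi
```

If the two lines differ on a given system, the shell's quoting rules are a reasonable first place to look, since the sed script itself contains double quotes, `!`, and `$`.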
From: Ed Morton on 26 Jul 2005 11:11

oraustin(a)hotmail.com wrote:
>> sed '/^"/! { s/,/","/g; s/^/"/; s/$/"/ }'
>
> Thanks William - that's perfect except I had to reverse the order of
> operations - no idea why.
> As you supplied the full command, the first comma is replaced by 3
> double quotes.
> If I reverse the 2nd and 3rd sub-commands it works fine....strange.
>
> Any other comments on whether I should investigate Perl as our data
> manipulation language - I'm suggesting we buy visual studio.net perl
> plugin. Can't afford (time or money) to go down the wrong path. Can't
> afford not to bite the bullet and plump for a technology and start
> learning.
>

Get gawk from http://www.gnu.org/software/gawk/. It's free, powerful,
well documented, and simple to use/understand if you have any
experience at all with an Algol-like language (C, etc.)

Ed.

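For a sense of what the same conversion might look like in awk, here is a rough sketch. Nothing in it was posted in the thread: the function name `quote_with_awk`, the variable names, and the approach (split on commas, re-join with quotes) are all assumptions, and it relies on the stated guarantee that unquoted lines contain no embedded commas.

```shell
#!/bin/sh
# quote_with_awk is a hypothetical name; this script does not appear in the thread.
quote_with_awk() {
    awk '
    /^"/ { print; next }              # already quoted: print unchanged
    {
        n = split($0, field, ",")     # split the unquoted line on commas
        out = ""
        for (i = 1; i <= n; i++)      # re-join, wrapping each field in quotes
            out = out (i > 1 ? "," : "") "\"" field[i] "\""
        print out
    }'
}

printf '%s\n' '123,4556,efwref,134' '"123","123412","sdfhuk,aqfds","1432"' | quote_with_awk
```

It uses only features common to POSIX awk implementations, so gawk is not strictly required to run it.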
From: William Park on 26 Jul 2005 11:54

oraustin(a)hotmail.com wrote:
> > sed '/^"/! { s/,/","/g; s/^/"/; s/$/"/ }'
>
> Thanks William - that's perfect except I had to reverse the order of
> operations - no idea why. As you supplied the full command, the first
> comma is replaced by 3 double quotes. If I reverse the 2nd and 3rd
> sub-commands it works fine....strange.
>
> Any other comments on whether I should investigate Perl as our data
> manipulation language - I'm suggesting we buy visual studio.net perl
> plugin. Can't afford (time or money) to go down the wrong path. Can't
> afford not to bite the bullet and plump for a technology and start
> learning.

Well, if you buy Visual Studio.NET, then you would definitely be biting
a bullet or two. If this is the kind of data processing you need, then
hire me, and I'll fix this straight.

--
William Park <opengeometry(a)yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
           http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
          http://freshmeat.net/projects/bashdiff/