From: toralf on 9 Sep 2009 10:39
I've a file containing tab separated values - most, but not all are quoted - and now I'm wondering how to substitute a non-quoted value like <tab>20090807<tab> by sth. like <tab>"20090807"<tab>

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
From: David Harmon on 9 Sep 2009 12:43
On Wed, 9 Sep 2009 14:39:59 +0000 (UTC) in comp.lang.perl.misc, toralf <toralf.foerster(a)gmx.de> wrote,
>I've a file containing tab separated values - most, but not all are
>quoted - and now I'm wondering how to substitute a non-quoted value like
><tab>20090807<tab> by sth. like <tab>"20090807"<tab>

What part are you having trouble with? What did you try and what happened?

I suggest using regex substitution s/// with \t for tab: capture the string of digits, then use $1 to put them back in the replacement string. Don't forget to account for the first and last fields not having both tabs.
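
A minimal sketch of the substitution idea described above, not code from the thread. The helper name quote_bare_digits is hypothetical, and the line is assumed to be chomped already. Instead of matching the tabs literally, this variant uses lookarounds for "tab or edge of the string", which covers the first and last fields and also lets two adjacent digit fields both match.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): put double quotes around every
# field of a tab-separated line that consists only of digits.
# Assumes the trailing newline has already been removed with chomp.
sub quote_bare_digits {
    my ($line) = @_;
    # (?<![^\t]) : the previous character is a tab, or we are at the start
    # (?![^\t])  : the next character is a tab, or we are at the end
    $line =~ s/(?<![^\t])(\d+)(?![^\t])/"$1"/g;
    return $line;
}

# Sample input made up for illustration
for my $line ("20090807\t\"abc\"\t42", "\"x\"\t20090807\t\"y\"") {
    print quote_bare_digits($line), "\n";
}
```

Fields that are already quoted, or that mix digits with other characters, are left alone because the lookarounds fail next to a quote or a letter.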
From: sln on 9 Sep 2009 15:09
On Wed, 9 Sep 2009 14:39:59 +0000 (UTC), toralf <toralf.foerster(a)gmx.de> wrote:
>I've a file containing tab separated values - most, but not all are
>quoted - and now I'm wondering how to substitute a non-quoted value like
><tab>20090807<tab> by sth. like <tab>"20090807"<tab>

You would have to do some complicated regex (possibly multiple regexes), but that depends on the data sample with regard to quotes, newlines, or other shapes it can have.
A simple way is to split on tab, fix up the value, then join it back together. It's ugly though.

-sln

------------
use strict;
use warnings;

# Sample records with embedded newlines, stray quotes, and empty fields
my @ar = (
    qq{\n12345145\t\n36367\t"qfqqbv"\n\t"0987"\t"asdf"a"\n },
    qq{\t"01234"\taaaa\t494848\t}
);

for my $str (@ar) {
    # Split on tab, strip any outer quotes from each field, re-quote it,
    # and join with a visible "<tab>" marker
    my $newstring = join "<tab>",
        map { /^(\s*|)"*(.*?|)"*(\s*|)$/; $1.'"'.$2.'"'.$3 }
        split (/\t/, $str);
    print "-> $newstring\n\n";
}
__END__
From: David Harmon on 9 Sep 2009 20:19
On Wed, 09 Sep 2009 12:09:05 -0700 in comp.lang.perl.misc, sln(a)netherlands.com wrote,
> (\s*|)

Just in case \s* fails to match the empty string, eh?
From: sln on 9 Sep 2009 22:09
On Wed, 09 Sep 2009 17:19:50 -0700, David Harmon <source(a)netcom.com> wrote:
>On Wed, 09 Sep 2009 12:09:05 -0700 in comp.lang.perl.misc,
>sln(a)netherlands.com wrote,
>
>> (\s*|)
>
>Just in case \s* fails to match the empty string, eh?

Doh, what was I thinking.

/^(\s*)"*(.*?)"*(\s*)$/

-sln
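
For reference, a small sketch of the split/fix/join approach using the corrected pattern. The helper names quote_field and quote_line are hypothetical, and two things differ from the posted script as assumptions of this sketch: the match result is checked before $1..$3 are used, and split is given a limit of -1 so trailing empty fields are kept.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap one field in double quotes using the corrected
# pattern. Existing outer quotes are stripped first, and leading/trailing
# whitespace stays outside the quotes. If the pattern does not match
# (for example a newline in the middle of the value), the field is
# returned unchanged rather than reusing stale capture variables.
sub quote_field {
    my ($field) = @_;
    return $field unless $field =~ /^(\s*)"*(.*?)"*(\s*)$/;
    return $1 . '"' . $2 . '"' . $3;
}

# Hypothetical helper: apply quote_field to every tab-separated field.
# The -1 limit keeps trailing empty fields, which split drops by default.
sub quote_line {
    my ($line) = @_;
    return join "\t", map { quote_field($_) } split /\t/, $line, -1;
}

print quote_line(qq{"01234"\taaaa\t494848}), "\n";
```

Note that this quotes every field, including non-numeric and empty ones, which is what the posted script does as well.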