From: Chris Martin on 18 Jun 2008 12:52

I have a file with lines like these:

abc1234|one;two;three
xyz3245|two;three
def9876|four

From a data perspective, this is two fields separated by '|', where the
second field contains multiple subitems separated by semicolons.

I want to use sed to output this:

abc1234|one
abc1234|two
abc1234|three
xyz3245|two
xyz3245|three
def9876|four

In other words, I want to pair the first field on each line (before the
'|') with each sub-element in the second field, one pair to each line,
separated by '|'.

I can replace the semicolons with '\n' newlines, which gives me this:

abc1234|one
two
three
xyz3245|two
three
def9876|four

It seems like this should be straightforward, but I can't figure out a way
to substitute the first item multiple times, so that the first item is
repeated on every line produced from the same input line. Suggestions from
sed gurus will be appreciated.

Chris Martin
University of North Carolina at Chapel Hill
School of Medicine

From: Dave B on 18 Jun 2008 17:12

Chris Martin wrote:
> I have a file with lines like these:
>
> abc1234|one;two;three
> xyz3245|two;three
> def9876|four
>
> From a data perspective, this is two fields separated by '|', where the
> second field contains multiple subitems separated by semicolons.
>
> I want to use sed to output this:
>
> abc1234|one
> abc1234|two
> abc1234|three
> xyz3245|two
> xyz3245|three
> def9876|four
>
> In other words, I want to pair the first field on each line (before the
> '|') with each sub-element in the second field, one pair to each line,
> separated by '|'.
>
> I can replace the semicolons with '\n' newlines, which gives me this:
>
> abc1234|one
> two
> three
> xyz3245|two
> three
> def9876|four
>
> It seems like this should be straightforward, but I can't figure out a way
> to substitute the first item multiple times, to end up with a repeat of
> the first item on each line in every line from that set. Suggestions from
> sed gurus will be appreciated.
>
>
> Chris Martin
> University of North Carolina at Chapel Hill
> School of Medicine

You can do that in more than one way. A very straightforward way is using
awk, like this:

awk -F '[|;]' '{for(i=2;i<=NF;i++)print $1"|"$i}' yourfile

That solution assumes that "|" and ";" are only used to delimit fields and
do not occur anywhere else.

If you want to use sed, you can do this:

sed ':n;s/\(\([^|]*\)|.*\);\([^;]*\)$/\1\n\2|\3/;tn' yourfile

(should work with most modern seds)

Hope this helps.

-- 
echo 0|sed 's909=oO#3u)o19;s0#0ooo)].O0;s()(0bu}=(;s#}#.1m"?0^2{#;
s)")9v2@3%"9$);so%op]t(p$e#!o;sz(z^+.z;su+ur!z"au;sxzxd?_{h)cx;:b;
s/\(\(.\).\)\(\(..\)*\)\(\(.\).\)\(\(..\)*#.*\6.*\2.*\)/\5\3\1\7/;
tb'|awk '{while((i+=2)<=length($1)-18)a=a substr($1,i,1);print a}'

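For readers who want to try the awk approach on the sample data, here is a minimal sketch. The function name split_pairs is hypothetical (it does not come from the thread), and the sketch relies on the same assumption Dave B states: '|' and ';' appear only as delimiters.

```shell
# split_pairs: hypothetical wrapper around the awk one-liner from the post.
# Reads the named files, or standard input when no file is given.
split_pairs() {
    # -F '[|;]' makes both '|' and ';' field separators, so $1 is the key
    # and $2..$NF are the sub-items. Each sub-item is printed with the key.
    awk -F '[|;]' '{ for (i = 2; i <= NF; i++) print $1 "|" $i }' "$@"
}

# Feed the three sample lines from the original question.
printf '%s\n' 'abc1234|one;two;three' 'xyz3245|two;three' 'def9876|four' | split_pairs
```

On the sample input this should print the six key|item lines requested in the original question. A line with no ';' has only two fields, so it passes through unchanged.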
From: Rakesh Sharma on 19 Jun 2008 04:22
On Jun 18, 9:52 pm, Chris Martin <c...(a)vfemail.net> wrote:
> I have a file with lines like these:
>
> abc1234|one;two;three
> xyz3245|two;three
> def9876|four
>
> From a data perspective, this is two fields separated by '|', where the
> second field contains multiple subitems separated by semicolons.
>
> I want to use sed to output this:
>
> abc1234|one
> abc1234|two
> abc1234|three
> xyz3245|two
> xyz3245|three
> def9876|four
>

sed -e '
s/^\([^|]*[|]\)\([^;]*\)[;]/\1\2\
\1/
P;D
' yourfile
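The script above is posted without explanation, so here is a commented sketch of how it appears to work. The function name split_pairs_sed is hypothetical, and the comments are a reading of the script, not the poster's own words.

```shell
# split_pairs_sed: hypothetical wrapper around the sed script from the post.
split_pairs_sed() {
    sed -e '
# Replace the first ";" with a newline followed by a copy of the key
# (everything up to and including the "|").
s/^\([^|]*[|]\)\([^;]*\)[;]/\1\2\
\1/
# P prints the pattern space up to the first newline. D then deletes that
# part and reruns the script on the remainder without reading new input.
# When no newline is left, D starts a normal new cycle.
P;D
' "$@"
}

# Feed the three sample lines from the original question.
printf '%s\n' 'abc1234|one;two;three' 'xyz3245|two;three' 'def9876|four' | split_pairs_sed
```

Each pass peels off one key|item pair, so a line with three sub-items goes through the script three times. The backslash followed by a literal newline in the replacement is the portable way to insert a newline, so this version does not depend on '\n' being understood in the replacement text.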