From: Jay on 30 Jan 2007 12:13

Howdy,

I'm trying to break an input string into multiple pieces using a series of delimiters that start with an asterisk. Following the asterisk is a multiple-character identifier, immediately followed by a data string of variable length. The input string may contain more than one identifier anywhere in the string. In all, there are 50+ identifiers to search for, and the asterisk is allowed to be part of the data string as long as what follows it isn't defined as an identifier (it would be treated as another identifier at that point).

Here is a simple example:
    *CZ1 2.3 4-56 *fuuuS24364 08 23 72

I'd like to break this into:
    CZ
    1 2.3 4-56
    fuuu
    S24364 08 23 72

I have tried the pattern (?:\*(CZ|fuuu)(.*)), which produces the following output:
    CZ
    1 2.3 4-56 *fuuuS24364 08 23 72

How can I force it to repeat the capturing?

Thanks,
Jay

From: Jim Gibson on 30 Jan 2007 12:43

In article <1170177214.014726.69240(a)k78g2000cwa.googlegroups.com>, Jay <JaythePCguy(a)gmail.com> wrote:

> Howdy,
>
> I'm trying to break an input string into multiple pieces using a
> series of delimiters that start with an asterisk. Following the
> asterisk is a multiple-character identifier immediately followed by a
> data string of variable length. The input string may contain more than
> one identifier anywhere in the string. In all, there are 50+
> identifiers to search for and the asterisk is allowed to be part of the
> data string as long as it isn't defined as an identifier (it would be
> treated as another identifier at that point).
>
> Here is a simple example:
> *CZ1 2.3 4-56 *fuuuS24364 08 23 72
>
> I'd like to break this into
> CZ
> 1 2.3 4-56
> fuuu
> S24364 08 23 72
>
> I have tried the pattern (?:\*(CZ|fuuu)(.*)), which produces the
> following output:
> CZ
> 1 2.3 4-56 *fuuuS24364 08 23 72
>
> How can I force it to repeat the capturing?

Use split and capture the delimiters:

    $x = "*CZ1 2.3 4-56 *fuuuS24364 08 23 72";
    @f = split( /(\*(?:CZ|fuuu))/, $x);
    print join("\n",@f),"\n";

    *CZ
    1 2.3 4-56
    *fuuu
    S24364 08 23 72

See perldoc -f split

From: Paul Lalli on 30 Jan 2007 12:50

On Jan 30, 12:13 pm, "Jay" <JaythePC...(a)gmail.com> wrote:

> I'm trying to break an input string into multiple pieces using a
> series of delimiters that start with an asterisk. Following the
> asterisk is a multiple-character identifier immediately followed by a
> data string of variable length. The input string may contain more than
> one identifier anywhere in the string. In all, there are 50+
> identifiers to search for and the asterisk is allowed to be part of the
> data string as long as it isn't defined as an identifier (it would be
> treated as another identifier at that point).
>
> Here is a simple example:
> *CZ1 2.3 4-56 *fuuuS24364 08 23 72
>
> I'd like to break this into
> CZ
> 1 2.3 4-56
> fuuu
> S24364 08 23 72

So CZ and fuuu are your delimiters, but only if preceded by an asterisk, and you want those delimiters to also be in your results?

> I have tried the pattern (?:\*(CZ|fuuu)(.*)),

What does that mean? How did you try it? In a list-context pattern match? In a split? In a scalar-context pattern match with the /g option? Please show your actual code, not a tiny piece of it.

> which produces the
> following output:
> CZ
> 1 2.3 4-56 *fuuuS24364 08 23 72
>
> How can I force it to repeat the capturing?

Without knowing what you actually did, there's no way to tell you how to modify it. I will say that the following seems to produce the results you were looking for, for the data you gave:

    perl -le'
    my @fields = split /(\*(?:CZ|fuuu))/, q{*CZ1 2.3 4-56 *fuuuS24364 08 23 72};
    s/^\*// for @fields;
    print for grep { length } @fields;
    '
    CZ
    1 2.3 4-56
    fuuu
    S24364 08 23 72

perldoc -f split
perldoc -f grep
perldoc perlretut

Paul Lalli

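With 50+ identifiers, the alternation used in the split answers above could be built from a list instead of being typed by hand. The following is only a sketch under assumptions: the `@ids` list and the `split_fields` helper are hypothetical names, the list stands in for the real identifier table, and the input is assumed to begin with an identifier. Identifiers are sorted longest first so that a longer identifier is tried before a shorter one that is its prefix.

```perl
use strict;
use warnings;

# Hypothetical identifier list; the real one would hold all 50+ names.
my @ids = qw(CZ fuuu);

# Longest first, so a longer identifier is tried before its own prefix.
my $alt   = join '|', map { quotemeta } sort { length($b) <=> length($a) } @ids;
my $delim = qr/\*($alt)/;

# Hypothetical helper: returns identifier, data, identifier, data, ...
sub split_fields {
    my ($input) = @_;

    # The capture group makes split return the identifiers as well,
    # without the leading asterisk.
    my @fields = split $delim, $input;

    # A delimiter at the very start yields an empty first field; drop it.
    shift @fields if @fields && $fields[0] eq '';

    # Trim whitespace around each piece.
    s/^\s+|\s+$//g for @fields;
    return @fields;
}

print "$_\n" for split_fields('*CZ1 2.3 4-56 *fuuuS24364 08 23 72');
```

Because the pattern only matches an asterisk that is followed by a known identifier, any other asterisk stays inside the data, which is the behaviour the original question asks for.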
From: Mirco Wahab on 30 Jan 2007 12:59

Jay wrote:

> I'm trying to break an input string into multiple pieces using a
> series of delimiters that start with an asterisk. Following the
> asterisk is a multiple-character identifier immediately followed by a
> data string of variable length.
> Here is a simple example:
> *CZ1 2.3 4-56 *fuuuS24364 08 23 72
> I'd like to break this into
> CZ
> 1 2.3 4-56
> fuuu
> S24364 08 23 72
>
> I have tried the pattern (?:\*(CZ|fuuu)(.*)), which produces the
> following output:
> CZ
> 1 2.3 4-56 *fuuuS24364 08 23 72
> How can I force it to repeat the capturing?

You can force the repeated capturing with the /g flag on the regex. Your complete solution should look, if I guessed correctly from your riddle, something like:

    ...
    my $simple = q{*CZ1 2.3 4-56 *fuuuS24364 08 23 72 *AAA3 44 5-66};
    my %hits;

    # identifier: a run of lowercase or a run of uppercase letters;
    # data: everything up to the next asterisk
    $hits{$1} = $2 while $simple=~/\*([a-z]+|[A-Z]+)([^*]+)/g;

    print "$_ ==> $hits{$_}\n" for keys %hits;
    ...

This would print (on the above data; hash key order is not guaranteed):

    CZ ==> 1 2.3 4-56
    AAA ==> 3 44 5-66
    fuuu ==> S24364 08 23 72

But your problem is not really completely specified ...

Regards
M.

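A variation on the /g approach above, offered as a sketch rather than a definitive answer: it collects the matches in an array of pairs, so input order and repeated identifiers survive (a hash keeps neither), and it uses a lookahead so that an asterisk not followed by a known identifier remains part of the data. The `@ids` list and the `parse_pairs` helper are hypothetical names; the list stands in for the real set of 50+ identifiers.

```perl
use strict;
use warnings;

# Hypothetical identifier list standing in for the real 50+ identifiers.
my @ids = qw(CZ fuuu AAA);
my $alt = join '|', map { quotemeta } sort { length($b) <=> length($a) } @ids;

# Hypothetical helper: returns a list of [identifier, data] pairs in input order.
sub parse_pairs {
    my ($input) = @_;
    my @pairs;

    # Data runs lazily up to the next "*identifier" or the end of the string.
    while ($input =~ /\*($alt)(.*?)(?=\*(?:$alt)|\z)/gs) {
        my ($id, $data) = ($1, $2);   # copy before the next regex clobbers them
        $data =~ s/^\s+|\s+$//g;      # trim surrounding whitespace
        push @pairs, [$id, $data];
    }
    return @pairs;
}

my $simple = q{*CZ1 2.3 4-56 *fuuuS24364 08 23 72 *AAA3 44 5-66};
print "$_->[0] ==> $_->[1]\n" for parse_pairs($simple);
```

Compared with the `[^*]+` data pattern, the lookahead costs some backtracking but does not cut the data short at a stray asterisk.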
From: Todd on 30 Jan 2007 12:54

Jay wrote:

> Howdy,
>
> I'm trying to break an input string into multiple pieces using a
> series of delimiters that start with an asterisk. Following the
> asterisk is a multiple-character identifier immediately followed by a
> data string of variable length. The input string may contain more than
> one identifier anywhere in the string. In all, there are 50+
> identifiers to search for and the asterisk is allowed to be part of the
> data string as long as it isn't defined as an identifier (it would be
> treated as another identifier at that point).
>
> Here is a simple example:
> *CZ1 2.3 4-56 *fuuuS24364 08 23 72
>
> I'd like to break this into
> CZ
> 1 2.3 4-56
> fuuu
> S24364 08 23 72
>
> I have tried the pattern (?:\*(CZ|fuuu)(.*)), which produces the
> following output:
> CZ
> 1 2.3 4-56 *fuuuS24364 08 23 72
>
> How can I force it to repeat the capturing?
>
> Thanks,
> Jay
>

    my $line = '*CZ1 2.3 4-56 *fuuuS24364 08 23 72';
    $line =~ /\*(CZ)(.+)\s+\*(fuuu)(.+)\s*$/;
    # $1 = CZ
    # $2 = 1 2.3 4-56
    # $3 = fuuu
    # $4 = S24364 08 23 72

Todd