From: Gabriel Genellina on 16 Dec 2009 09:35

On Wed, 16 Dec 2009 11:09:32 -0300, Ed Keith <e_d_k(a)yahoo.com> wrote:

> I am having a problem when substituting a raw string. When I do the
> following:
>
> re.sub('abc', r'a\nb\nc', '123abcdefg')
>
> I get
>
> """
> 123a
> b
> cdefg
> """
>
> what I want is
>
> r'123a\nb\ncdefg'

From http://docs.python.org/library/re.html#re.sub

    re.sub(pattern, repl, string[, count])

    ...repl can be a string or a function; if it is a string, any
    backslash escapes in it are processed. That is, \n is converted to a
    single newline character, \r is converted to a carriage return, and
    so forth.

So you'll have to double your backslashes:

py> re.sub('abc', r'a\\nb\\nc', '123abcdefg')
'123a\\nb\\ncdefg'

--
Gabriel Genellina
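
For readers who want to try this, here is a minimal sketch of the behaviour described above. It is written in Python 3 syntax (the thread itself uses Python 2), and the variable names are only illustrative. It compares a replacement template with single backslashes against one with doubled backslashes.

```python
import re

text = "123abcdefg"

# The raw string keeps backslash + "n" as two characters, but re.sub then
# processes that escape in the template and emits a real newline.
processed = re.sub("abc", r"a\nb\nc", text)

# Doubling each backslash makes re.sub emit one literal backslash instead.
literal = re.sub("abc", r"a\\nb\\nc", text)

print(repr(processed))
print(repr(literal))
```

The raw-string prefix only affects how Python reads the literal in the source code; the escape processing shown here happens later, inside re.sub.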
From: Ed Keith on 16 Dec 2009 12:19

On Wed, 12/16/09, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar> wrote:

> Ed Keith <e_d_k(a)yahoo.com> wrote:
>
> > I am having a problem when substituting a raw string.
> > When I do the following:
> >
> > re.sub('abc', r'a\nb\nc', '123abcdefg')
> >
> > I get
> >
> > """
> > 123a
> > b
> > cdefg
> > """
> >
> > what I want is
> >
> > r'123a\nb\ncdefg'
>
> So you'll have to double your backslashes:
>
> py> re.sub('abc', r'a\\nb\\nc', '123abcdefg')
> '123a\\nb\\ncdefg'

That is going to be a nontrivial exercise. I have control over the pattern, but the texts to be substituted and substituted into will be read from user-supplied files. I need to reproduce the exact text that is read from the file.

Maybe what I should do is use re to break the string into two pieces, the part before the pattern to be replaced and the part after it, then splice the replacement text in between them. Seems like doing it the hard way, but it should work.

Thanks,

-EdK
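
The split-and-splice idea could look roughly like the sketch below, in Python 3 syntax. The helper name splice_literal is hypothetical, and the sketch assumes that only the first match needs replacing; handling every match would need a loop or a different approach.

```python
import re

def splice_literal(pattern, replacement, text):
    """Replace the first match of pattern with replacement, copied verbatim.

    Hypothetical helper: the replacement never passes through re's
    template processing, so its backslashes are left untouched.
    """
    match = re.search(pattern, text)
    if match is None:
        return text  # nothing to replace
    # Text before the match + literal replacement + text after the match.
    return text[:match.start()] + replacement + text[match.end():]

result = splice_literal("abc", r"a\nb\nc", "123abcdefg")
print(result)
```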
From: Peter Otten on 16 Dec 2009 12:51

Ed Keith wrote:

> On Wed, 12/16/09, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar> wrote:
>
>> So you'll have to double your backslashes:
>>
>> py> re.sub('abc', r'a\\nb\\nc', '123abcdefg')
>> '123a\\nb\\ncdefg'
>
> That is going to be a nontrivial exercise. I have control over the
> pattern, but the texts to be substituted and substituted into will be read
> from user-supplied files. I need to reproduce the exact text that is read
> from the file.

There is a helper function re.escape() that you can use to sanitize the substitution:

>>> print re.sub('abc', re.escape(r'a\nb\nc'), '123abcdefg')
123a\nb\ncdefg

Peter
From: Gabriel Genellina on 16 Dec 2009 14:23

On Wed, 16 Dec 2009 14:51:08 -0300, Peter Otten <__peter__(a)web.de> wrote:

> Ed Keith wrote:
>
>> That is going to be a nontrivial exercise. I have control over the
>> pattern, but the texts to be substituted and substituted into will be
>> read from user-supplied files. I need to reproduce the exact text that
>> is read from the file.
>
> There is a helper function re.escape() that you can use to sanitize the
> substitution:
>
> >>> print re.sub('abc', re.escape(r'a\nb\nc'), '123abcdefg')
> 123a\nb\ncdefg

Unfortunately re.escape does much more than that:

py> print re.sub('abc', re.escape(r'a.b.c'), '123abcdefg')
123a\.b\.cdefg

I think the string_escape encoding is what the OP needs:

py> print re.sub('abc', r'a\n(b.c)\nd'.encode("string_escape"), '123abcdefg')
123a\n(b.c)\nddefg

--
Gabriel Genellina
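
The string_escape codec is specific to Python 2 byte strings. As a rough Python 3 counterpart, the sketch below doubles only the backslashes, on the assumption that the backslash is the only character re.sub treats specially in a replacement template. The helper name escape_repl is hypothetical.

```python
import re

def escape_repl(replacement):
    """Double every backslash so re.sub copies the text verbatim.

    Hypothetical helper; unlike re.escape it leaves '.', '(' and other
    pattern metacharacters alone.
    """
    return replacement.replace("\\", "\\\\")

result = re.sub("abc", escape_repl(r"a\n(b.c)\nd"), "123abcdefg")
print(result)
```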
From: Peter Otten on 16 Dec 2009 14:54

Gabriel Genellina wrote:

> On Wed, 16 Dec 2009 14:51:08 -0300, Peter Otten <__peter__(a)web.de> wrote:
>
>> There is a helper function re.escape() that you can use to sanitize the
>> substitution:
>>
>> >>> print re.sub('abc', re.escape(r'a\nb\nc'), '123abcdefg')
>> 123a\nb\ncdefg
>
> Unfortunately re.escape does much more than that:
>
> py> print re.sub('abc', re.escape(r'a.b.c'), '123abcdefg')
> 123a\.b\.cdefg

Sorry, I didn't think of that.

> I think the string_escape encoding is what the OP needs:
>
> py> print re.sub('abc', r'a\n(b.c)\nd'.encode("string_escape"), '123abcdefg')
> 123a\n(b.c)\nddefg

Another possibility:

>>> print re.sub('abc', lambda m: r'a\nb\n.c\a', '123abcdefg')
123a\nb\n.c\adefg

Peter
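
The function-as-replacement approach can be wrapped up as in the sketch below, in Python 3 syntax. When repl is a callable, re.sub inserts its return value as-is, without processing backslash escapes or group references. The helper name sub_literal is hypothetical.

```python
import re

def sub_literal(pattern, replacement, text):
    """Replace every match of pattern with replacement, copied verbatim.

    Hypothetical helper: the lambda ignores the match object and returns
    the replacement unchanged, so re.sub never parses it as a template.
    """
    return re.sub(pattern, lambda match: replacement, text)

result = sub_literal("abc", r"a\nb\n.c\a", "123abcdefg")
print(result)
```

Unlike splitting the string by hand, this replaces all matches, and it fits the original use case where the replacement text is read from a user-supplied file.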