From: Brian Candler on
> So the data I entered looks like this:
> ---------------------------------------------------------------
> <p> pargraph1 pargraph1 &nbsp;&nbsp; </p>
> <p>pargraph2 pargraph2 pargraph2 &nbsp;&nbsp; </p>
> <p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
> <p> pargraph4 hello3 hell14 &nbsp;&nbsp; </p>
> <p>&nbsp;&nbsp;&nbsp;&nbsp;</p>
> --------------------------------------------------------
>
> In the above text, the 4th paragraph is the last paragraph that I
> entered. After that I pressed the Enter key, so the editor added
> "<p>&nbsp;&nbsp;&nbsp;&nbsp;</p>".
>
> I want to remove the &nbsp;'s after the text in the last paragraph, so
> that the result looks like this:
> -------------------------------------------
> <p> pargraph1 pargraph1 &nbsp;&nbsp; </p>
> <p>pargraph2 pargraph2 pargraph2 &nbsp;&nbsp; </p>
> <p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
> <p> pargraph4 hello3 hell14</p>
> ----------------------------------------------------------------

So the approach is:
(1) Write a regular expression which matches just the thing you want to
delete;
(2) Invoke it with gsub to replace that text with the empty string.

For example, to delete *all* empty paragraphs, you want to match <p>
followed by any mixture of &nbsp; and whitespace, followed by </p>. So
you could write:

str.gsub! /<p>(&nbsp;|\s)*<\/p>/, ''

(x|y) means match x or y, \s means match any whitespace character, and *
means match it 0 or more times.
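
As a rough illustration of how that pattern behaves, here is a small
sketch. The sample string and the helper name strip_empty_paragraphs
are made up for this example and are not from the original post; it
uses gsub rather than gsub! so that a string always comes back.

```ruby
# Regexp from the text above: <p>, then any mix of &nbsp; and
# whitespace (possibly none at all), then </p>.
EMPTY_PARAGRAPH = /<p>(&nbsp;|\s)*<\/p>/

# Hypothetical helper: returns a copy of html with every empty
# paragraph removed. The original string is not modified.
def strip_empty_paragraphs(html)
  html.gsub(EMPTY_PARAGRAPH, '')
end

sample = "<p>one</p><p>&nbsp; &nbsp;</p><p>two</p><p></p>"
puts strip_empty_paragraphs(sample)
# prints <p>one</p><p>two</p>
```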

To delete only the *last* paragraph, and only if it is empty, you can
tweak it to:

str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''

where \z matches the end of the string, and \s* allows 0 or more
whitespace characters, including newlines, to precede that.
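
To see the effect of the \z anchor, a minimal sketch follows. The
helper name strip_last_empty_paragraph and the sample input are
assumptions made for illustration only.

```ruby
# Hypothetical helper; the regexp is the one quoted above.
def strip_last_empty_paragraph(html)
  # \s*\z ties the match to the very end of the string, so an empty
  # paragraph in the middle of the text is left alone.
  html.gsub(/<p>(&nbsp;|\s)*<\/p>\s*\z/, '')
end

sample = "<p>text</p>\n<p>&nbsp;</p>\n<p>more</p>\n<p>&nbsp;&nbsp;</p>\n"
p strip_last_empty_paragraph(sample)
# prints "<p>text</p>\n<p>&nbsp;</p>\n<p>more</p>\n"
```

Note that the empty paragraph in the middle survives; only the one
touching the end of the string is removed, together with the newline
after it.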

Once you're happy with that, you can do another match and replace
to change the final instance of "&nbsp;&nbsp; </p>" into just "</p>".

But you might want to be sure this is what you really want. How did the
previous &nbsp; entries get there? Do you really want to keep them? It
would be much simpler just to replace all sequences of &nbsp; or space
with a single space.

str.gsub! /(&nbsp;|\s)+/, ' '
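
A quick sketch of what that simpler approach does, assuming a made-up
input and a hypothetical helper name collapse_spaces:

```ruby
# Hypothetical helper: replaces every run of &nbsp; entities and/or
# whitespace (spaces, tabs, newlines) with one ordinary space.
def collapse_spaces(html)
  html.gsub(/(&nbsp;|\s)+/, ' ')
end

puts collapse_spaces("<p> a &nbsp;&nbsp; b\n</p>")
# prints <p> a b </p>
```

Be aware that this also turns newlines between paragraphs into single
spaces, since \s matches newlines too.
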
--
Posted via http://www.ruby-forum.com/.

From: Lucky Nl on
Hi, when I used the logic below:
str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''

-------------------------------------------------
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"
str = str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''
puts str
---------------------------------
it returns the result below, which is fine.
----------------------------------------------
<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p>
---------------------------------------------
But if I make a small modification so that the last paragraph contains
text, i.e. the string ends with <p>dada&nbsp;&nbsp;ddsa</p>"

then str ends up as nil.

From: Lucky Nl on
OK, so basically if the string is not modified then it returns nil,
right?

Your regular expression is very helpful,
but I need one more piece of help.
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"


In the above str, <p>&nbsp;&nbsp;</p> is an empty line from my point of
view, so the text actually entered in the editor only goes up to:

str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p>"

I also want to remove the spaces at the end of the last paragraph, but
not in the 1st, 2nd and 3rd paragraphs.
The result I want is:
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14</p>"



From: Brian Candler on
Lucky Nl wrote:
> But if I make a small modification so that the last paragraph contains
> text, i.e. the string ends with <p>dada&nbsp;&nbsp;ddsa</p>"
>
> then str ends up as nil.

Yes, the result of gsub! is nil if no change is made; but the string
remains as it was.

irb(main):001:0> str = "abc"
=> "abc"
irb(main):002:0> str.gsub!(/d/,"")
=> nil
irb(main):003:0> str
=> "abc"

It's intended so you can say

if str.gsub! ...
  # it changed
else
  # it didn't
end

If you use gsub instead of gsub!, then it always returns the resulting
string.

irb(main):004:0> str2 = str.gsub(/d/,"")
=> "abc"
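
Applied to the "str = str.gsub! ..." line from the earlier post, a
small sketch of the difference might look like this. The constant name
TRAILING_EMPTY and the helper try_bang are invented for the example;
the point is only that the return value of gsub! should not be
assigned back to str.

```ruby
TRAILING_EMPTY = /<p>(&nbsp;|\s)*<\/p>\s*\z/

# Hypothetical helper: runs gsub! on a copy of text and returns
# [what gsub! returned, what the copy holds afterwards].
def try_bang(text)
  copy = text.dup
  returned = copy.gsub!(TRAILING_EMPTY, '')
  [returned, copy]
end

# Nothing matches here, so gsub! returns nil but the string is intact.
p try_bang("<p>dada&nbsp;&nbsp;ddsa</p>")
# prints [nil, "<p>dada&nbsp;&nbsp;ddsa</p>"]

# Here the trailing empty paragraph matches, so gsub! returns the string.
p try_bang("<p>x</p> <p>&nbsp;</p>")
# prints ["<p>x</p> ", "<p>x</p> "]
```

So either call str.gsub!(...) on its own line and keep using str, or
write str = str.gsub(...) without the bang.
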

From: Brian Candler on
Lucky Nl wrote:
> I also want to remove the spaces at the end of the last paragraph, but
> not in the 1st, 2nd and 3rd paragraphs.
> The result I want is:
> str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
> pargraph2 &nbsp;&nbsp; </p>
> <p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
> <p> pargraph4 hello3 hell14</p>"

So, write a regexp which matches any number of &nbsp; or space, followed
by </p>, followed by end of string. I've given you the tools to do that
already.

If you can't make it work then show what you tried, and we can explain
what needs changing.

You can test your regexps using irb, or you can use this web site:
http://rubular.com/
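
For trying things out in irb, one possible sketch of the two steps
combined is below. It is only one way to do it, not necessarily the
intended answer; the helper name tidy_last_paragraph and the shortened
sample input are assumptions. It uses sub instead of gsub because the
\z anchor means at most one match is expected.

```ruby
# Hypothetical helper combining both steps; returns a new string.
def tidy_last_paragraph(html)
  # Step 1: drop a trailing empty paragraph, if there is one.
  without_empty = html.sub(/<p>(&nbsp;|\s)*<\/p>\s*\z/, '')
  # Step 2: drop &nbsp;/whitespace just before the final </p>,
  # along with any whitespace after it.
  without_empty.sub(/(&nbsp;|\s)+<\/p>\s*\z/, '</p>')
end

sample = "<p>a &nbsp; </p>\n<p>b &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"
p tidy_last_paragraph(sample)
# prints "<p>a &nbsp; </p>\n<p>b</p>"
```

The &nbsp; entities in the first paragraph are untouched because the
second pattern can only match where </p> is followed by the end of the
string.
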