regex to remove lines made of only whitespace [Python]

From: Chris Withers on 11 Aug 2010 07:13

Hi All,

I'm looking for a regex (or other solution, as long as it's quick!) that
could be used to strip out lines made up entirely of whitespace.

eg:

'x\n \t \n\ny' -> 'x\ny'

Does anyone have one handy?

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

From: Andreas Tawn on 11 Aug 2010 07:21

> Hi All,
>
> I'm looking for a regex (or other solution, as long as it's quick!) that
> could be used to strip out lines made up entirely of whitespace.
>
> eg:
>
> 'x\n \t \n\ny' -> 'x\ny'
>
> Does anyone have one handy?
>
> cheers,
>
> Chris

for line in lines:
    if not line.strip():
        continue
    doStuff(line)

cheers,

Drea

From: Steven D'Aprano on 11 Aug 2010 07:31

On Wed, 11 Aug 2010 12:13:29 +0100, Chris Withers wrote:

> Hi All,
>
> I'm looking for a regex (or other solution, as long as it's quick!) that
> could be used to strip out lines made up entirely of whitespace.

def strip_blank_lines(lines):
    for line in lines:
        if not line.isspace():
            yield line

# splitlines(True) keeps each line's '\n', so a bare '\n' counts as blank
text = ''.join(strip_blank_lines(lines.splitlines(True)))
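A generator of this shape also works on anything that yields lines with their endings intact, such as an open file. The following is a minimal sketch under that assumption; io.StringIO stands in for a real file, and the names source and text are only illustrative.

```python
import io

def strip_blank_lines(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line

# io.StringIO stands in for an open file here (an assumption for the demo).
# Iterating it yields lines with their trailing '\n' intact, so ''.join
# reassembles the surviving lines without losing the line breaks.
source = io.StringIO('x\n \t \n\ny')
text = ''.join(strip_blank_lines(source))
print(repr(text))
```

Because the input is consumed lazily, this never needs the whole file in memory at once.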

--
Steven

From: Tim Chase on 11 Aug 2010 07:33

On 08/11/10 06:21, Andreas Tawn wrote:
>> I'm looking for a regex (or other solution, as long as it's quick!)
>> that could be used to strip out lines made up entirely of whitespace.
>>
>> eg:
>>
>> 'x\n \t \n\ny' -> 'x\ny'
>
> for line in lines:
>     if not line.strip():
>         continue
>     doStuff(line)

Note that the OP's input and output were a single string.
Perhaps something like

>>> s = 'x\n \t \n\ny'
>>> '\n'.join(line for line in s.splitlines() if line.strip())
'x\ny'

which, IMHO, has much greater clarity than any regexp with the
added bonus of fewer regexp edge-cases (blanks at the
beginning/middle/end of the text).
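To make the edge cases concrete, here is a rough sketch comparing one plausible regex with the splitlines approach. The pattern and the function names strip_blank_regex and strip_blank_join are illustrative assumptions, not something posted in this thread.

```python
import re

# One plausible pattern (an assumption): in MULTILINE mode, match a run of
# whitespace that starts at the beginning of a line and ends with a newline.
BLANK_LINES = re.compile(r'^\s*\n', re.MULTILINE)

def strip_blank_regex(s):
    """Remove whitespace-only lines using the regex above."""
    return BLANK_LINES.sub('', s)

def strip_blank_join(s):
    """Remove whitespace-only lines by splitting and re-joining."""
    return '\n'.join(line for line in s.splitlines() if line.strip())

sample = 'x\n \t \n\ny'
print(repr(strip_blank_regex(sample)))
print(repr(strip_blank_join(sample)))

# Edge case: a whitespace-only last line with no trailing newline.
# The regex needs a '\n' to match, so it leaves that line in place.
tail = 'x\n  '
print(repr(strip_blank_regex(tail)))
print(repr(strip_blank_join(tail)))
```

Both agree on the OP's example, but they differ on a final blank line that lacks a newline, and the join version also drops a trailing newline that the regex version keeps.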

-tkc

From: Andreas Tawn on 11 Aug 2010 07:39

> On 08/11/10 06:21, Andreas Tawn wrote:
> >> I'm looking for a regex (or other solution, as long as it's quick!)
> >> that could be used to strip out lines made up entirely of whitespace.
> >>
> >> eg:
> >>
> >> 'x\n \t \n\ny' -> 'x\ny'
> >
> > for line in lines:
> >     if not line.strip():
> >         continue
> >     doStuff(line)
>
> Note that the OP's input and output were a single string.

Ah, indeed. What do they say about the first part of assume?

> Perhaps something like
> >>> s = 'x\n \t \n\ny'
> >>> '\n'.join(line for line in s.splitlines() if line.strip())
> 'x\ny'
>
> which, IMHO, has much greater clarity than any regexp with the
> added bonus of fewer regexp edge-cases (blanks at the
> beginning/middle/end of the text).
>
> -tkc

This is what I meant (no really) ;o).

Cheers,

Drea
