From: J on
Can someone make me un-crazy?

I have a bit of code that right now, looks like this:

status = getoutput('smartctl -l selftest /dev/sda').splitlines()[6]
status = re.sub(' (?= )(?=([^"]*"[^"]*")*[^"]*$)', ":",status)
print status

Basically, it pulls the first actual line of data from the return you
get when you use smartctl to look at a hard disk's selftest log.

The raw data looks like this:

# 1  Short offline       Completed without error       00%       679         -

Unfortunately, all that whitespace is arbitrary single space
characters. And I am interested in the string that appears in the
third column, which changes as the test runs and then completes. So
in the example, "Completed without error"

The regex I have up there doesn't quite work, as it seems to be
subbing EVERY space (or at least in instances of more than one space)
to a ':' like this:

# 1: Short offline:::::: Completed without error:::::: 00%:::::: 679:::::::: -

Ultimately, what I'm trying to do is either replace any run of more
than one space with a delimiter, then split the result into a list and
get the third item.

OR, if there's a smarter, shorter, or better way of doing it, I'd love to know.

The end result should pull the whole string in the middle of that
output line, and then I can use that to compare to a list of possible
output strings to determine if the test is still running, has
completed successfully, or failed.

Unfortunately, my google-fu fails right now, and my Regex powers were
always rather weak anyway...

So any ideas on what the best way to proceed with this would be?
From: Grant Edwards on
On 2010-04-07, J <dreadpiratejeff(a)gmail.com> wrote:

> Can someone make me un-crazy?

Definitely. Regex is driving you crazy, so don't use a regex.

inputString = "# 1  Short offline       Completed without error     00%       679         -"

print ' '.join(inputString.split()[4:-3])

> So any ideas on what the best way to proceed with this would be?

Anytime you have a problem with a regex, the first thing you should
ask yourself: "do I really, _really_ need a regex?"

Hint: the answer is usually "no".
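
For what it's worth, here is a minimal sketch of that split-and-join idea wrapped in a function. The name `status_column` is made up for illustration, and the slice assumes that the test type is always two words (like "Short offline") and that the three trailing columns never contain spaces. If smartctl prints a line shaped differently, the indices would need adjusting.

```python
def status_column(line):
    """Return the status text from one smartctl self-test log line.

    Assumes the layout:
    '# N  <two-word test type>  <status>  <remaining>  <hours>  <LBA>'
    """
    words = line.split()  # split on any run of whitespace
    # words[0:2] are '#' and the test number, words[2:4] the test type,
    # and the last three words are remaining %, lifetime hours and LBA.
    return ' '.join(words[4:-3])


sample = "# 1  Short offline       Completed without error       00%       679         -"
print(status_column(sample))
```

Slicing from both ends means the status itself can be any number of words, which is the one column whose length actually varies here.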

--
Grant Edwards               grant.b.edwards        Yow! I'm continually AMAZED
                                  at               at th'breathtaking effects
                              gmail.com            of WIND EROSION!!
From: Patrick Maupin on
On Apr 7, 4:40 pm, J <dreadpiratej...(a)gmail.com> wrote:
> Can someone make me un-crazy?
>
> I have a bit of code that right now, looks like this:
>
> status = getoutput('smartctl -l selftest /dev/sda').splitlines()[6]
>         status = re.sub(' (?= )(?=([^"]*"[^"]*")*[^"]*$)', ":",status)
>         print status
>
> Basically, it pulls the first actual line of data from the return you
> get when you use smartctl to look at a hard disk's selftest log.
>
> The raw data looks like this:
>
> # 1  Short offline       Completed without error       00%       679         -
>
> Unfortunately, all that whitespace is arbitrary single space
> characters.  And I am interested in the string that appears in the
> third column, which changes as the test runs and then completes.  So
> in the example, "Completed without error"
>
> The regex I have up there doesn't quite work, as it seems to be
> subbing EVERY space (or at least in instances of more than one space)
> to a ':' like this:
>
> # 1: Short offline:::::: Completed without error:::::: 00%:::::: 679:::::::: -
>
> Ultimately, what I'm trying to do is either replace any run of more
> than one space with a delimiter, then split the result into a list and
> get the third item.
>
> OR, if there's a smarter, shorter, or better way of doing it, I'd love to know.
>
> The end result should pull the whole string in the middle of that
> output line, and then I can use that to compare to a list of possible
> output strings to determine if the test is still running, has
> completed successfully, or failed.
>
> Unfortunately, my google-fu fails right now, and my Regex powers were
> always rather weak anyway...
>
> So any ideas on what the best way to proceed with this would be?

You mean like this?

>>> import re
>>> re.split(' {2,}', '# 1  Short offline       Completed without error       00%')
['# 1', 'Short offline', 'Completed without error', '00%']
>>>
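
Applied to the whole line, the same split gives the status as the third item. The sketch below also shows, as far as I can tell, why the original pattern left those colons behind: `' (?= )'` matches each space that is followed by another space, one at a time, so a run of seven spaces turns into six colons plus one leftover space. The sample line is the one from the original post; the variable names are just for illustration.

```python
import re

line = "# 1  Short offline       Completed without error       00%       679         -"

# Split on runs of two or more spaces; single spaces inside a column survive.
columns = re.split(' {2,}', line)
status = columns[2]
print(status)

# The lookahead version replaces spaces one at a time, which is why each
# gap becomes a string of colons followed by one leftover space.
one_at_a_time = re.sub(' (?= )', ':', line)
print(one_at_a_time)

# Replacing each whole run with a single delimiter gives the intended result.
delimited = re.sub(' {2,}', ':', line)
print(delimited.split(':')[2])
```

Either route ends at the same string; splitting directly just skips the intermediate delimiter.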

Regards,
Pat
From: Patrick Maupin on
On Apr 7, 4:47 pm, Grant Edwards <inva...(a)invalid.invalid> wrote:
> On 2010-04-07, J <dreadpiratej...(a)gmail.com> wrote:
>
> > Can someone make me un-crazy?
>
> Definitely.  Regex is driving you crazy, so don't use a regex.
>
>   inputString = "# 1  Short offline       Completed without error     00%       679         -"
>
>   print ' '.join(inputString.split()[4:-3])
>
> > So any ideas on what the best way to proceed with this would be?
>
> Anytime you have a problem with a regex, the first thing you should
> ask yourself:  "do I really, _really_ need a regex?
>
> Hint: the answer is usually "no".
>
> --
> Grant Edwards               grant.b.edwards        Yow! I'm continually AMAZED
>                                   at               at th'breathtaking effects
>                               gmail.com            of WIND EROSION!!

OK, fine. Post a better solution to this problem than:

>>> import re
>>> re.split(' {2,}', '# 1  Short offline       Completed without error       00%')
['# 1', 'Short offline', 'Completed without error', '00%']
>>>

Regards,
Pat
From: Patrick Maupin on
On Apr 7, 7:49 pm, Patrick Maupin <pmau...(a)gmail.com> wrote:
> On Apr 7, 4:40 pm, J <dreadpiratej...(a)gmail.com> wrote:
>
>
>
> > Can someone make me un-crazy?
>
> > I have a bit of code that right now, looks like this:
>
> > status = getoutput('smartctl -l selftest /dev/sda').splitlines()[6]
> >         status = re.sub(' (?= )(?=([^"]*"[^"]*")*[^"]*$)', ":",status)
> >         print status
>
> > Basically, it pulls the first actual line of data from the return you
> > get when you use smartctl to look at a hard disk's selftest log.
>
> > The raw data looks like this:
>
> > # 1  Short offline       Completed without error       00%       679         -
>
> > Unfortunately, all that whitespace is arbitrary single space
> > characters.  And I am interested in the string that appears in the
> > third column, which changes as the test runs and then completes.  So
> > in the example, "Completed without error"
>
> > The regex I have up there doesn't quite work, as it seems to be
> > subbing EVERY space (or at least in instances of more than one space)
> > to a ':' like this:
>
> > # 1: Short offline:::::: Completed without error:::::: 00%:::::: 679:::::::: -
>
> > Ultimately, what I'm trying to do is either replace any run of more
> > than one space with a delimiter, then split the result into a list and
> > get the third item.
>
> > OR, if there's a smarter, shorter, or better way of doing it, I'd love to know.
>
> > The end result should pull the whole string in the middle of that
> > output line, and then I can use that to compare to a list of possible
> > output strings to determine if the test is still running, has
> > completed successfully, or failed.
>
> > Unfortunately, my google-fu fails right now, and my Regex powers were
> > always rather weak anyway...
>
> > So any ideas on what the best way to proceed with this would be?
>
> You mean like this?
>
> >>> import re
> >>> re.split(' {2,}', '# 1  Short offline       Completed without error       00%')
>
> ['# 1', 'Short offline', 'Completed without error', '00%']
>
>
>
> Regards,
> Pat

BTW, although I find it annoying when people say "don't do that" when
"that" is a perfectly good thing to do, and although I also find it
annoying when people tell you what not to do without telling you what
*to* do, and although I find the regex solution to this problem to be
quite clean, the equivalent non-regex solution is not terrible, so I
will present it as well, for your viewing pleasure:

>>> [x for x in '# 1  Short offline       Completed without error       00%'.split('  ') if x.strip()]
['# 1', 'Short offline', ' Completed without error', ' 00%']
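
Note that the two-space split leaves a stray leading space on an item whenever the gap before it has an odd number of spaces. A small variation, sketched here, strips each piece before using it. The names `STATES` and `classify` and the "in progress" string are placeholders, since the thread does not list the status strings smartctl actually prints; only "Completed without error" comes from the original post.

```python
line = "# 1  Short offline       Completed without error       00%"

# Split on two spaces, then strip each piece and drop the empty ones.
columns = [x.strip() for x in line.split('  ') if x.strip()]
print(columns)

# Placeholder lookup: only the first entry comes from the thread, the
# second is an assumed example of what a running test might report.
STATES = {
    "Completed without error": "passed",
    "Self-test routine in progress": "running",
}


def classify(status):
    # Treating anything unlisted as a failure is an assumption of this sketch.
    return STATES.get(status, "failed")


print(classify(columns[2]))
```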

Regards,
Pat