From: AyOut on
On Nov 6, 2:09 pm, Ed Morton <mortons...(a)gmail.com> wrote:
> On Nov 6, 1:39 pm, AyOut <mort...(a)gmail.com> wrote:
>
>
>
> > On Nov 4, 10:11 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
> > > AyOut wrote:
> > > > I have a GC log file with entries like this one:
>
> > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> > > > (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> > > > secs]
>
> > > > I would like to parse this to output for easy plotting using gnuplot
> > > > and would like the following output:
>
> > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> > > > 0.04, 0.02
>
> > > Assuming the input is all on one line:
>
> > > $ cat file
> > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
> > > 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
>
> > > $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
> > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
>
> > >      Ed.
>
> > That's a beautiful solution!  Now, there's a change in the log file
> > output.  The first field is now a date and time stamp
>
> > 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170
> > secs]
> > 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
> > 0.0043760 secs]
>
> > and applying this command
>
> > cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-
> > Z]:*//'|sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|awk -F:
> > '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'
>
> "beautiful solution" discarded apparently!
>
> > generates the following output:
>
> > 15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
> > 15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > where the time stamp (15:00:16) shows up as 15, 00, 16.  Is there a
> > way to have the output look like this:
>
> > 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> > 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > Thanks!
>
> Why do you keep going back to pipelines of cat, sed, and awk? If
> you're going to use awk anyway, you don't need sed or cat.
>
> Try this:
>
> awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30);
>         gsub(/[[:digit:].]/," "); $1=$1; print t,$0}' file
>
>    Ed.

Thanks, Ed!

Well, I'm by no means a shell expert.

Running your command on the file, I get the following output:

15:00:16, :, [GC, K->, K(, K),, secs]
From: Ed Morton on
On Nov 6, 2:29 pm, AyOut <mort...(a)gmail.com> wrote:
> On Nov 6, 2:09 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
>
>
>
>
> > On Nov 6, 1:39 pm, AyOut <mort...(a)gmail.com> wrote:
>
> > > On Nov 4, 10:11 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
> > > > AyOut wrote:
> > > > > I have a GC log file with entries like this one:
>
> > > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> > > > > (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> > > > > secs]
>
> > > > > I would like to parse this to output for easy plotting using gnuplot
> > > > > and would like the following output:
>
> > > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> > > > > 0.04, 0.02
>
> > > > Assuming the input is all on one line:
>
> > > > $ cat file
> > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
> > > > 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
>
> > > > $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
> > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
>
> > > >      Ed.
>
> > > That's a beautiful solution!  Now, there's a change in the log file
> > > output.  The first field is now a date and time stamp
>
> > > 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170
> > > secs]
> > > 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
> > > 0.0043760 secs]
>
> > > and applying this command
>
> > > cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-
> > > Z]:*//'|sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|awk -F:
> > > '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'
>
> > "beautiful solution" discarded apparently!
>
> > > generates the following output:
>
> > > 15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
> > > 15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > > where the time stamp (15:00:16) shows up as 15, 00, 16.  Is there a
> > > way to have the output look like this:
>
> > > 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> > > 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > > Thanks!
>
> > Why do you keep going back to pipelines of cat, sed, and awk? If
> > you're going to use awk anyway, you don't need sed or cat.
>
> > Try this:
>
> > awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30);
> >         gsub(/[[:digit:].]/," "); $1=$1; print t,$0}' file
>
> >    Ed.
>
> Thanks, Ed!
>
> Well, I'm by no means a shell expert.
>
> Running your command on the file, I get the following output:
>
> 15:00:16, :, [GC, K->, K(, K),, secs]

Are you sure you copy/pasted my script instead of retyping it?
Are you sure your input file is the same as you posted?

Look:

$ cat file
2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170 secs]
2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K), 0.0043760 secs]

$ awk '{OFS=", "; t=substr($1,12,8); $0=substr($0,30);
        gsub(/[^[:digit:].]/," "); $1=$1; print t,$0}' file
15:00:16, 0.405, 2112, 750, 7680, 0.0204170
15:00:17, 0.527, 2862, 1010, 7680, 0.0043760

Please post exactly the same commands and their output so we can see
where something's going wrong.

Ed.
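
The command above depends on the timestamp being exactly 29 characters wide, because of substr($0,30). As a rough sketch only, here is a variant that blanks the first field instead of counting columns. The function name gc_to_csv is made up for illustration, and [^0-9.] is written in place of [^[:digit:].] on the assumption that some older awks lack POSIX character classes. It still assumes each log entry is on one line and that HH:MM:SS starts at character 12 of the first field.

```sh
# gc_to_csv is a hypothetical helper name, not something from the thread.
gc_to_csv() {
    awk '{
        OFS = ", "
        t = substr($1, 12, 8)   # HH:MM:SS taken from the timestamp in field 1
        $1 = ""                 # blank the timestamp field instead of counting columns
        gsub(/[^0-9.]/, " ")    # everything except digits and dots becomes a space
        $1 = $1                 # force awk to rebuild the record using OFS
        print t, $0
    }' "$@"
}

printf '%s\n' \
    '2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170 secs]' |
    gc_to_csv
# prints: 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
```

Called with a file name (gc_to_csv file) it reads the file; with no arguments it reads standard input.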
From: Ed Morton on
On Nov 6, 2:37 pm, Ed Morton <mortons...(a)gmail.com> wrote:
> On Nov 6, 2:29 pm, AyOut <mort...(a)gmail.com> wrote:
>
>
>
>
>
> > On Nov 6, 2:09 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
> > > On Nov 6, 1:39 pm, AyOut <mort...(a)gmail.com> wrote:
>
> > > > On Nov 4, 10:11 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
> > > > > AyOut wrote:
> > > > > > I have a GC log file with entries like this one:
>
> > > > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> > > > > > (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> > > > > > secs]
>
> > > > > > I would like to parse this to output for easy plotting using gnuplot
> > > > > > and would like the following output:
>
> > > > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> > > > > > 0.04, 0.02
>
> > > > > Assuming the input is all on one line:
>
> > > > > $ cat file
> > > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
> > > > > 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
>
> > > > > $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
> > > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
>
> > > > >      Ed.
>
> > > > That's a beautiful solution!  Now, there's a change in the log file
> > > > output.  The first field is now a date and time stamp
>
> > > > 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170
> > > > secs]
> > > > 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
> > > > 0.0043760 secs]
>
> > > > and applying this command
>
> > > > cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-
> > > > Z]:*//'|sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|awk -F:
> > > > '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'
>
> > > "beautiful solution" discarded apparently!
>
> > > > generates the following output:
>
> > > > 15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
> > > > 15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > > > where the time stamp (15:00:16) shows up as 15, 00, 16.  Is there a
> > > > way to have the output look like this:
>
> > > > 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> > > > 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > > > Thanks!
>
> > > Why do you keep going back to pipelines of cat, sed, and awk? If
> > > you're going to use awk anyway, you don't need sed or cat.
>
> > > Try this:
>
> > > awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30);
> > >         gsub(/[[:digit:].]/," "); $1=$1; print t,$0}' file
>
> > >    Ed.
>
> > Thanks, Ed!
>
> > Well, I'm by no means a shell expert.
>
> > Running your command on the file, I get the following output:
>
> > 15:00:16, :, [GC, K->, K(, K),, secs]
>
> Are you sure you copy/pasted my script instead of retyping it?
> Are you sure your input file is the same as you posted?
>
> Look:
>
> $ cat file
> 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K),
> 0.0204170 secs]
> 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
> 0.0043760 secs]
>
> $ awk '{OFS=", "; t=substr($1,12,8); $0=substr($0,30);
>         gsub(/[^[:digit:].]/," "); $1=$1; print t,$0}' file
> 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> Please post exactly the same commands and their output so we can see
> where something's going wrong.
>
>      Ed.

Hint: check if you mistyped the gsub() as

gsub(/[[:digit:].]/," ")

instead of what I had:

gsub(/[^[:digit:].]/," ")

Note the "^".

Regards,

Ed.
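
To see what the "^" changes, the two forms can be run side by side on the part of a log line that follows the timestamp. This is only an illustrative sketch: the variable name line is arbitrary, and 0-9 is written in place of [:digit:] on the assumption that some older awks lack POSIX character classes.

```sh
line='0.405: [GC 2112K->750K(7680K), 0.0204170 secs]'

# With ^ the bracket expression matches everything EXCEPT digits and dots,
# so the punctuation and letters are blanked and only the numbers survive.
printf '%s\n' "$line" | awk '{OFS=", "; gsub(/[^0-9.]/," "); $1=$1} 1'
# prints: 0.405, 2112, 750, 7680, 0.0204170

# Without ^ it matches the digits and dots themselves, so the numbers are
# blanked and only the punctuation and letters remain.
printf '%s\n' "$line" | awk '{OFS=", "; gsub(/[0-9.]/," "); $1=$1} 1'
# prints: :, [GC, K->, K(, K),, secs]
```

The second result is the same tail that appeared after the timestamp in the unexpected output reported earlier in the thread.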
From: AyOut on
On Nov 6, 2:37 pm, Ed Morton <mortons...(a)gmail.com> wrote:
> On Nov 6, 2:29 pm, AyOut <mort...(a)gmail.com> wrote:
>
>
>
> > On Nov 6, 2:09 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
> > > On Nov 6, 1:39 pm, AyOut <mort...(a)gmail.com> wrote:
>
> > > > On Nov 4, 10:11 pm, Ed Morton <mortons...(a)gmail.com> wrote:
>
> > > > > AyOut wrote:
> > > > > > I have a GC log file with entries like this one:
>
> > > > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> > > > > > (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> > > > > > secs]
>
> > > > > > I would like to parse this to output for easy plotting using gnuplot
> > > > > > and would like the following output:
>
> > > > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> > > > > > 0.04, 0.02
>
> > > > > Assuming the input is all on one line:
>
> > > > > $ cat file
> > > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
> > > > > 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
>
> > > > > $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
> > > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
>
> > > > >      Ed.
>
> > > > That's a beautiful solution!  Now, there's a change in the log file
> > > > output.  The first field is now a date and time stamp
>
> > > > 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170
> > > > secs]
> > > > 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
> > > > 0.0043760 secs]
>
> > > > and applying this command
>
> > > > cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-
> > > > Z]:*//'|sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|awk -F:
> > > > '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'
>
> > > "beautiful solution" discarded apparently!
>
> > > > generates the following output:
>
> > > > 15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
> > > > 15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > > > where the time stamp (15:00:16) shows up as 15, 00, 16.  Is there a
> > > > way to have the output look like this:
>
> > > > 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> > > > 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> > > > Thanks!
>
> > > Why do you keep going back to pipelines of cat, sed, and awk? If
> > > you're going to use awk anyway, you don't need sed or cat.
>
> > > Try this:
>
> > > awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30);
> > >         gsub(/[[:digit:].]/," "); $1=$1; print t,$0}' file
>
> > >    Ed.
>
> > Thanks, Ed!
>
> > Well, I'm by no means a shell expert.
>
> > Running your command on the file, I get the following output:
>
> > 15:00:16, :, [GC, K->, K(, K),, secs]
>
> Are you sure you copy/pasted my script instead of retyping it?
> Are you sure your input file is the same as you posted?
>
> Look:
>
> $ cat file
> 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K),
> 0.0204170 secs]
> 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
> 0.0043760 secs]
>
> $ awk '{OFS=", "; t=substr($1,12,8); $0=substr($0,30);
>         gsub(/[^[:digit:].]/," "); $1=$1; print t,$0}' file
> 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> Please post exactly the same commands and their output so we can see
> where something's going wrong.
>
>      Ed.

My bad! I lost the ^ in the copy/paste.

Thanks, Ed!
From: Ben Bacarisse on
AyOut <morty3e(a)gmail.com> writes:

> I have a GC log file with entries like this one:
>
> 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> secs]
>
> I would like to parse this to output for easy plotting using gnuplot
> and would like the following output:
>
> 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> 0.04, 0.02

If you can live without the spaces:

tr -sc '0-9.' ,

or (since gnuplot won't mind):

tr -sc '0-9.' ' '

<snip>
--
Ben.
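
One thing to watch with the tr commands above: the newline is not in '0-9.', so its complement includes it, and every line of input ends up joined into one. A possible variant, offered only as a sketch, adds \n to the kept set and strips the trailing comma with sed. The function name gc_tr is made up for illustration, and this applies to the original log format only, since a timestamp prefix would be split at its "-" and ":" characters.

```sh
# gc_tr is a hypothetical helper name, not something from the thread.
gc_tr() {
    # Keep digits, dots and newlines; squeeze every other run into one comma,
    # then drop the comma left at the end of each line.
    tr -sc '0-9.\n' ',' | sed 's/,$//'
}

printf '%s\n' \
    '2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]' |
    gc_tr
# prints: 2.729,70850,6800,152896,70850,6800,502464,0.0165440,0.09,0.04,0.02
```

A line that begins with a non-numeric character would still get a leading comma, which this sketch does not remove.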