From: Markus Dehmann on 20 Jan 2006 22:08

I have a convenient way to open possibly gzip'ed files:

    open(F, ($f =~ m/\.gz$/) ? "gunzip -c $f |" : "$f");

So, if the file name ends in .gz, I send it through gunzip. So far, so good. (I don't want to use the PerlIO::gzip module because it's not installed by default, so it's a hassle.)

But now, my script should be callable in the following ways:

    $ cat data | ./script.pl
    $ ./script.pl data.gz
    $ ./script.pl data

Usually, I would just use the while loop: while(<>){...}. But that does not read gzip'ed data.

How would you handle that? I could think of the following code, but it's long and not nice ...

    if(defined $ARGV[0] && -f $ARGV[0]){
      readFromFile($ARGV[0]);
    }else{
      readFromStdin();
    }

    sub readFromFile{
      my ($f) = @_;
      open(F, ($f =~ m/\.gz$/) ? "gunzip -c $f |" : "$f")
        or die("Could not open $f: $!");
      while(<F>){
        processLine($_);
      }
      close F;
    }

    sub readFromStdin{
      while(<>){
        processLine($_);
      }
    }

    sub processLine{
      ...
    }

Thanks!
Markus

From: attn.steven.kuo@gmail.com on 21 Jan 2006 00:58

Markus Dehmann wrote:
> I have a convenient way to open possibly gzip'ed files:
>
> open(F, ($f =~ m/\.gz$/) ? "gunzip -c $f |" : "$f");
>
> So, if the file name ends in .gz I send it through gunzip. So far, so
> good. (I don't want to use the PerlIO::gzip module because it's not
> installed by default, so it's a hassle.)
>
> But now, my script should be callable in the following ways:
> $ cat data | ./script.pl
> $ ./script.pl data.gz
> $ ./script.pl data
>
> Usually, I would just use the while loop: while(<>){...}. But that does
> not read gzip'ed data.
>
> How would you handle that? I could think of the following code, but it's
> long and not nice ...
>
> if(defined $ARGV[0] && -f $ARGV[0]){
>   readFromFile($ARGV[0]);
> }else{
>   readFromStdin();
> }

(snipped)

Look under 'perldoc perlopentut', where the minus (-) file is discussed:

    my $input = defined($ARGV[0]) ? $ARGV[0] : '-';
    $input = $input =~ /\.gz$/
        ? "gunzip -c $input |"
        : $input ;
    open (FH, $input)
        or die $!;
    process_line($_) while (<FH>);
    close FH;

--
Hope this helps,
Steven

From: jgraber on 23 Jan 2006 00:31

"attn.steven.kuo(a)gmail.com" <attn.steven.kuo(a)gmail.com> writes:
> Markus Dehmann wrote:
> > I have a convenient way to open possibly gzip'ed files:
> > open(F, ($f =~ m/\.gz$/) ? "gunzip -c $f |" : "$f");
> >
> > So, if the file name ends in .gz I send it through gunzip. So far, so
> > good. (I don't want to use the PerlIO::gzip module because it's not
> > installed by default, so it's a hassle.)
> >
> > But now, my script should be callable in the following ways:
> > $ cat data | ./script.pl
> > $ ./script.pl data.gz
> > $ ./script.pl data
> >
> > Usually, I would just use the while loop: while(<>){...}. But that does
> > not read gzip'ed data.
>
> (snipped)
>
> Look under 'perldoc perlopentut'
> where the minus (-) file is discussed:
>
> my $input = defined($ARGV[0]) ? $ARGV[0] : '-';
> $input = $input =~ /\.gz$/
>     ? "gunzip -c $input |"
>     : $input ;
> open (FH, $input)
>     or die $!;
> process_line($_) while (<FH>);
> close FH;

I discovered that my currently installed version of gzip -d would correctly read plain files, gzipped files (.gz), and even packed files (.Z). So now I use gzip -d for everything. According to top, it uses only 1% of the CPU when called uselessly. It also works for the occasional file that is gzipped without a .gz extension, or vice versa. I remember it working for $infile = "-" as well, for those gzipped output pipes.

I've been recommending this as the "universal input pipe":

    $gzip_pid = open( FH, $fp="/usr/local/bin/gzip -dfc $infile |" )
        || die "Cant open input pipe '$fp' : $!\n";

I'm primarily used to writing in perl4 style. I'd welcome the likely followup to this post with an example of a more modern style.

Is this a security hole for the occasional maliciously named file like "x;rm -rf / "?

--
Joel
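
On the two questions above (a more modern style, and the shell-injection worry), here is a minimal sketch rather than a definitive answer. It assumes a Perl new enough to support the list form of a pipe open and a gzip executable reachable through PATH. The name open_via_gzip is hypothetical and does not come from the thread. Because the command is passed as a list, no shell parses the file name, so a name like "x;rm -rf / " reaches gzip as a single argument. Reading from standard input is deliberately left out of the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): open $path for reading through
# "gzip -dfc" with a lexical filehandle and the list form of a pipe open,
# so that no shell ever sees the file name.
sub open_via_gzip {
    my ($path) = @_;

    # Assumption: "gzip" is found via PATH. The '--' ends option parsing,
    # so a name that starts with '-' is not mistaken for an option.
    my @cmd = ('gzip', '-dfc', '--', $path);

    open(my $fh, '-|', @cmd)
        or die "Can't start '@cmd': $!\n";
    return $fh;
}
```

A caller would then write something like `my $fh = open_via_gzip($infile); while (my $line = <$fh>) { ... } close $fh;`, checking the result of close as well, since that is where a non-zero exit status from gzip shows up.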

From: Anno Siegel on 23 Jan 2006 07:02

Markus Dehmann <markus.dehmann(a)gmail.com> wrote in comp.lang.perl.misc:
> I have a convenient way to open possibly gzip'ed files:
>
> open(F, ($f =~ m/\.gz$/) ? "gunzip -c $f |" : "$f");
>
> So, if the file name ends in .gz I send it through gunzip. So far, so
> good. (I don't want to use the PerlIO::gzip module because it's not
> installed by default, so it's a hassle.)
>
> But now, my script should be callable in the following ways:
> $ cat data | ./script.pl
> $ ./script.pl data.gz
> $ ./script.pl data
>
> Usually, I would just use the while loop: while(<>){...}. But that does
> not read gzip'ed data.
>
> How would you handle that? I could think of the following code, but it's
> long and not nice ...

[snip]

    /\.gz$/ and $_ = "gunzip -c $_ |" for @ARGV;
    print while <>;

Anno

--
If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers.
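
The two-liner above works because <> opens each element of @ARGV with the two-argument open, so an element ending in "|" is run as a command. A hedged sketch of the same idea, spelled out and with the file name single-quoted for the shell, might look like this. The helper names sh_quote and rewrite_args are hypothetical, and the quoting assumes a POSIX-style /bin/sh. Names that do not end in .gz are passed through untouched and are still subject to the same magic open.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a string in single quotes for a POSIX shell,
# replacing each embedded single quote with the sequence '\'' .
sub sh_quote {
    my ($s) = @_;
    $s =~ s/'/'\\''/g;
    return "'$s'";
}

# Hypothetical helper: turn every *.gz argument into a "gunzip -c ... |"
# command string for the magic open done by <>; leave other arguments as is.
sub rewrite_args {
    my @args = @_;
    return map { /\.gz$/ ? 'gunzip -c ' . sh_quote($_) . ' |' : $_ } @args;
}
```

In a script this would be used as `@ARGV = rewrite_args(@ARGV); print while <>;`. With an empty @ARGV, <> still falls back to reading standard input, which covers the `cat data | ./script.pl` case.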

From: Markus Dehmann on 23 Jan 2006 16:59

jgraber(a)ti.com wrote:
> "attn.steven.kuo(a)gmail.com" <attn.steven.kuo(a)gmail.com> writes:
>
>>Markus Dehmann wrote:
>>
>>>I have a convenient way to open possibly gzip'ed files:
>>>open(F, ($f =~ m/\.gz$/) ? "gunzip -c $f |" : "$f");
>>>
>>>So, if the file name ends in .gz I send it through gunzip. So far, so
>>>good. (I don't want to use the PerlIO::gzip module because it's not
>>>installed by default, so it's a hassle.)
>>>
>>>But now, my script should be callable in the following ways:
>>>$ cat data | ./script.pl
>>>$ ./script.pl data.gz
>>>$ ./script.pl data
>>>
>>>Usually, I would just use the while loop: while(<>){...}. But that does
>>>not read gzip'ed data.
>>
>>(snipped)
>>
>>Look under 'perldoc perlopentut'
>>where the minus (-) file is discussed:
>>
>>my $input = defined($ARGV[0]) ? $ARGV[0] : '-';
>> $input = $input =~ /\.gz$/
>>     ? "gunzip -c $input |"
>>     : $input ;
>>open (FH, $input)
>>     or die $!;
>>process_line($_) while (<FH>);
>>close FH;
>
>
> I discovered that my currently installed version of gzip -d
> would correctly read plain files, gzipped files (.gz),
> and even packed files (.Z). So now I use gzip -d
> for everything. According to top, it uses only 1%
> of the CPU when called uselessly. It also works
> for the occasional file that is gzipped without a .gz
> extension, or vice-versa. I remember it working for
> $infile = "-" as well, for those gzipped output pipes.
>
> I've been recommending this as the "universal input pipe",
> $gzip_pid = open( FH, $fp="/usr/local/bin/gzip -dfc $infile |" )

Now, a slightly off-topic question: Why do people often use the full path to an application (like here, /usr/local/bin/gzip)? That just makes it less likely to work, since my gzip might be in /usr/bin. Why not just:

    open(F, "gzip -dfc $infile |");

Same thing with the perl command: Why don't we write

    #!perl -w

as the first line of a perl program, and let the $PATH variable figure out which perl is meant?

Thanks! Markus
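
For the path question, one common convention (offered here as an illustration, not as the thread's answer) is to start scripts with `#!/usr/bin/env perl`, which asks env to look perl up on PATH. For external programs such as gzip, a script can do the PATH lookup itself and report a clear error when nothing is found. The sketch below assumes a Unix-like system; find_in_path is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first executable file called $name found
# in the directories listed in $PATH, or undef if there is none.
sub find_in_path {
    my ($name) = @_;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $name);
        # "-x _" reuses the stat result from the preceding -f test.
        return $candidate if -f $candidate && -x _;
    }
    return;
}
```

A script could then call `my $gzip = find_in_path('gzip') or die "gzip not found in PATH\n";` once at startup and use $gzip wherever the thread's examples hard-code /usr/local/bin/gzip.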