From: Janis Papanagnou on
Sven Mascheck wrote:
> Janis wrote:
>
>>>> cat file | cmd
>>>> A workaround for commands which have been compiled without
>>>> largefile support but accept a pipe, e.g. compressing
>>>> utilities.
>> I am puzzled about this one. Why is it a problem for (some?)
>> compression programs to read the file from a non-pipe stdin
>> channel
>
> The problem occurs e.g. if the program detects an ordinary file
> and wants to seek() it (without having large file support).
> Ironically the program might be able to handle a stream -
> a real world example is gzip 1.2.4:
>
> $ dd if=/dev/zero of=large seek=2G count=0 bs=1
> $ ls -l large
> -rw-r--r-- 1 mascheck users 2147483648 2010-02-24 20:16 large
>
> $ gzip-1.2.4 ./large # _llseek() returns "illegal seek"
> ./large: Value too large for defined data type

Okay.

> $ gzip-1.2.4 < ./large > large.gz

(No option -c required?)

> gzip-1.2.4: stdin: fstat(stdin)

What's the meaning of that message? What is gzip trying to achieve?

> $ cat ./large|gzip-1.2.4 > large.gz
> [... ok]

Does gzip differentiate between stdin from pipe and stdin from shell
redirection? And why?

Janis
From: Alan Curry on
In article <hm45cg$22s$1(a)news.m-online.net>,
Janis Papanagnou <janis_papanagnou(a)hotmail.com> wrote:
|Sven Mascheck wrote:
|>
|> The problem occurs e.g. if the program detects an ordinary file
|> and wants to seek() it (without having large file support).
|> Ironically the program might be able to handle a stream -
|> a real world example is gzip 1.2.4:
|>
|> $ dd if=/dev/zero of=large seek=2G count=0 bs=1

This command could be shorter: since none of the zeros from /dev/zero are
actually read, you might as well use if=/dev/null and omit the count=0.
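
For anyone who wants the same test file without dd, here is a minimal sketch
of the idea in Python: extend the file to the wanted length without writing
any data. The name make_sparse_file is made up for this illustration, and the
sketch assumes a POSIX system where os.ftruncate can extend a file.

```python
import os

def make_sparse_file(path, size):
    """Create (or truncate) `path` so it is exactly `size` bytes long
    without writing any data. Hypothetical helper, not part of dd or gzip."""
    # Open for writing, create if missing, discard any old content.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Extend the file to `size`; bytes never written read back as zeros.
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
    # Report the resulting length as seen by stat.
    return os.stat(path).st_size
```

Calling make_sparse_file("large", 2**31) should give a file of 2147483648
bytes, the same length as in the ls output quoted above and the smallest size
that no longer fits in a signed 32-bit offset.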

|> $ ls -l large
|> -rw-r--r-- 1 mascheck users 2147483648 2010-02-24 20:16 large
|>
|> $ gzip-1.2.4 ./large # _llseek() returns "illegal seek"
|> ./large: Value too large for defined data type

When I tried this, it wasn't a seek that failed but an lstat.

|
|Okay.
|
|> $ gzip-1.2.4 < ./large > large.gz
|
|(No option -c required?)

When the input file is not specified by name, where else could the output go
but stdout? gzip can't append ".gz" to a name it was never given.

|
|> gzip-1.2.4: stdin: fstat(stdin)
|
|What's the meaning of that message? What is gzip trying to achieve?

It's a message that really should have included strerror(errno). What it's
trying to achieve by fstat'ing the input file is to get the modification
timestamp, so it can put that into the gzip header.

But the call failed because gzip used the 32-bit fstat, which returns
EOVERFLOW ("Value too large for defined data type") rather than a struct stat
with an incorrect st_size. It has no way to know that the caller is only
interested in st_mtime.
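
To make the two points above concrete, here is a rough sketch in Python of
what the lookup is after and what a more helpful error message could look
like. The function names and the "gzip-sketch" program name are invented for
this example. Python on a current system uses a 64-bit stat, so it will not
hit EOVERFLOW on a 2 GiB file; the error path is shown only to illustrate
including the errno text.

```python
import os

def fstat_error_message(prog, err):
    """Format an fstat failure with the errno text included.
    Hypothetical format, not gzip's actual output."""
    return "%s: stdin: fstat: %s" % (prog, os.strerror(err))

def mtime_or_message(fd, prog="gzip-sketch"):
    """Return (mtime, None) on success or (None, message) on failure.

    Only st_mtime is wanted, but fstat has to fill in the whole
    struct stat, so a size that cannot be represented fails the call.
    """
    try:
        return int(os.fstat(fd).st_mtime), None
    except OSError as exc:
        return None, fstat_error_message(prog, exc.errno)
```

With file descriptor 0 as the argument, this would look up the timestamp of
whatever stdin is connected to.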

|
|> $ cat ./large|gzip-1.2.4 > large.gz
|> [... ok]
|
|Does gzip differentiate between stdin from pipe and stdin from shell
|redirection? And why?

It is fstat that differentiates between them, not gzip. With a redirected
regular file it has to give the correct st_size, which does not fit in 32
bits here, so the call fails. For a pipe, st_size is an automatic 0, so the
call succeeds.
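
The difference is easy to see by calling fstat on both kinds of descriptor.
A small sketch in Python, assuming a POSIX system; describe_fd and demo are
names made up for this example:

```python
import os
import stat
import tempfile

def describe_fd(fd):
    """Return (kind, st_size) for an open file descriptor, using fstat."""
    st = os.fstat(fd)
    if stat.S_ISFIFO(st.st_mode):
        kind = "pipe"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return kind, st.st_size

def demo():
    """Describe a 5-byte regular file and the read end of an empty pipe."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello")
        f.flush()  # make sure the bytes reach the file before fstat
        regular = describe_fd(f.fileno())
    r, w = os.pipe()
    try:
        piped = describe_fd(r)
    finally:
        os.close(r)
        os.close(w)
    return regular, piped
```

Running describe_fd(0) in a script and starting it once as "script < file"
and once as "cat file | script" should show the same distinction for stdin.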

--
Alan Curry