From: Mark White on
Hi,

I have some code to download large files as part of a larger class. I've
been in a discussion with the developer of a library that I'm using who has
told me clearly that my code will not work at all, even though it does. He
is suggesting my problems are due to my not understanding the nature of
fread(), even though the code is very similar to examples on php.net.

I do get very rare timeout problems where the stream_set_timeout() does not
seem to be firing, and PHP exits on a general timeout. However I'm using
this under Tomcat and the logs are not giving me as much information as
under Apache webserver, where I have been unable to reproduce this problem.
So it's proving difficult to track down and I'm not able to reproduce the
error consistently. I would appreciate any comments about the validity of
the code (at the bottom) so I have a better idea whether it is my problem,
or not. It might be that I need to catch and handle the error, but that is
an area where I have no experience as yet.

I'm aware the code could be rewritten using cURL, but for now I'm more after
an understanding of what problems there might be, if any, with this approach.
The server is returning Content-Length in the header, and chunked encoding is
not an approach I'm intending to use right now.

Also, I'd appreciate any ideas on what the developer might mean by the
following quote. He's asked that I no longer use his mailing list and take
my questions to php-general instead, so it would be impolite to ignore that
and ask him to explain further:

"3. The network buffer used by the PHP streams implementation reads data
eagerly. If you fread($socket, 1024) and the network buffer already
contains 24 bytes, PHP will try to read 1000 bytes nevertheless."

My understanding is that fread() will wait until it has 1024 bytes (in this
example) and then return them, unless EOF is encountered, in which case the
data up to EOF is returned. I'm not sure what he's trying to say.
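
For reference, the fread() manual entry says that on network streams a read
also stops when a packet becomes available or the socket timeout occurs, so a
call may return fewer bytes than requested. A minimal sketch to check this
locally, assuming a Unix-like system where stream_socket_pair() with
STREAM_PF_UNIX is available (the names $writer, $reader and $chunk are only
for illustration and are not from the class discussed here):

```php
<?php
// Assumption: Unix-like system. On Windows, STREAM_PF_INET would be needed.
$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
if ($pair === false) {
    exit("could not create socket pair\n");
}
list($writer, $reader) = $pair;

// Put only 24 bytes on the wire, then ask for up to 1024.
fwrite($writer, str_repeat('a', 24));

// Safety net so the read cannot block forever if no data arrives.
stream_set_timeout($reader, 5);

// On a socket stream, fread() returns what is available instead of
// waiting until the full requested length has arrived.
$chunk = fread($reader, 1024);
echo strlen($chunk), " bytes returned\n";

fclose($writer);
fclose($reader);
```

If this shows a short read, then the length passed to fread() is an upper
bound on a socket stream, not an amount it waits for.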

Many thanks for any advice on this.


Mark...

Code:
(The intentions are: this is used for downloading very large files while
avoiding memory problems, and it sits inside a loop over a list of files. If
the socket is unavailable, the download is not attempted for that file. If
the socket is available but no data is received in a 30 second period, that
download should be aborted and retried up to 5 times.)

$download_attempt = 1;
do {
    // Reset on every pass so a failed fopen() cannot reuse a stale flag
    $info = array('timed_out' => false);
    $fs = fopen('http://' . $host . $file, "rb");

    if (!$fs) {
        $this->writeDebugInfo("FAILED to open stream for ",
            "http://" . $host . $file);
    } else {
        $fm = fopen($temp_file_name, "wb"); // binary-safe write mode
        stream_set_timeout($fs, 30);

        while (!feof($fs)) {
            $contents = fread($fs, 4096); // Buffered download
            fwrite($fm, $contents);
            $info = stream_get_meta_data($fs);
            if ($info['timed_out']) {
                break;
            }
        }
        fclose($fm);
        fclose($fs);

        if ($info['timed_out']) {
            // Delete temp file if fails
            unlink($temp_file_name);
            $this->writeDebugInfo("FAILED on attempt " .
                $download_attempt . " - Connection timed out: ",
                $temp_file_name);
            $download_attempt++;
            if ($download_attempt < 5) {
                $this->writeDebugInfo("RETRYING: ", $temp_file_name);
            }
        } else {
            // Move temp file if succeeds
            $media_file_name = str_replace('temp/', 'media/',
                $temp_file_name);
            rename($temp_file_name, $media_file_name);
            $this->newDownload = true;
            $this->writeDebugInfo("SUCCESS: ", $media_file_name);
        }
    }
} while ($download_attempt < 5 && $info['timed_out']);
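
One variation on the inner read loop that may be worth comparing against is
sketched below. It checks the timeout flag and the fread() return value
before anything is written, and counts the bytes copied so the total can be
compared with the Content-Length the server sends. The function name
copyStream() and its return shape are hypothetical and not part of the class
above. The demo uses php://memory streams so that it runs without a network.

```php
<?php
/**
 * Hypothetical helper: copy $in to $out in chunks and report how the
 * loop ended. Works on any pair of open stream resources.
 */
function copyStream($in, $out, $chunkSize = 4096)
{
    $bytes = 0;
    $timedOut = false;

    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);

        // Look at the timeout flag right after the read, before writing.
        $meta = stream_get_meta_data($in);
        if ($meta['timed_out']) {
            $timedOut = true;
            break;
        }

        // false means a read error; '' means nothing more was delivered.
        if ($chunk === false || $chunk === '') {
            break;
        }

        fwrite($out, $chunk);
        $bytes += strlen($chunk); // chunks may be shorter than $chunkSize
    }

    return array('bytes' => $bytes, 'timed_out' => $timedOut);
}

// Demo with in-memory streams standing in for the HTTP stream and temp file.
$in = fopen('php://memory', 'w+b');
fwrite($in, str_repeat('x', 10000));
rewind($in);
$out = fopen('php://memory', 'w+b');

$result = copyStream($in, $out);
echo $result['bytes'], " bytes copied\n";
```

In the download loop, a 'bytes' total that differs from the Content-Length
value would indicate an incomplete transfer even when 'timed_out' is false.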