From: Arne Vajhøj
On 20-05-2010 08:23, Bob wrote:
> On Tue, 18 May 2010 21:50:52 -0400, Arne Vajhøj <arne(a)>
> wrote:
>> On 18-05-2010 20:40, Bob wrote:
>>> On Tue, 18 May 2010 19:55:42 -0400, Arne Vajhøj <arne(a)>
>>> wrote:
>>>> On 18-05-2010 07:29, Bob wrote:
>>>>> I need to scan a large number of web-resident files, primarily to get
>>>>> file size. IOW, a simple operation. Can anyone provide the benefit of
>>>>> their intuition on how to set the timeout, and how many retries to
>>>>> attempt?
>>>>> Currently I have the WebRequest timeout set for 2 seconds, and if the
>>>>> request times out, I loop back and try again. So just 2 tries. Not
>>>>> sure if that's optimal.
>>> I've been using a 2 second timeout, then retrying once if it fails. Is
>>> that what you meant by 'small'?
>> 2 seconds is a pretty huge timeout for HTTP.
> Hi again, Arne. I've run some (time-consuming) tests on the file info
> retrieval function. Reliability actually does stay pretty consistent
> when the timeout is dropped from 2 seconds to 1 second, as long as I do
> at least one retry on failure. At 1/2 sec, I get a few errors, but at
> 1/4 sec, the error rate goes up.

Must be a slow connection.

> Doing at least one retry seems important. Otherwise, even with a 4
> second timeout, I get a considerable number of errors.
> When I say "errors" above, I mean that the WebRequest times out. IOW,
> setting the WebRequest timeout to 4 seconds with no retry does not
> work as well as 1 sec with a single retry.
> Interesting how that works, but it took a long while to do those
> tests.
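
For reference, the short-timeout-plus-retry pattern described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the .NET WebRequest code under discussion; the names head_content_length and size_with_retry, the HEAD request, and the default values are assumptions made for the sketch.

```python
import urllib.request


def head_content_length(url, timeout):
    """Send a HEAD request; return Content-Length as an int, or None if absent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
        return int(value) if value is not None else None


def size_with_retry(url, timeout=1.0, retries=1, fetch=head_content_length):
    """Call fetch(url, timeout) up to 1 + retries times.

    Returns the reported size, or None if every attempt failed
    (or if the server sent no Content-Length header).
    """
    for _ in range(1 + retries):
        try:
            return fetch(url, timeout)
        except OSError:
            # OSError covers socket timeouts and urllib's URLError/HTTPError,
            # so any of these triggers the next attempt.
            continue
    return None
```

With timeout=1.0 and retries=1 this matches the "1 sec with a single retry" setup from the tests; the fetch parameter is only there so the retry loop can be exercised without a network.
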
>> I think doing many in parallel would speed things up a lot.
>> And you can still use the progress bar.
> Now that you mention it, is there an easy way to determine the number
> of 'channels' that would be optimal? There's got to be a logical limit
> on connections.

I would go for 25 threads per core or something like that for
this purpose.
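
A rough sketch of that idea, again in Python rather than .NET and purely as an illustration: a thread pool sized at threads-per-core times the core count, with a callback that could drive the progress bar. The names scan_sizes, get_size and on_progress are made up for this example, and 25 is only the starting guess from above, not a measured optimum.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def scan_sizes(urls, get_size, threads_per_core=25, on_progress=None):
    """Run get_size(url) for each URL on a thread pool; return {url: result}."""
    urls = list(urls)
    # Pool size follows the "N threads per core" rule of thumb.
    workers = max(1, threads_per_core * (os.cpu_count() or 1))
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(get_size, url): url for url in urls}
        # as_completed yields each future as soon as it finishes,
        # which is the natural place to update a progress bar.
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            if on_progress is not None:
                on_progress(done, len(urls))
    return results
```

get_size is expected to handle its own timeouts and retries and return a value rather than raise, since an exception from any one request would propagate out of future.result() and abort the scan.
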

>>>>> Another thing: I've often got WebResponse file sizes that are one byte
>>>>> different from the actual size of the file. Any idea what's up there?
>>>> Difficult to say without an example URL.
>>> I'll try to look for a few examples. I thought maybe that was a very
>>> common thing, given that it seems to be just one byte much of the
>>> time...just seemed 'coincidental'.
>> OK.
> Of course I haven't been able to get that to happen again since my
> last post.