From: Ray Mitchell on


"Ray Mitchell" wrote:

> Hello,
>
> I'm trying to download a binary ZIP file using HTTP. Following an example
> in MSDN and other sources, my code is:
>
> WebClient client = new WebClient();
> client.DownloadFile("http://www.website.com/BinaryFile.zip", "BinaryFile.zip");
>
> Using FTP I can download the file fine, but HTTP downloads something else,
> which is not what is in my file. Instead its content is something like the
> following. It looks to me like some sort of status information is being
> downloaded instead of the actual file:
>
> <html>
> <head>
> <title>website.com: The Leading Storage Archive Site on the Net</title>
> </head>
> <frameset cols="1,*" border=0>
> <frame name="top"
> src="t.php?uid=ws1048efc603a93ed3.10806440&src=&cat=computers%2Finternet%2Fdownloads&kw=Storage+Archive&sc=storage+media"
> scrolling=no frameborder=0 noresize framespacing=0 marginwidth=0
> marginheight=0>
> <frame src="search.php?uid=ws1048efc603a93ed3.10806440&src="
> scrolling="auto" framespacing=0 marginwidth=0 marginheight=0 noresize>
> </frameset>
> <noframes>
> This page requires frames.
> </noframes>
> </html>
>

So, as it turns out, the name of the file was incorrect and the named file
didn't exist. However, I did use a try/catch and just assumed that if the
file didn't exist I'd get an exception, which I didn't. What is the proper
way to approach this? Is there any way to get a directory listing or test
whether a file actually exists before attempting a download? If I keep doing
it the way I am now, I assume I'll have to inspect the contents of whatever
was downloaded to see if it's what I expect. This doesn't seem like the
correct approach. Thanks, Ray
From: Peter Duniho on
On Fri, 10 Oct 2008 15:24:17 -0700, Ray Mitchell
<RayMitchell_NOSPAM_(a)meanoldteacher.com> wrote:

> So, as it turns out, the name of the file was incorrect and the named
> file didn't exist. However, I did use a try/catch and just assumed that
> if the file didn't exist I'd get an exception, which I didn't. What is
> the proper way to approach this? Is there any way to get a directory
> listing or test if a file actually exists before attempting a download?

Not reliably, no. It would depend on the server, and the exact behavior
is not standardized as far as I know (I've seen different HTTP servers
return directory information in a variety of different ways).

> If I just do it like I'm currently doing it, I assume I'll have to
> actually inspect the contents of what did get downloaded to see if it's
> what I expect. This doesn't seem like the correct approach. Thanks, Ray

I agree that inspecting the returned content itself is less than optimal.

It seems to me that you _should_ be able to somehow get the status code
for the HTTP response. This is the three-digit numeric value that's
returned in the very first response line, even before any headers are sent
from the server. You would be able to inspect the status code to
determine whether the retrieval was actually successful or not.

But even that depends on the HTTP server somewhat, in that some are
configured to return an HTML page with a success status code when some
failure occurs.

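Assuming you have a server that _does_ set the status code correctly (it
could return an error page and still set the status code), you should be
able to get at it. WebClient has no public GetResponse() method (only the
protected GetWebResponse()), but you can create an HttpWebRequest with
WebRequest.Create(), call GetResponse(), cast the return value to
HttpWebResponse, and look at the StatusCode property. Note that for 4xx
and 5xx status codes GetResponse() throws a WebException, and the status
code is then available through the exception's Response property.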
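The status-code check described above might be sketched roughly as follows.
This is only a sketch under some assumptions: the class and method names
(HttpDownload, IsSuccess, Download) are made up for illustration, only 200 OK
is treated as success, and Stream.CopyTo() requires .NET 4 or later.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper; the names here are illustrative, not from any library.
static class HttpDownload
{
    // Assumption: only 200 OK counts as a successful file retrieval.
    public static bool IsSuccess(HttpStatusCode status)
    {
        return status == HttpStatusCode.OK;
    }

    // Requests url, saves the body to path only on success, and returns
    // the status code the server reported.
    public static HttpStatusCode Download(string url, string path)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                if (IsSuccess(response.StatusCode))
                {
                    using (Stream body = response.GetResponseStream())
                    using (FileStream file = File.Create(path))
                    {
                        body.CopyTo(file);
                    }
                }
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // GetResponse() throws for 4xx/5xx; the attached response,
            // if there is one, still carries the status code.
            HttpWebResponse error = ex.Response as HttpWebResponse;
            if (error == null)
            {
                throw; // no HTTP response at all (DNS failure, timeout, ...)
            }
            using (error)
            {
                return error.StatusCode;
            }
        }
    }
}
```

A caller would compare the returned value against HttpStatusCode.OK before
trusting the file. A server that answers a bad URL with 200 and an HTML page
will still pass this check, so it does not replace validating the content.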

I think that that's probably the most reliable way to detect an error, but
it does depend on the server being well-behaved. That said, that's true
for ALL network operations, so you should always be coding defensively in
any case. Assume at every point along the way that you might receive a
response other than what's valid.
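As one example of coding defensively here, a cheap sanity check on the
downloaded file is to look at its first four bytes. A ZIP archive normally
starts with the signature "PK" 03 04 (or "PK" 05 06 for an empty archive),
whereas the HTML page in the original post starts with "<htm". The sketch
below assumes that check is good enough for this purpose; the names ZipCheck,
LooksLikeZip and FileLooksLikeZip are made up for illustration, and the check
does not prove the archive is intact.

```csharp
using System;
using System.IO;

// Hypothetical helper; the names here are illustrative only.
static class ZipCheck
{
    // True if the bytes begin with "PK" followed by 03 04 (local file
    // header) or 05 06 (end of central directory, i.e. an empty archive).
    public static bool LooksLikeZip(byte[] head)
    {
        if (head == null || head.Length < 4) return false;
        if (head[0] != 0x50 || head[1] != 0x4B) return false;
        return (head[2] == 0x03 && head[3] == 0x04)
            || (head[2] == 0x05 && head[3] == 0x06);
    }

    // Reads the first four bytes of the file at path and checks them.
    public static bool FileLooksLikeZip(string path)
    {
        byte[] head = new byte[4];
        using (FileStream file = File.OpenRead(path))
        {
            int read = 0;
            while (read < head.Length)
            {
                int n = file.Read(head, read, head.Length - read);
                if (n == 0) return false; // file is shorter than 4 bytes
                read += n;
            }
        }
        return LooksLikeZip(head);
    }
}
```

Calling ZipCheck.FileLooksLikeZip("BinaryFile.zip") after the download would
have flagged the HTML page from the original post as not being a ZIP file.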

Pete
From: Ray Mitchell on

Thanks Pete, as always. The more I consider this the more I think that FTP
is the right way to go. Ray
From: Peter Duniho on
On Fri, 10 Oct 2008 17:27:00 -0700, Ray Mitchell
<RayMitchell_NOSPAM_(a)meanoldteacher.com> wrote:

> Thanks Pete, as always. The more I consider this the more I think that
> FTP is the right way to go. Ray

It might be. Or it might not. It depends a lot on the use case.

FTP should definitely offer a less variable experience. But some networks
are set up to allow only HTTP traffic. You might consider providing the
user with the choice as to what approach they want to use.

Of course, if you have complete end-to-end control over the entire system,
then you could simply go with what works best for your purposes. But it
sounds like you don't.

Have you at least checked the error case that you're looking at right now
to see whether the StatusCode property is set accordingly? That would be a
useful data point for your decision-making.

Pete
From: Ray Mitchell on

Pete,

I haven't checked the status code yet, but before I put out any more effort
in the HTTP arena I'm still considering whether it's even appropriate for my
application, especially considering the difficulty in getting directory/file
listings with HTTP versus the ease of doing it with FTP, and I definitely
need such listings. I actually will have control over the end-to-end system
since I will be specifying what the users must have available.

My application uses an FTP client to upload thousands of files into a
complex directory configuration on a web server. Then there will be multiple
users that need to download some of them that they haven't already
downloaded. Ultimately I'll need to restrict certain users to certain
directories. I know I can do this in FTP by setting up separate
password-protected user subaccounts with limited access, but I haven't looked
into this possibility using HTTP. However, I keep coming back to the issues
involved in getting directory/file listings and FTP seems to be the clear
winner there.
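
For what it's worth, here is a rough sketch of what I have in mind for the
FTP side: get the names in a directory with FtpWebRequest, then work out
which ones a user doesn't have yet. The names FtpSync, ListDirectory and
NotYetDownloaded are just placeholders, the comparison assumes file names
are case-sensitive, and it assumes the server returns one plain name per
line for a ListDirectory request, which may vary between servers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

// Hypothetical helper; the names here are placeholders.
static class FtpSync
{
    // Returns the names the server lists for the given FTP directory URL.
    public static List<string> ListDirectory(string ftpUrl, string user, string password)
    {
        FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpUrl);
        request.Method = WebRequestMethods.Ftp.ListDirectory;
        request.Credentials = new NetworkCredential(user, password);

        List<string> names = new List<string>();
        using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length > 0) names.Add(line);
            }
        }
        return names;
    }

    // Returns the remote names that are not in the local list, in the
    // order the server listed them. Assumption: case-sensitive names.
    public static List<string> NotYetDownloaded(IEnumerable<string> remote, IEnumerable<string> local)
    {
        HashSet<string> have = new HashSet<string>(local, StringComparer.Ordinal);
        List<string> missing = new List<string>();
        foreach (string name in remote)
        {
            if (!have.Contains(name)) missing.Add(name);
        }
        return missing;
    }
}
```

The per-user directory restrictions would then come from the credentials
passed to ListDirectory, since each subaccount only sees what it is allowed
to see.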

Ray