From: Ray Mitchell on 10 Oct 2008 18:24

"Ray Mitchell" wrote:
> Hello,
>
> I'm trying to download a binary ZIP file using HTTP. Following an example
> in MSDN and other sources, my code is:
>
> WebClient client = new WebClient();
> client.DownloadFile("http://www.website.com/BinaryFile.zip", "BinaryFile.zip");
>
> Using FTP I can download the file fine, but HTTP downloads something else,
> which is not what is in my file. Instead its content is something like the
> following. It looks to me like some sort of status information is being
> downloaded instead of the actual file:
>
> <html>
> <head>
> <title>website.com: The Leading Storage Archive Site on the Net</title>
> </head>
> <frameset cols="1,*" border=0>
> <frame name="top"
> src="t.php?uid=ws1048efc603a93ed3.10806440&src=&cat=computers%2Finternet%2Fdownloads&kw=Storage+Archive&sc=storage+media"
> scrolling=no frameborder=0 noresize framespacing=0 marginwidth=0
> marginheight=0>
> <frame src="search.php?uid=ws1048efc603a93ed3.10806440&src="
> scrolling="auto" framespacing=0 marginwidth=0 marginheight=0 noresize>
> </frameset>
> <noframes>
> This page requires frames.
> </noframes>
> </html>

So, as it turns out, the name of the file was incorrect and the named file didn't exist. However, I did use a try/catch and just assumed that if the file didn't exist I'd get an exception, which I didn't.

What is the proper way to approach this? Is there any way to get a directory listing or test if a file actually exists before attempting a download? If I just do it like I'm currently doing it, I assume I'll have to actually inspect the contents of what did get downloaded to see if it's what I expect. This doesn't seem like the correct approach.

Thanks,
Ray
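
Inspecting what was downloaded can at least be made cheap for ZIP files, because an archive normally begins with a known signature. The following is only a minimal sketch and is not from the thread: the names ZipCheck, LooksLikeZip and FileLooksLikeZip are made up for illustration, and it assumes that recognizing the two most common signatures, "PK\x03\x04" (a regular archive) and "PK\x05\x06" (an empty one), is good enough.

```csharp
using System;
using System.IO;

static class ZipCheck
{
    // Hypothetical helper: true if the first four bytes match a common
    // ZIP signature ("PK\x03\x04" or "PK\x05\x06").
    public static bool LooksLikeZip(byte[] head)
    {
        if (head == null || head.Length < 4) return false;
        if (head[0] != 0x50 || head[1] != 0x4B) return false; // 'P', 'K'
        return (head[2] == 0x03 && head[3] == 0x04)
            || (head[2] == 0x05 && head[3] == 0x06);
    }

    // Reads the first four bytes of a file and applies the check above.
    public static bool FileLooksLikeZip(string path)
    {
        byte[] head = new byte[4];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop.
            while (total < head.Length)
            {
                int n = fs.Read(head, total, head.Length - total);
                if (n == 0) break; // end of file
                total += n;
            }
        }
        return total == head.Length && LooksLikeZip(head);
    }
}
```

Calling ZipCheck.FileLooksLikeZip("BinaryFile.zip") after the download would reject the HTML frameset page quoted above, since that file starts with "<htm" rather than "PK". It says nothing about whether the archive is complete or uncorrupted.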
From: Peter Duniho on 10 Oct 2008 20:06

On Fri, 10 Oct 2008 15:24:17 -0700, Ray Mitchell <RayMitchell_NOSPAM_(a)meanoldteacher.com> wrote:

> So, as it turns out, the name of the file was incorrect and the named
> file didn't exist. However, I did use a try/catch and just assumed that
> if the file didn't exist I'd get an exception, which I didn't. What is
> the proper way to approach this. Is there any way to get a directory
> listing or test if a file actually exists before attempting a download?

Not reliably, no. It would depend on the server, and the exact behavior is not standardized as far as I know (I've seen different HTTP servers return directory information in a variety of different ways).

> If I just do it like I'm currently doing it, I assume I'll have to
> actually inspect the contents of what did get downloaded to see if it's
> what I expect. This doesn't seem like the correct approach. Thanks, Ray

I agree that inspecting the returned content itself is less than optimal.

It seems to me that you _should_ be able to somehow get the status code for the HTTP response. This is the three-digit numeric value that's returned in the very first response line, even before any headers are sent from the server. You would be able to inspect the status code to determine whether the retrieval was actually successful or not.

But even that depends on the HTTP server somewhat, in that some are configured to return an HTML page, without any error status, when a failure occurs.

Assuming you have a server that _does_ set the status code correctly (it could return an error page and still set the status code), you should be able to get at the HttpWebResponse and look at the StatusCode property. WebClient doesn't expose the response publicly (it's only reachable through the protected GetWebResponse() method), so you would either subclass WebClient or use HttpWebRequest directly, call GetResponse(), and cast the return value to HttpWebResponse.

I think that that's probably the most reliable way to detect an error, but it does depend on the server being well-behaved. That said, that's true for ALL network operations, so you should always be coding defensively in any case. Assume at every point along the way that you might receive a response other than what's valid.

Pete
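
The status-code check described above might look roughly like the sketch below. It is only an illustration under stated assumptions: HttpDownload, IsPlausibleFile and TryDownload are hypothetical names, and the rule "200 OK and not text/html" is an assumed heuristic aimed at servers that answer a missing file with an ordinary HTML page. Note that HttpWebRequest.GetResponse() itself throws a WebException for 4xx and 5xx responses, so those statuses show up in the catch block rather than in StatusCode.

```csharp
using System;
using System.IO;
using System.Net;

static class HttpDownload
{
    // Assumed heuristic: only a 200 OK response that is not an HTML page
    // is treated as a plausible binary file.
    public static bool IsPlausibleFile(HttpStatusCode status, string contentType)
    {
        if (status != HttpStatusCode.OK) return false;
        if (contentType == null) return true; // no type given; cannot rule it out
        return !contentType.StartsWith("text/html", StringComparison.OrdinalIgnoreCase);
    }

    // Downloads url to localPath; returns false instead of saving an error page.
    public static bool TryDownload(string url, string localPath)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                if (!IsPlausibleFile(response.StatusCode, response.ContentType))
                    return false;

                using (Stream body = response.GetResponseStream())
                using (FileStream file = File.Create(localPath))
                {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = body.Read(buffer, 0, buffer.Length)) > 0)
                        file.Write(buffer, 0, n);
                }
                return true;
            }
        }
        catch (WebException ex)
        {
            // 4xx/5xx responses arrive here; the status code is on ex.Response.
            HttpWebResponse error = ex.Response as HttpWebResponse;
            Console.Error.WriteLine(error != null
                ? "HTTP error: " + (int)error.StatusCode
                : "Network error: " + ex.Status);
            return false;
        }
    }
}
```

A call such as HttpDownload.TryDownload("http://www.website.com/BinaryFile.zip", "BinaryFile.zip") would then return false for the frameset page in the original post, provided the server labels it as text/html. A server that returns 200 with a misleading content type would still get through, which is why the defensive-coding advice above still applies.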
From: Ray Mitchell on 10 Oct 2008 20:27

"Peter Duniho" wrote:
> I think that that's probably the most reliable way to detect an error, but
> it does depend on the server being well-behaved. That said, that's true
> for ALL network operations, so you should always be coding defensively in
> any case. Assume at every point along the way that you might receive a
> response other than what's valid.
>
> Pete

Thanks Pete, as always. The more I consider this the more I think that FTP is the right way to go.

Ray
From: Peter Duniho on 10 Oct 2008 21:34

On Fri, 10 Oct 2008 17:27:00 -0700, Ray Mitchell <RayMitchell_NOSPAM_(a)meanoldteacher.com> wrote:

> Thanks Pete, as always. The more I consider this the more I think that
> FTP is the right way to go. Ray

It might be. Or it might not. It depends a lot on the use case.

FTP should definitely offer a less variable experience. But some networks are set up to allow only HTTP traffic. You might consider providing the user with the choice as to what approach they want to use.

Of course, if you have complete end-to-end control over the entire system, then you could simply go with what works best for your purposes. But it sounds like you don't.

Have you at least checked the error case that you're looking at right now to see whether the StatusCode property is set accordingly? That would be a useful data point for your decision-making.

Pete
From: Ray Mitchell on 10 Oct 2008 22:06

"Peter Duniho" wrote:
> Have you at least checked the error case that you're looking at right now
> to see whether the StatusCode field is set accordingly? That would be a
> useful data point for your decision-making.
>
> Pete

Pete,

I haven't checked the status code yet, but before I put any more effort into the HTTP arena I'm still considering whether it's even appropriate for my application, especially considering the difficulty in getting directory/file listings with HTTP versus the ease of doing it with FTP, and I definitely need such listings.

I actually will have control over the end-to-end system since I will be specifying what the users must have available. My application uses an FTP client to upload thousands of files into a complex directory configuration on a web server. Then there will be multiple users that need to download some of them that they haven't already downloaded.

Ultimately I'll need to restrict certain users to certain directories. I know I can do this in FTP by setting up separate password-protected user subaccounts with limited access, but I haven't looked into this possibility using HTTP. However, I keep coming back to the issues involved in getting directory/file listings, and FTP seems to be the clear winner there.

Ray
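
For reference, the FTP directory listing Ray has in mind could be sketched with FtpWebRequest roughly as follows. This is an illustration only: FtpListing, ParseNames and ListDirectory are hypothetical names, the URL and credentials in the usage note are placeholders, and the sketch assumes the server answers the name-list command with one entry per line, which most servers do but which is not guaranteed to be free of path prefixes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

static class FtpListing
{
    // Splits a name-list reply into entries, skipping blank lines.
    public static List<string> ParseNames(TextReader reader)
    {
        List<string> names = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length > 0) names.Add(line);
        }
        return names;
    }

    // Asks the server for the names in the directory given by ftpUrl.
    public static List<string> ListDirectory(string ftpUrl, string user, string password)
    {
        FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpUrl);
        request.Method = WebRequestMethods.Ftp.ListDirectory;
        request.Credentials = new NetworkCredential(user, password);

        using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return ParseNames(reader);
        }
    }
}
```

A call might look like FtpListing.ListDirectory("ftp://ftp.example.com/files/", "user", "password"), where the host, path and account are placeholders. Comparing the returned names against the files already present locally would then give the set that still needs to be downloaded.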