From: Abe Simpson on
Hi all,

Not sure if this group is the right place to ask this question, but I will
try anyway.

I have been writing an app to analyze my IIS logs for marketing purposes,
and I am especially interested in the number of downloads of the product we
sell, myproduct.exe. In my logs I find very large clusters of entries with
status code 206 (Partial Content), and the client IP address is almost
always from a country with a high fraud profile. For example:

2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 196925 234 1812
2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 131394 245 1687
2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 65858 245 1250
2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 65858 245 1625
2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 65857 245 1109
2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 0 245 1281
2010-07-27 23:18:31 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 0 32299 245 921
2010-07-27 23:18:32 W3SVC21 xx.xx.xx.xx GET /myproduct.exe - 80 - 78.163.76.85 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.9.2.8)+Gecko/20100722+Firefox/3.6.8 - http://www.mysite.com/downloadpage.html 206 0 64 0 245 1015


I don't want my app to count those "downloads", since they are clearly not
"good" ones. But what exactly are they? How are they created? Why do they
always come in large clusters (20 or 30 entries) and always from
high-fraud countries?

Thanks for the insight.

-- Abe

From: Dan on

If the client requests a range of bytes (an HTTP Range request), the web
server sends just that range and answers with status 206. If you have a lot
of these for the same client, it could be running a download manager that
requests just a small chunk at a time, possibly via multiple simultaneous
connections, and reassembles the parts on the client side. You will want to
count them, as they are actual downloads, but you'll need to handle them
specially because each request is not a full download.
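
One way to "handle them specially" is to add up the bytes sent per client
and count a download only once the total covers the whole file. The Python
below is a rough sketch, not a tested log analyzer. The field order is an
assumption inferred from the sample entries in the original post (a real
parser should read the "#Fields:" header of the log instead), and
parse_line, count_downloads and file_size are made-up names for
illustration.

```python
from collections import defaultdict

# Assumed field order, inferred from the sample entries in this thread.
FIELDS = ["date", "time", "s-sitename", "s-ip", "cs-method", "cs-uri-stem",
          "cs-uri-query", "s-port", "cs-username", "c-ip", "cs(User-Agent)",
          "cs(Cookie)", "cs(Referer)", "sc-status", "sc-substatus",
          "sc-win32-status", "sc-bytes", "cs-bytes", "time-taken"]


def parse_line(line):
    """Return a dict for one log entry, or None for comments/malformed lines."""
    if line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def count_downloads(lines, target, file_size):
    """Count one download per (date, client IP, user agent) whose 200/206
    responses for `target` add up to at least `file_size` bytes."""
    sent = defaultdict(int)
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["cs-uri-stem"] != target:
            continue
        if entry["sc-status"] not in ("200", "206"):
            continue
        key = (entry["date"], entry["c-ip"], entry["cs(User-Agent)"])
        sent[key] += int(entry["sc-bytes"])
    return sum(1 for total in sent.values() if total >= file_size)
```

Note that this is only a heuristic: sc-bytes includes response headers, and
a client that re-requests the same range is counted twice, so the total can
overstate what was actually received.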

E.g. the client requests bytes 0-1023, then 1024-2047, etc. (HTTP byte
ranges are zero-based and inclusive); it is downloading 1KB at a time in
pieces and reassembling them. This is good for resilience: if a connection
drops, only a small part of the file needs to be downloaded again rather
than the whole thing.
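
To make the chunking concrete, here is a small Python sketch of how a
download manager might split a file into Range header values. The function
name range_headers and the sizes are made up for illustration; real
download managers pick their own chunk sizes.

```python
def range_headers(total_size, chunk_size):
    """Yield the Range header value for each chunk of a total_size-byte file."""
    for start in range(0, total_size, chunk_size):
        # HTTP byte ranges are zero-based and the end position is inclusive.
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes={start}-{end}"


for value in range_headers(2500, 1024):
    print("Range:", value)
```

For a 2500-byte file in 1KB chunks this gives three requests, the last one
shorter than the others. Each of them shows up as its own 206 line in the
server log.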

The other possibility is that you're seeing the result of the client
losing the connection repeatedly. Each time the connection drops, the
client's download software can simply ask for the download to resume after
the last byte it had successfully received. This is again of great benefit
to both your server and the client: if, for instance, the file is 100MB and
the client downloads 99MB and then drops out, it only has to download the
remaining 1MB instead of the entire 100MB file again.
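
As a rough illustration of the resume case, this is what the client side
might do in Python: look at how many bytes are already on disk and ask only
for the rest. The name build_resume_request and the ".part" file are
hypothetical, and the sketch only builds the request, it does not send it.

```python
import os
import urllib.request


def build_resume_request(url, partial_path):
    """Build a GET request that resumes after the bytes already saved."""
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if have:
        # Open-ended range: everything from byte offset `have` to the end.
        request.add_header("Range", f"bytes={have}-")
    return request


req = build_resume_request("http://www.mysite.com/myproduct.exe",
                           "myproduct.exe.part")
print(req.get_header("Range"))
```

A server that supports ranges answers such a request with 206, so every
resume attempt adds one more 206 entry for the same client.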

--
Dan