From: Joseph M. Newcomer on
See below...
On Fri, 19 Mar 2010 19:21:17 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:OdDHnT7xKHA.5480(a)TK2MSFTNGP06.phx.gbl...
>> Peter Olcott Asked:
>>
>>> If I can not reject a file at the HTTP level, then I have
>>> to work at a lower level. In the ideal case I can receive
>>> the file size before any of the rest of the file is sent,
>>> and reject it with only a few bytes of wasted bandwidth.
>>
>>
>> You will get the size (Content-Length:) of the HTTP
>> request body from the HTTP request header block.
>>
>> You can issue an error at that point after receiving the
>> header and before receiving the body. This is described
>> in HTTP 1.1 standard
>
>This also includes all lower levels (TCP/IP, et cetera) of
>any data transmitted over HTTP?
>
>I imagine that there could be an underlying buffer that
>another lower level protocol uses, such that when HTTP sees
>the first 20 bytes, this lower level protocol has already
>eaten up 100K of my bandwidth quota.
****
You are completely missing the point here, once again getting hung up on irrelevant
concepts such as "buffers" and "packets". I repeat, you have NO CONTROL over any of this,
which is purely magic. Your app gets a stream. You have NO IDEA how much of that stream
is sitting in buffers in your server already; it might be 276 bytes, and it might be 100K
bytes, and it is nothing you need to worry about. That's what shutdown() is for: it says
"kill off any incoming buffered data I haven't received yet and don't bother receiving any
more" and "make sure any outgoing buffers are sent NOW!" (each of these is indicated by a
1-bit flag that forms the parameter value of shutdown()). You are obsessing over
irrelevancies here. And go read about TCP/IP buffer management and packet flow, and then
realize that NONE of these aspects of TCP/IP are remotely visible to you as a TCP/IP
programmer, even if you are ignoring the HTTP level and writing raw socket code! For HTTP,
you are typically going to get the whole thing that was sent, before the protocol bothers
to inform you that something has arrived; the "something" is an atomic entity. It is
going to feed it as a stream to stdin, and you will have NO IDEA what packet magic, MTU
boundaries, etc. are involved in getting that stream to your app. So just go do it, and
stop worrying about cases that are not going to be any problem in practice. It is simply
MANDATORY that your server will validate every detail of the file format, including
illegal PNG encodings; anything else is a minor implementation detail, and should be below
your radar.
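
The effect of shutdown() on a stream can be seen on a connected pair of local sockets.
This is only a minimal sketch in Python, not the Winsock/MFC code under discussion; the
function name half_close_demo is made up for illustration. Python's SHUT_WR corresponds
to SD_SEND in Winsock. The sketch shows that bytes sent before the shutdown still arrive,
that the reader then sees end-of-stream, and that the other direction remains usable.

```python
import socket

def half_close_demo():
    """Show that shutdown(SHUT_WR) ends only one direction of a stream."""
    a, b = socket.socketpair()          # connected stream pair standing in for client/server
    try:
        a.sendall(b"header bytes")
        a.shutdown(socket.SHUT_WR)      # a promises to send nothing further
        received = b.recv(1024)         # data sent before the shutdown is still delivered
        eof = b.recv(1024)              # then the reader sees end-of-stream (empty bytes)
        b.sendall(b"rejected")          # the opposite direction is still open
        reply = a.recv(1024)
        return received, eof, reply
    finally:
        a.close()
        b.close()
```

Nothing here exposes packet or buffer boundaries; the reader sees only bytes and then the
end of the stream.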

Yes, bad transmissions use up bandwidth. Technically, we refer to this using the phrase
"life is hard". Meaning, tough, suck it up and live with the fact. Until there is
convincing proof that this is a serious issue, it is not worth worrying about. Note that
you can create a "blacklist" of IP addresses that you refuse to accept connections from,
and if you do this, realize that these addresses (used by crackers who are trying to break
your program) are probably spoofed addresses of potentially legitimate users. Life is
hard. You can "age" the blacklist so repeated trials from the same script kiddie will
probably be rejected right away, but in case they spoofed a legitimate address, it will
become legal after N hours, for some N of your choice (N=24, N=24*k for k a number of
days, are good initial ideas for aging parameters).
joe
****
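
The "aged" blacklist described above could look roughly like the following. This is a
minimal Python sketch under stated assumptions, not code from anyone in the thread: the
class name AgingBlacklist, its methods, and the injectable clock parameter are all
hypothetical, and the default of 24 hours is simply the N=24 suggestion.

```python
import time

class AgingBlacklist:
    """Blocked IP addresses that become legal again after max_age_seconds."""

    def __init__(self, max_age_seconds=24 * 3600, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock              # injectable so the aging can be tested
        self._blocked_at = {}           # ip -> time the block was recorded

    def block(self, ip):
        """Record (or refresh) a block for this address."""
        self._blocked_at[ip] = self.clock()

    def is_blocked(self, ip):
        """True while the address is blocked; expired entries are forgotten."""
        blocked_at = self._blocked_at.get(ip)
        if blocked_at is None:
            return False
        if self.clock() - blocked_at >= self.max_age:
            del self._blocked_at[ip]    # entry has aged out; forgive the address
            return False
        return True
```

A server would call is_blocked() on the peer address right after accepting a connection
and close the socket immediately when it returns True. Because a spoofed address may
belong to a legitimate user, the entry expires on its own instead of staying forever.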
>
>> RFC 2068, section 8.2 Message Transmission Requirements:
>>
>> http://www.ietf.org/rfc/rfc2068.txt
>>
>> This is a feature of a HTTP 1.1 server, not HTTP 1.0
>> server which generally requires the entire payload to be
>> received first.
>>
>> Your HTTP 1.1 web server needs to do this very carefully,
>> otherwise it can cause resends by the clients.
>>
>>> After I determine that the file is not too large I then
>>> get however many minimal bytes are required to determine
>>> the file type, and then reject the rest of the file if it
>>> is not 24-bit PNG.
>>
>>
>> Only HTTP 1.1 clients will gracefully support a mid-stream
>> reject by the web server. Otherwise, resends can occur.
>>
>> In other words, if the client is using HTTP 1.0, you will
>> see that in the first line of the HTTP request header
>> block, you could either reject the usage of this client or
>> ignore the fact the user will see irregular "disconnected"
>> error pages.
>>
>> --
>> HLS
>
>Since HTTP 1.1 has been around for 14 years I may simply
>reject all HTTP 1.0 calls and request the user update to a
>newer browser.
>
>Or I could force the user to use some sort of java applet
>that sends the data using the HTTP 1.1 format. This client
>side code could also verify the size and type of the file
>(specifically 24-bit PNG) before anything is sent. It could
>also provide a file search dialogbox that only looks for PNG
>files. With this scenario I would no longer be limited to
>HTTP, I could devise my own protocol that could strip off
>some of the extra HTTP baggage.
>
>Would I be back to sockets again if I did this? Would this
>be programming at the TCP/IP level or some other level?
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Hector Santos on
Peter Olcott wrote:

>> Now, I think you are overly concerned about this since I
>> believe you said your transactions will be authenticated.
>> If so, your abuse is a lot less and the only concern is
>> compromised users, which will be pretty rare IMO. Unless
>> you become a MAJOR ATTRACTIVE site worth attacking, just
>> follow basic ideas and quit trying to control every aspect
>> of your design. Get something completed first.
>
> Does marking the IP cost me any future bandwidth?


No, other than the annoyance that someone is bothering you. If you
spent time eyeballing your computer and logs and see:

2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:02 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:02 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:03 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:03 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED
2010-03-19 11:38:03 CONNECT ATTEMPT: IP 1.2.3.4 BLOCKED

and that bothers you (as it does for some customers), then don't log
it if you don't want to worry about it, or filter it at the network
level using the Microsoft IP Helper Library. It's what SNORTON and all
the others use.

In short, you are not going to stop the abusive nature of the world
trying to break into your system. Once you have a server running open
to the public network, you are open not only to BAD GUYS but all the
WEB SPIDERS out there bothering your system.

--
HLS
From: Joseph M. Newcomer on
There are several interpretations to the phrase "raw sockets".

For example, if you don't use CAsyncSocket but use the base socket API including the select
function, you are doing "raw socket programming".

In another context, if you provide the exact packet bits, including the headers normally
provided by your protocol stack (so you can spoof the sending IP and Port #, among other
evil things) you are said to be using "raw sockets".

If you create the socket and specify SOCK_RAW instead of SOCK_DGRAM or SOCK_STREAM you
are able to spoof packets, and this is sometimes called "raw sockets".

If you implement the low-level protocol (remember the one I gave with the length followed
by a semicolon?) then you are sometimes said to be doing "raw" socket programming, even if
you use CAsyncSocket, because you are implementing the stream protocol directly by talking
to the socket.

If you use a server that implements HTTP, and you handle it by having the server route the
data to your component (external, e.g., a CGI script, or internal, what has been referred
to here as "embedded"), you are not doing "raw" sockets; but SOMEONE wrote code to do
bind(), listen(), accept(), recv() and send(), and whoever did that can be said, by some
interpretations of the phrase "raw sockets" to have done raw socket programming.
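
For reference, the bind(), listen(), accept(), recv() and send() sequence named above,
in the first sense of "raw socket programming" (the plain socket API on an ordinary
SOCK_STREAM socket), might be sketched like this. It is Python, not the Winsock or
CAsyncSocket code discussed in the thread, and the function names serve_one_connection
and round_trip are invented for the example. It handles one loopback connection only.

```python
import socket
import threading

def serve_one_connection(listener):
    """accept() one client, recv() its bytes, send() them back upper-cased."""
    conn, _peer = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def round_trip(message):
    """Start a one-shot server on loopback, send message, return the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # TCP stream socket
    listener.settimeout(5.0)            # do not wait forever if no client shows up
    listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_one_connection, args=(listener,))
    server.start()
    try:
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)
    finally:
        server.join()
        listener.close()
    return reply
```

No packet headers appear anywhere in this code; the protocol stack supplies them, which
is exactly what distinguishes it from the SOCK_RAW sense of the phrase.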

Note that you do NOT want to implement a protocol other than UDP or TCP/IP using raw
datagrams (where you supply your own headers, for example). This way lies madness. Well,
more madness than whatever madness led to such a decision. UDP and TCP/IP are understood
by every router in the world, and the routers know what to do with them. Your own
protocol will be a mystery, and will Not Play Well With Others. That level of raw socket
programming should be avoided.
joe

On Fri, 19 Mar 2010 21:00:25 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:OwVal18xKHA.5936(a)TK2MSFTNGP04.phx.gbl...
>> Peter Olcott wrote:
>>
>>
>>> Would I be back to sockets again if I did this? Would
>>> this be programming at the TCP/IP level or some other
>>> level?
>>
>> Peter you won't be programming in TCP/IP. The socket
>> library handles all that for you.
>>
>> When you open a socket (in TCP mode), try to connect to a
>> certain IP address at a certain port, the client (your
>> software) is already designed to know what SERVER protocol
>> he will be using to talk to the remote server.
>>
>> In general, when you open port 80, that means you will be
>> sending HTTP data over that port.
>>
>> The socket stack layer transmits (and receives) using TCP
>> packets. Its hidden from you.
>>
>> --
>> HLS
>
>I have heard of raw sockets. From what I understand, at this
>level not all of the lower level is hidden here. What level
>are raw sockets exactly?
>
>Okay so I would be using the sockets library in TCP mode. I
>am guessing that I could cut out a lot of the HTTP bandwidth
>overhead by doing this, and have more control over the
>connection.
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:e9auHD%23xKHA.5940(a)TK2MSFTNGP02.phx.gbl...
> Peter Olcott wrote:
>
>>> Now, I think you are overly concerned about this since I
>>> believe you said your transactions will be
>>> authenticated. If so, your abuse is a lot less and the
>>> only concern is compromised users, which will be
>>> pretty rare IMO. Unless you become a MAJOR ATTRACTIVE
>>> site worth attacking, just follow basic ideas and quit
>>> trying to control every aspect of your design. Get
>>> something completed first.
>>
>> Does marking the IP cost me any future bandwidth?
>
>
> No, other than the annoyance that someone is bothering
> you. If you spent time eyeballing your computer and logs
> and see:
>
> 2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:02 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:02 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:00 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:03 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:03 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
> 2010-03-19 11:38:03 CONNECT ATTEMPT: IP 1.2.3.4
> BLOCKED
>
> and that bothers you (as it does for some customers), then
> don't log it if you don't want to worry about it, or
> filter it at the network level using the Microsoft IP
> Helper Library. It's what SNORTON and all the others use.
>
> In short, you are not going to stop the abusive nature of
> the world trying to break into your system. Once you have
> a server running open to the public network, you are open
> not only to BAD GUYS but all the WEB SPIDERS out there
> bothering your system.
>
> --
> HLS

I just want to understand how to handle the most malicious
user in the most robust way. If I can block an IP address
without costing me any bandwidth, then I think I have one
significant aspect of blocking the most malicious user.


From: Joseph M. Newcomer on
See below...
On Fri, 19 Mar 2010 17:20:55 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
>news:%23s1kB76xKHA.4240(a)TK2MSFTNGP06.phx.gbl...
>> Peter Olcott wrote:
>>
>>> It looks like I am going to be using HTTP as the
>>> protocol. I just bought two books on it. I am estimating
>>> that the TCP/IP is mostly invisible at the HTTP layer. I
>>> am using the HTTP protocol because it seems that the HTML
>>> element that I am using, sends the data using this
>>> protocol.
>>> <input name="userfile" type="file" accept="image/png"
>>> />
>>>
>>> Using HTTP is it possible to reject a file that is the
>>> wrong format before the entire file is sent?
>>
>>
>> The HTTP client will send the type in the HTTP request
>> BODY block:
>>
>> Content-type: image/png
>>
>>> Using HTTP is it possible to reject a file that is too
>>> large before very much of this file is sent?
>>
>>
>> The HTTP request will have a HTTP request header:
>>
>> Content-length:
>>
>> But that length is the entire HTTP body following the
>> request header block, i.e. you can send two or more
>> uploads. You can use this length for a rough value, but
>> to get the actual image size you need to parse the HTTP
>> BODY to get the type and size.
>>
>> Note: Not all browsers will send a size in the body block,
>> like if it's only 1 image. Therefore you use the top HTTP
>> request block Content-Length: header.
>>
>>
>> --
>> HLS
>
>OK great is this before or after my bandwidth quota has been
>hit with the full data load?
****
You are obsessing again. If you want to use HTTP, you have to live with its limitations.
If you want to implement your own protocol, you can abort a partially-completed
transaction just by closing the connection when you decide that the data is invalid. This
means you have to not just "accept" the data AS IF it is valid PNG encoding, but you have
to VERIFY that it makes sense at every step.

But as long as you keep obsessing about details, you will make no progress. Either you
accept that HTTP can do the job (and that you will sometimes get an upload that isn't
valid) or you have to implement your own port# and protocol spec, and this allows you to
abort at any time, but note that SOME amount of data has already been received. Suck it
up and accept that this is reality. It happens. Every Web server in the world has to deal
with this, and most of them deal with it by ignoring the problem because they don't really
care in the slightest. If you care, you pay a price: you have to write more complex
server code and you have to write complex validation code. This is simply reality.
joe
****
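
If HTTP is accepted as the protocol, the decision to refuse an upload can at least be
made from the request header block alone, before any of the body is read. The following
is a minimal Python sketch of that check, not the behaviour of any particular web
server: the function name early_rejection and the MAX_BODY_BYTES limit are assumptions
made for illustration (the limit echoes the "something like 10K" mentioned in this
thread). How much of the body is already sitting in buffers at that point is outside
the code's control.

```python
MAX_BODY_BYTES = 10 * 1024   # assumed limit, echoing the "something like 10K" in the thread

def early_rejection(header_block, max_body=MAX_BODY_BYTES):
    """Return an HTTP error status chosen from the headers alone, or None if acceptable."""
    lines = header_block.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:              # lines[0] is the request line
        if not line:
            break                       # blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = headers.get("content-length")
    if length is None:
        return 411                      # Length Required
    if not (length.isascii() and length.isdigit()):
        return 400                      # Bad Request: malformed Content-Length
    if int(length) > max_body:
        return 413                      # declared request body is too large
    return None                         # no objection; go on to read the body
```

As noted in the quoted text, Content-Length covers the whole request body, including any
multipart framing, so it is an upper bound on the image size and not the exact figure.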
>
>In other words could a lower level protocol such as TCP/IP
>send a large portion of the file before my HTTP gets a
>chance to reject it because it is "denial of service attack"
>size?
***
Sure. Live with the idea. You aren't going to be able to change this behavior!
****
>
>Alternatively could I somehow structure my code so that I
>only get just the bytes indicating the size and reject all
>other data before eating up any more of my bandwidth quota?
****
If someone wants to screw you, they will send you bogus packets until they saturate your
server or your bandwidth quota. We call them "script kiddies" and they are a real problem,
but you can't make them go away by changing your protocol. They'll get you one way or
another.
****
>
>example I only see these bytes---->"size(1000000000)" and
>reject the file before any additional bytes hit my bandwidth
>quota.
****
If you write your own protocol handler--as I did for my multithreaded socket example--then
you can do anything you want within certain buffer boundaries; you might have 8K or 15K or
64K data already received that the shutdown() will discard. Tough. That's life. Deal
with it.
****
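
A hand-rolled protocol handler of the kind described above might read a declared length
first and refuse the transfer before reading anything else. The sketch below is Python
and is not the multithreaded socket example referred to in the thread. It assumes a
made-up wire format of ASCII decimal digits followed by a semicolon and then the payload;
the names read_declared_length and receive_upload and both limits are hypothetical.

```python
import socket

MAX_PAYLOAD = 10 * 1024      # assumed upload limit
MAX_PREFIX_DIGITS = 10       # refuse absurdly long length fields

def read_declared_length(conn):
    """Read ASCII digits up to ';' one byte at a time; return the length or None if malformed."""
    digits = b""
    while len(digits) <= MAX_PREFIX_DIGITS:
        byte = conn.recv(1)
        if byte == b";":
            return int(digits) if digits else None
        if not byte or not byte.isdigit():
            return None                 # peer closed, or a non-digit in the length field
        digits += byte
    return None                         # too many digits

def receive_upload(conn, max_payload=MAX_PAYLOAD):
    """Return the payload, or None after refusing a malformed or oversized transfer."""
    length = read_declared_length(conn)
    if length is None or length > max_payload:
        conn.shutdown(socket.SHUT_RDWR)  # refuse further traffic in both directions
        return None
    chunks, remaining = [], length
    while remaining:
        chunk = conn.recv(min(4096, remaining))
        if not chunk:
            return None                 # peer closed before sending everything
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Reading the prefix one byte at a time is slow but keeps the sketch short. Whatever the
sender pushed before the refusal has still crossed the wire, which is the point made
above about data already received.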
>
>NOTE the only valid data sent to my web application will be
>a single 24-bit PNG file that can vary in size up to a
>predetermined limit of something like 10K.
***
Then you are truly obsessing over a problem not worthy of concern. Just build it.
joe
****
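
For the "single 24-bit PNG" requirement, the first 26 bytes of the file are enough for a
first check: the 8-byte PNG signature, the IHDR chunk header, and the width, height, bit
depth and colour type fields. A minimal Python sketch follows; the function name
looks_like_24bit_png is invented, and this is only the cheap first test, not the full
validation of every detail of the format that was called mandatory earlier in the thread.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def looks_like_24bit_png(prefix):
    """Check the first 26 bytes of a file: PNG signature, IHDR chunk, 8-bit truecolour."""
    if len(prefix) < 26 or prefix[:8] != PNG_SIGNATURE:
        return False
    # Chunk header: 4-byte big-endian data length, then 4-byte chunk type.
    chunk_length, chunk_type = struct.unpack(">I4s", prefix[8:16])
    if chunk_length != 13 or chunk_type != b"IHDR":
        return False
    # IHDR data starts with width, height, bit depth, colour type.
    width, height, bit_depth, colour_type = struct.unpack(">IIBB", prefix[16:26])
    # Colour type 2 is truecolour (RGB); 8 bits per sample gives 24 bits per pixel.
    return width > 0 and height > 0 and bit_depth == 8 and colour_type == 2
```

A file that passes can still be corrupt further on, so the remaining chunks and their
CRCs need checking as the data arrives.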
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm