From: Hector Santos on
Why do you quote the entire message for just one line of input? People
can already follow the thread, if they wish. No need to quote the
entire thing and, worse, embed a one-line response in the middle of it.

Peter Olcott wrote:

>> If you have a separate service, you don't need this or
>> don't have to worry about FASTCGI. Use any web server
>> with an embedded language or CGI. I'm telling ya, you are
>> making this more complex than it is.
>
> It would be simplest to do this:
> OCR.cpp + mongoose.c = OCR_WebServer.exe


Probably, but I suspect not for you. :)

If you are looking for scalability, follow the suggestions posted.

--
HLS
From: Joseph M. Newcomer on
See below...
On Tue, 16 Mar 2010 21:34:24 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:lsd0q596g340grpj6g4a5q90m659cmjhpd(a)4ax.com...
>> See below...
>> On Tue, 16 Mar 2010 20:31:23 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>>>
>>>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in
>>>message
>>>news:un$1i$WxKHA.6140(a)TK2MSFTNGP05.phx.gbl...
>>>> Peter Olcott wrote:
>>>>
>>>>> "Hector Santos" <sant9442(a)nospam.gmail.com> wrote in
>>>>> message
>>>>
>>>>
>>>>>>> FastCGI is portable across platforms, and most often
>>>>>>> faster than alternatives.
>>>>
>>>> >>>
>>>>
>>>>>> You are trusting the WIKI's too much.
>>>>>
>>>>> Not just Wikis, I have examined at least one book from
>>>>> Amazon too.
>>>>
>>>>
>>>> So? You really don't know what you are reading Peter.
>>>>
>>>>> All that I need is a minimal cost way to keep my 4.0 GB
>>>>> of data loaded into memory. FastCGI is based on
>>>>> Sockets.
>>>>
>>>>
>>>> Independent concepts. The twain shall never meet.
>>>>
>>>> Honestly, you will be best to use FTP. Some browsers
>>>> allow a Drag and Drop with FTP backends. FTP has two
>>>> channels, DATA and COMMAND so you can handle aborts and
>>>> timeouts better.
>>>
>>>That is not simple enough for my users. I want a browse
>>>button on a webpage that browses the local hard-drive. It
>>>seems that the only completely portable way to do this is
>>>HTML and this HTML is hooked up to CGI. Maybe I could
>>>build
>>>something from scratch at the other end to handle the
>>>input.
>>>I would guess that this would not provide any improvement
>>>over C++ FastCGI, and cost more development time.
>>>I am leaning toward Linux now, so a Microsoft thing ported
>>>to Linux may not be as good as a native Linux thing.
>> ****
>> You have confused client-side (browsing a local directory)
>> with server-side (CGI and
>> related technologies). Client-side features are usually
>> implemented with JavaScript
>> embedded in HTML, or having JavaScript invoke ActiveX (a
>> deadly combination capable of
>> creating security holes large enough to drive a railroad
>> train through, sideways)
>
>In my case the only client side code that I need is an HTML
>file browse widget. Since this is connected to CGI, and
>FastCGI keeps the app resident, I thought that FastCGI
>would be a good fit.
*****
Actually, at the client side, you have NO IDEA what it is going to be connected to! The
client side treats the server side like an abstract interface, and how the server handles
the transmission is up to what you put on the server. So you are still confused. It
might be that you *know* you are using FastCGI on the server side, but that will have
little impact on what you do on the client side.
*****
>
>>
>> And you are making unrealistic assumptions that massive
>> programs will remain
>> memory-resident, which is host-OS-specific behavior.
>
>My application will be the only thing running on the
>dedicated server besides the OS. If I only use a fraction of
>the available RAM (for example 75%) then my app's code and
>data would have no good reason to be paged out.
*****
See my comment that you have no idea what the OS is doing. Windows, for example, will, as
a matter of policy, page out the pages of idle processes. The last time I looked at Unix
(admittedly, 20 years ago) it worked the same way. I have no idea what Linux does, but
you have made an assumption without checking that it is a valid assumption.
joe
****
>
>>
>> You can safely ignore ISAPI; this is a dangerous
>> technology which has been largely
>> replaced by CLR-based solutions (in ISAPI, if you crash
>> your add-in, you crash all of IIS,
>> and if you clobber memory, or leak resources, IIS goes
>> down, too). ISAPI is no longer
>> supported; for example, ISAPI projects no longer
>> build under VS2008 and later
>> versions (and maybe under VS2005; I don't use it, so I
>> never noticed when it went away,
>> but I remember it from one of the "breaking changes"
>> lists).
>>
>> For FastCGI, you need to handle connection across a
>> full-duplex pipe, which means you have
>> to write platform-specific versions of your code, that is,
>> you will need a Windows version
>> and a linux version if you are going to run your app on
>> linux, and the OS interface is
>> going to be platform-specific (it is platform-independent
>> only insofar as your host
>> language is platform-independent, e.g., PHP, PERL, Tcl,
>> etc.; for these languages, you
>> rely on the implementor of the OS-interface components of
>> the interpreter for the language
>> to handle the "platform independence" aspect; so my PERL
>> or Python or PHP code remains the
>> same, but my C or C++ server code is going to have to
>> change). Which means if you wrote it
>> in MFC, you can't have a GUI on it, and if you wrote it in
>> MFC, you are going to have to
>> use some product like mainsoft's port of MFC to
>> linux/unix platforms (www.mainsoft.com;
>> not cheap). If you wrote it in pure C++ using only the
>> standard C and C++ (STL) libraries
>> and no OS-specific components and NO GUI then you should
>> be able to port it readily.
>>
>> Note that you will have to write the platform-specific
>> pipe interface using the
>> appropriate support libraries for your platform.
>
>Hector was thinking that FastCGI might be overkill for my
>purposes, not providing enough functionality for the
>overhead and learning curve. He suggested simply processing
>the standard input stream might be a better alternative.
>What do you think about this?
****
I have no idea. I haven't done a serious Web-based app in more than 10 years, and my
knowledge of current technology is very small. But large enough to know when someone is
misusing the terminology and doesn't understand what is going on. I'd suggest, as he did,
getting a server-side expert involved in the design. The only one I knew who was
freelance is now full-time with a company and can no longer do any work on the side (I
tried to get him involved with one of my clients last year, and he wasn't available)
joe
****
>
>>
>> TANSTAAFL.
>> joe
>>
>>>
>>>>
>>>> --
>>>> HLS
>>>
>> Joseph M. Newcomer [MVP]
>> email: newcomer(a)flounder.com
>> Web: http://www.flounder.com
>> MVP Tips: http://www.flounder.com/mvp_tips.htm
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Peter Olcott on

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:fr82q55im3pagn9lq813auvtmtj73dc315(a)4ax.com...
> See below...
> On Tue, 16 Mar 2010 22:22:30 -0400, Hector Santos
> <sant9442(a)nospam.gmail.com> wrote:
>
>>
>>Peter Olcott wrote:
>>
>>>> Once again, all you need is a standard C/C++ console
>>>> application that reads standard input to read your HTTP
>>>> transfer encoded data block. You either write the
>>>> multi-part MIME processor or you get a library or class
>>>> that does it for you.
>>>
>>> What handles transmission errors?
>>
>>
>>Socket drop events/detection. You have to be ready for
>>it.
> ****
> Any On... notification of CAsyncSocket has an error code
> *****
>>
>>> Another thing, I want to
>>> immediately close the connection on the first packet if
>>> anything besides a 24-bit PNG file is detected.
>>
>>
>> shutdown(mySocketHandle,2)
>> flush receiver buffer loop
>> closesocket(mySocketHandle)
> *****
> I take exception to his "first packet" concept; it shows
> he is totally clueless about
> the TCP/IP protocol, which is *strictly* a stream protocol. I
> have no idea why he would think
> that "packet" is a relevant concept. I just answered this
> in detail.
> ****
>>
>>> My app MUST
>>> be written in C++. I was envisioning this as a single
>>> app.
>>> One that handles the URL-encoded input, and the same one
>>> to
>>> process this data. In any case the one that processes
>>> this
>>> data must remain resident.
>>
>>
>>What web server are you going to be using?
> ****
> It probably doesn't matter if he builds an appropriate
> FASTCGI interface for it. And it
> has to be in pure C++, not MFC, and it must not have a
> GUI.
> ****
>>
>>> I have no idea what you are talking about when referring
>>> to
>>> a memory map.
>>
>>
>>Well you should be aware of these things around here.
>>Other common
>>terms are Memory Map File (MMF) or File Map
>>
>>http://en.wikipedia.org/wiki/Memory-mapped_file
>>http://msdn.microsoft.com/en-us/library/aa366556(VS.85).aspx
>>http://msdn.microsoft.com/en-us/library/ms810613.aspx
>>
>> > It is not a 4.0 GB file, it is numerous files
>>
>>> making up a total of 4.0 GB.
>>
>>
>>So you open each one up as a MMF. If this is static meta
>>data, then
>>merge them into one.
> *****
> He doesn't seem to grasp the fact that he only has 2GB
> (optimistically, 1GB) of contiguous
> virtual memory into which something can be mapped.
> Probably a lot less. He mistakes a
> 32-bit address space *capability* for a 32-bit address
> space *reality*, a notion I have
> tried to explain to him many times in the past as being
> wishful thinking, even if running
> a Win32 app marked /LARGEADDRESSAWARE on Win64 where
> there really is 4GB-128K of total
> virtual address space available. I guess his code, stack,
> and heap must all take 0 bytes.
> And there is no threading, so no thread stacks are
> required.

Right now on my 8.0 GB 64-bit Windows 7 system it is
reporting that 6051 MB of physical memory is available.
When I run my /LARGEADDRESSAWARE 32-bit process on this
machine it fails when it reaches 4.0 GB.
Now that I have reduced my memory requirements by 100-fold
without any reduction in performance this will be less of an
issue.

> joe
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm


From: Hector Santos on
Peter Olcott wrote:

> http://mongoose.googlecode.com/svn/trunk/examples/authentication.c
> The answer is already complete in mongoose, also confirming
> that mongoose has cookies!


Mongoose doesn't have COOKIES. Your APPLICATION implements cookies.

It wouldn't be a web server if it didn't have a GET_HTTP_HEADER()
function. In mongoose, it's:

const char * mg_get_header(conn, header_name);

So to get the COOKIE from the HTTP block, it calls:

const char *cookie;
cookie = mg_get_header(conn, "Cookie");

In that example, the NAME and PASS are part of the form fields, retrieved by:

const char * mg_get_var(conn, field_name);

Does it support?

const char * mg_post_var(conn, field_name);

Cookies are used here to hold session state.

Of course, this doesn't work if the user has COOKIES disabled.
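
To make the point concrete that the application, not the server, implements cookies: the server only hands back the raw "Cookie" header text, and the application has to pick its own value out of it. A minimal C++17 sketch, assuming the usual "name=value; name=value" layout of that header. The function find_cookie is a hypothetical helper, not part of mongoose, and it does no quoting or decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not a mongoose function): look up one cookie by name
// in the raw value of a "Cookie" request header such as
// "sid=abc123; theme=dark".
std::optional<std::string> find_cookie(std::string_view header,
                                       std::string_view name) {
    assert(!name.empty());  // an empty cookie name is a caller bug
    std::size_t pos = 0;
    while (pos < header.size()) {
        // Each name=value pair runs up to the next ';' or the end.
        std::size_t end = header.find(';', pos);
        if (end == std::string_view::npos) end = header.size();
        std::string_view pair = header.substr(pos, end - pos);

        // Skip the whitespace that normally follows the ';' separator.
        while (!pair.empty() && (pair.front() == ' ' || pair.front() == '\t'))
            pair.remove_prefix(1);

        std::size_t eq = pair.find('=');
        if (eq != std::string_view::npos && pair.substr(0, eq) == name)
            return std::string(pair.substr(eq + 1));

        pos = end + 1;
    }
    return std::nullopt;  // no cookie with that name in this header
}
```

With mongoose, the header argument would be whatever mg_get_header(conn, "Cookie") returns, after checking it for NULL. An empty result means the session has to be treated as not established, which is also what happens when the user has cookies disabled.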

Is this PCI ready? If you are going to charge people and use credit
cards, it needs to be PCI ready.


--
HLS
From: Peter Olcott on

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:fn42q59m8n702cpis6t409le60pdog0tft(a)4ax.com...
> See below...
> On Tue, 16 Mar 2010 21:06:35 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>
>>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in
>>message
>>news:eyHxkRXxKHA.2644(a)TK2MSFTNGP04.phx.gbl...
>>> Peter Olcott wrote:
>>>
>>>>
>>>> That is not simple enough for my users. I want a browse
>>>> button on a webpage that browses the local hard-drive.
>>>
>>>
>>> <INPUT type="File" name="filename" />
>>>
>>>> It seems that the only completely portable way to do
>>>> this
>>>> is HTML and this HTML is hooked up to CGI. Maybe I
>>>> could
>>>> build something from scratch at the other end to handle
>>>> the input. I would guess that this would not provide
>>>> any
>>>> improvement over C++ FastCGI, and cost more development
>>>> time.
>>>> I am leaning toward Linux now, so a Microsoft thing
>>>> ported to Linux may not be as good as a native Linux
>>>> thing.
>>>
>>> Geez, you ain't going to get nothing done with this lost
>>> focus on "FastCGI".
>>>
>>> Once again, all you need is a standard C/C++ console
>>> application that reads standard input to read your HTTP
>>> transfer encoded data block. You either write the
>>> multi-part MIME processor or you get a library or class
>>> that does it for you.
>>
>>What handles transmission errors? Another thing, I want to
>>immediately close the connection on the first packet if
>>anything besides a 24-bit PNG file is detected. My app
>>MUST
>>be written in C++. I was envisioning this as a single app.
>>One that handles the URL-encoded input, and the same one
>>to
>>process this data. In any case the one that processes this
>>data must remain resident.
> ****
> You clearly missed any comprehension of TCP/IP. TCP/IP
> is as reliable as a piece of wire
> between the two sites, and that essentially means it is
> 100% utterly reliable, or it has
> failed completely and there is no recovery (read about how
> TCP/IP retransmission works!)
> so there are *no* "transmission errors" at all, or you
> will get a notification that the
> protocol has failed utterly and there is NO recovery from
> that, so browsers deal with it!
> I have no idea what you are talking about when you talk
> about "clos[ing] the connection on
> the first packet" since, among other things, packets are
> invisible at the application
> level and you NEVER see them, you may or may not get all
> the bits you want in one Receive

OK, so all this is new to me. We were talking about sockets,
which do deal with packets.

> (it is a STREAM transmission, and therefore you simply
> CANNOT tell where packet boundaries
> are or anything else, and MUST assume that it will take 2
> or more Receives to get what you
> need, and you have not identified what you mean by a
> "24-bit PNG file is detected". You

It would be located in the first 1K bytes of the file
header. If I was dealing with packets, then I could examine
the first one and potentially reject the rest.
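
The same early rejection can be done on the byte stream without any notion of packets: look at however many bytes have arrived so far and answer "reject", "accept", or "not enough yet". A minimal sketch, assuming that "24-bit PNG" means truecolor (color type 2) at 8 bits per channel, which the PNG format stores in the IHDR chunk right after the 8-byte signature. The names PngCheck and check_24bit_png are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum class PngCheck { NeedMoreData, Accept, Reject };

// Bytes needed: 8 (signature) + 4 (chunk length) + 4 ("IHDR")
//             + 4 (width) + 4 (height) + 1 (bit depth) + 1 (color type).
constexpr std::size_t kPngHeaderBytes = 26;

// Inspect the first `size` bytes received so far.
PngCheck check_24bit_png(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    static const std::uint8_t kSignature[8] = {0x89, 'P',  'N',  'G',
                                               0x0D, 0x0A, 0x1A, 0x0A};

    // Reject as soon as any signature byte that has arrived is wrong.
    const std::size_t n = size < 8 ? size : 8;
    if (n > 0 && std::memcmp(data, kSignature, n) != 0)
        return PngCheck::Reject;

    // The stream may deliver the header in several pieces.
    if (size < kPngHeaderBytes) return PngCheck::NeedMoreData;

    // The first chunk must be IHDR; its type field starts at offset 12.
    if (std::memcmp(data + 12, "IHDR", 4) != 0) return PngCheck::Reject;

    // Offset 24 is the bit depth, offset 25 the color type.
    const bool truecolor8 = data[24] == 8 && data[25] == 2;
    return truecolor8 ? PngCheck::Accept : PngCheck::Reject;
}
```

Passing this check only says the header claims to be a 24-bit PNG. The rest of the upload can still be garbage, so the decoder has to validate the whole file anyway.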

> must be living in an alternate plane of existence where
> there is some magic that
> determines that because some magical bits appear at the
> front of a byte sequence that the
> rest of the sequence must necessarily conform to that
> purported reality, and that in spite
> of the fact that TCP/IP *guarantees* delivery of bits
> intact that you might accidentally
> see the wrong bits after the client makes a transmission
> (this will never happen; TCP/IP
> doesn't even allow for it!) which goes back to the advice
> I gave weeks ago, about reading
> an introductory text on how TCP/IP actually works. There
> will be no "transmission errors"

This might now be moot because it looks like I will be
working at the higher level of HTTP, thus it seems that I
will not be directly dealing with the TCP/IP. Hector has
got me refocused on the webserver point of view. That seems
fruitful. There is a very good freeware webserver that I can
embed in my OCR processor, called mongoose.

> that garbage the message; either the message is received
> correctly or a fake message has
> been sent, and if you have a fake message, you close the
> connection and that's the end of
> it. You go back and wait for the next connection. If you
> were using PHP and mime, you
> would reject the data if it were not the correct mime
> image, or, having read it, the
> conversion to an image encountered a problem. It would
> not be a transmission error; it
> would be the result of erroneous data having been sent,
> because you will receive
> everything that was sent, perfectly, or you will be
> notified that there has been an
> unrecoverable network error (as will the transmission
> side). This is inherent in the
> TCP/IP protocol.

OK good. So the low-level details of this now seem out of
the current scope of the project.

>
> So forget the concept of "first packet" or even "packets".
> They are nonsense at the
> TCP/IP application level. There is one, and only one,
> reality: the byte sequence. If you
> continue to believe in inappropriate (and erroneous)
> concepts like "packets" when using
> TCP/IP, you are DOOMED! They only have a reality at the
> lowest levels of the protocol,
> and at the application level, they have been erased
> entirely from your awareness. People
> who believe in anything other than the byte stream
> eventually have problems.

OK, great. Good to have that cleared up.
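
What "only the byte sequence" means in code is a loop that keeps receiving until the wanted number of bytes has arrived, however the stream happens to chop them up. A minimal sketch, assuming a receive callback that behaves like recv(): it returns the number of bytes copied, or 0 when the peer has closed. The names ReceiveFn, read_exact, and make_chunked_source are hypothetical, and the last one is only a stand-in for a socket so the loop can be tried without a network.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Stand-in for recv(): copies up to `max` bytes into `buf` and returns the
// count, or 0 once the peer has closed the connection.
using ReceiveFn = std::function<std::size_t(char* buf, std::size_t max)>;

// Keep receiving until exactly `wanted` bytes have arrived.
// Returns false if the stream ends before that.
bool read_exact(const ReceiveFn& receive, std::size_t wanted,
                std::string& out) {
    out.clear();
    char chunk[512];
    while (out.size() < wanted) {
        const std::size_t room = std::min(sizeof chunk, wanted - out.size());
        const std::size_t got = receive(chunk, room);
        if (got == 0) return false;  // connection closed early
        assert(got <= room);         // contract of the receive callback
        out.append(chunk, got);
    }
    return true;
}

// Simulated stream: hands out `data` at most `step` bytes per call, the way
// a socket may return fewer bytes than were asked for.
ReceiveFn make_chunked_source(std::string data, std::size_t step) {
    std::size_t pos = 0;
    return [data = std::move(data), step, pos](char* buf,
                                               std::size_t max) mutable {
        const std::size_t n = std::min({step, max, data.size() - pos});
        std::copy_n(data.data() + pos, n, buf);
        pos += n;
        return n;
    };
}
```

Reading 4 bytes from a source that delivers 3 at a time takes two calls to the callback, and the caller never needs to know that.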

>
> And please abandon your concepts of "resident", since you
> have demonstrated beyond any
> possible doubt that you simply do not understand how
> operating systems work. You have
> these strange wishful-thinking models that have no
> realization in any form of reality most
> of us are familiar with, and you assume that operating
> systems work in accordance to your
> dreams and not the way they are actually implemented.

I think that the 1000-fold faster than OCR performance that
I achieve contradicts you on this. I can process an entire
display screen in 1/10 second on a machine 800% slower than
my current machine. This could not be achieved if my DFA was
not at least almost completely contained within actual
hardware memory. Also the 800% speed up is almost entirely
attributable to faster access to physical RAM.

> ****
>>
>>>
>>> If you write a PHP script, it will do it for you. You
>>> can
>>> put this under a FASTCGI PHP setup under a certain port
>>> and then write an HTTP client to do your testing.
>>>
>>> However, I don't believe PHP naturally supports memory
>>> maps unless you find a public PHP class to expose the
>>> WIN32 memory map functions.
>>>
>>> The WIN32 API has all the memory mapping functions and
>>> you
>>> will need this to handle the 4GB file.
>>>
>>> Now, on the wienie side, I don't know what it offers for
>>> memory mapping. But I am sure it has something
>>> equivalent.
>>>
>>> --
>>> HLS
>>
>>I have no idea what you are talking about when referring
>>to
>>a memory map. It is not a 4.0 GB file, it is numerous
>>files
>>making up a total of 4.0 GB.
> ****
> Note that you cannot read a 4GB file into any Windows app,
> since you only have 2GB (or
> perhaps 3GB) of address space to use, and not all of that
> is available (your code, your
> heap, and your stack, or each thread's stack, occupy the
> same address space). Therefore,
> you will HAVE to use a memory-mapped file if you wish to
> access more data than you can
> read. He is only pointing out the only possible
> implementation that will let you access
> 4GB of data efficiently. If you think you can access 4GB
> of data, you can't even do that
> in a Win32 app running in Win64 (which has a 4GB address
> space, but not all of it can be
> used for your data! See previous comment). If you can
> currently read it all in, then you
> don't have 4GB.
> joe

I have seen Windows report that my 32-bit app was using >
4000 MB of RAM.
That was with 8.0 GB of RAM, 64-bit Windows 7, and /LARGEADDRESSAWARE.
In this instance 4.0 GB is supposed to be the limit.
http://msdn.microsoft.com/en-us/library/aa366778(VS.85).aspx#memory_limits
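
On the Linux side, the counterpart of the Win32 file-mapping functions is POSIX mmap(). A minimal sketch of a read-only mapping of one whole data file, assuming a POSIX system and a process with enough contiguous address space for the file (in practice a 64-bit build for files of this size). The class name MappedFile is hypothetical.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Read-only mapping of one whole file. The OS pages the contents in on
// demand, so nothing is copied into the heap up front.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);

        struct stat st;
        if (::fstat(fd, &st) != 0 || st.st_size <= 0) {
            ::close(fd);
            throw std::runtime_error("fstat failed or empty file: " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);

        void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor closes
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<const unsigned char*>(p);
    }

    ~MappedFile() { ::munmap(const_cast<unsigned char*>(data_), size_); }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    std::size_t size() const { return size_; }

    // Bounds-checked (in debug builds) access to one byte of the file.
    unsigned char at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    const unsigned char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

A mapping does not guarantee residency. The pages are still subject to the OS paging policy discussed above, and a 32-bit process would have to map smaller windows of the data rather than all of it at once.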



>
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm