From: Peter Olcott on

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:7ekvp5dimsgbnajou6jnqsn7l0imefpjlj(a)4ax.com...
> See below...
> On Tue, 16 Mar 2010 08:25:31 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>
>>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>>message news:uoftp55ehplnk6c7h9869qdfvb8o0fq3jj(a)4ax.com...
>>> See below...
>>> On Sat, 13 Mar 2010 23:02:30 -0600, "Peter Olcott"
>>> <NoSpam(a)OCR4Screen.com> wrote:
>>>
>>>>
>>>>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>>>>message
>>>>news:vbnop5dhutg9604v8dgsdpqge5ija6d20f(a)4ax.com...
>>>>> See below...
>>>>> On Sat, 13 Mar 2010 16:35:46 -0800, Geoff
>>>>> <geoff(a)invalid.invalid> wrote:
>>>>>
>>>>>>On Sat, 13 Mar 2010 17:58:02 -0600, "Peter Olcott"
>>>>>><NoSpam(a)OCR4Screen.com> wrote:
>>>>>>
>>>>>>>I want to make a web service, and I don't need any
>>>>>>>complex
>>>>>>>protocol such as SOAP. I only need to take a stream
>>>>>>>of
>>>>>>>bytes
>>>>>>>representing a PNG image file as input, and provide
>>>>>>>UTF-8
>>>>>>>text as output.
>>>>>>>
>>>>>>>I might have a bunch of small nearly simultaneous
>>>>>>>requests.
>>>>>>>These can be processed in their order of arrival. A
>>>>>>>few
>>>>>>>of
>>>>>>>these requests may be multi-megabytes. Is socket
>>>>>>>programming
>>>>>>>a good way to go on this?
>>>>>>>
>>>>>>
>>>>>>This sounds like more questions about your OCR
>>>>>>appliance.
>>>>>>
>>>>>>If the clients accessing your service are using the
>>>>>>Internet protocol
>>>>>>to access your server you do not need to ask this
>>>>>>question. You have
>>>>>>no choice.
>>>>>>
>>>>>>If you were coding on the Linux platform you would not
>>>>>>need to ask
>>>>>>this question, it would be automatically assumed that
>>>>>>you
>>>>>>are going to
>>>>>>use a sockets interface for your service. It would
>>>>>>also
>>>>>>take about 30
>>>>>>minutes to code and debug a sockets interface and fork
>>>>>>logic to pass
>>>>>>this kind of data in and out of your OCR process.
>>>>> ****
>>>>> Actually, on the server side it is faster than this,
>>>>> because it is just a cgi gateway
>>>>> script; the data is sent up uuencoded or some other
>>>>> suitable encoding, and passed as data
>>>>> to the CGI-invoked program. Of course, you have to
>>>>> code
>>>>> up the CGI code, and that can
>>>>> take a while, but that's going to be fixed overhead no
>>>>> matter what means is used to encode
>>>>> the client data. But an HTTP POST with suitable
>>>>> encoding
>>>>> on the client side is all that
>>>>> is required, plus waiting for the HTTP response, both
>>>>> of
>>>>> which are (a) easy and (b) the
>>>>> same on Windows and Linux.
>>>>> ****
>>>>
>>>>I want whatever code that handles servicing the clients
>>>>to
>>>>be always memory resident. I didn't think that CGI
>>>>worked
>>>>this way. I also thought that you need a separate CGI
>>>>instance for each client, I can't have that.
>>> ****
>>> Your assumption that this is possible is without
>>> foundation. While some servers try to
>>> simulate this, they do not guarantee it. What you want
>>> and
>>> what you are going to get seem
>>
>>If my OCR code is not always memory resident then I can
>>not
>>meet my performance requirements. If I am required to load
>>a
>>separate instance of my OCR code to service every request,
>>then the system would utterly fail scalability. I can
>>choose
>>the server and OS that this will run on.
> ****
> And you manage this how? I sincerely hope you are not
> using VirtualLock because that has
> an overall serious negative impact on total system
> performance. So your need for a
> dedicated server is an essential component here. And the
> way you would normally handle
> this is as a system service which you communicate with.
> Whether or not the means of
> communicating introduces delays is up to the Web hosting
> software; IIS and Apache have
> different approaches to this problem, and I am an expert
> in neither (once, five years ago,
> I configured an Apache server on Windows, but it was a
> long, painful process and I've
> thankfully forgotten most of it)
> joe

You missed an important comment that I made below.
Apparently FastCGI will do what I need. Unless I hear any
objections I think that this is the way that I am going to
go.

>
>>
>>> to be different. If you have unrealistic expectations,
>>> they cannot be met (note that this
>>> is what I was trying to tell you when you were insisting
>>> 500ms was mandatory...you are at
>>> the mercy of a huge number of variables, none of which
>>> are
>>> under your control and none of
>>> which are going to change just because of what you
>>> want.)
>>> joe
>>
>> http://en.wikipedia.org/wiki/Common_Gateway_Interface
>>Are you saying that CGI is the ONLY underlying mechanism
>>by
>>which remote webservices are invoked? The article above
>>mentions FastCGI; surely I can find some server somewhere
>>that permits this.
>>
>>> *****
>>>>
>>>>>>
>>>>>>Outside of your program, one could program a Perl cgi
>>>>>>script behind an
>>>>>>Apache based web server to copy the received data
>>>>>>streams
>>>>>>to files
>>>>>>with associated cookies or "handles" to the socket
>>>>>>sessions hosted by
>>>>>>the web server but you would have to process the files
>>>>>>and
>>>>>>respond to
>>>>>>them in such a manner that you could guarantee a reply
>>>>>>before the web
>>>>>>session expired. The script would handle the socket
>>>>>>sessions while
>>>>>>your process merely dealt with file I/O (pipes?) to
>>>>>>the
>>>>>>scripts.
>>>>> ****
>>>>> Remember his original question about 500ms? This is
>>>>> part
>>>>> of the overhead that is "beyond
>>>>> the control of the program" that I kept referring to.
>>>>> In
>>>>> fact, the cgi invocation
>>>>> overhead is one of the performance bottlenecks of most
>>>>> Web
>>>>> servers,
>>>>
>>>>So I will have to use something else.
>>> ****
>>> You can do whatever you want; what you get is not under
>>> your control.
>>> joe
>>>
>>> ****
>>>>
>>>>> and Apache is probably
>>>>> the most efficient platform around for handling this
>>>>> (I
>>>>> haven't followed the latest round
>>>>> of tricks, only to note that it now has improved this
>>>>> time
>>>>> considerably)
>>>>> ****
>>>>>>
>>>>>>Even doing this in Windows with or without MFC, just
>>>>>>writing a simple
>>>>>>server, one could write a socket server that passed
>>>>>>files
>>>>>>to/from your
>>>>>>OCR program in a few hours.
>>>>>>
>>>>>>I hope you also realize that your OCR appliance could
>>>>>>probably very
>>>>>>easily be perverted to defeat web based captcha codes
>>>>>>on
>>>>>>a
>>>>>>vast scale.
>>>>> ****
>>>>> I didn't want to mention this...but it seems pretty
>>>>> obvious. But we've always known that
>>>>> captcha codes were going to be short-lived...
>>>>>
>>>>> joe
>>>>> ****
>>>>> Joseph M. Newcomer [MVP]
>>>>> email: newcomer(a)flounder.com
>>>>> Web: http://www.flounder.com
>>>>> MVP Tips: http://www.flounder.com/mvp_tips.htm
>>>>


From: Hector Santos on
Peter Olcott wrote:

> You missed an important comment that I made below.
> Apparently FastCGI will do what I need. Unless I hear any
> objections I think that this is the way that I am going to
> go.


FastCGI for PHP, as most LAYMEN would know it today, was developed by
one of our ex-engineers before he got an "offer he couldn't
refuse."

Today, FastCGI implies PHP programming, which is not very fast itself
(compared to native code). FastCGI is simply a prestarted port with
the huge PHP DLL loaded, which speeds up processing by removing the
delays in non-cached loading of all the necessary DLLs for PHP
scripts. If you expect a high rate of transactions, the normal
PHP script mapping would suffice because everything (DLLs) will be
cached.

If you are thinking FastCGI, you might as well write your own native
server in C/C++ if you want speed and performance. Otherwise, you
will be learning PHP (which is not very fast compared to native)
or you will have to incorporate some other web server with its
external processing.

Once again, you need to work out your "Peter's OCR Client/Server
Protocol" (the rules, the state machine); that will then help define
which tools will make this easier.

Honestly, this is C/S 101. It's a simple file transfer protocol with a
response.

You might as well look at FTP as the simplest way; its protocol
methodology covers everything there is to know, including error
handling, timeouts, restart, authentication, and extensions with QUOTE
commands.

If you want to get going, then purchase our product
(http://www.winserver.com) and it will give you everything to get
started, including a customer account subscription/profile system,
plus any way you wish to develop your "CGI" or native code in
client/server fashion.

But you don't have to. Just trying to explain that this is all TOO
common. There is nothing special here. You can roll your own, but you
will come across all sorts of communications issues above and
beyond the simple file transfer aspect. Beyond the basic framework,
the only special part is the application itself: "Peter's OCR
Client/Server Protocol."
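
As an illustration of what one such protocol rule could look like, here
is a minimal sketch of message framing for a raw socket exchange. It
assumes a 4-byte big-endian length prefix in front of each payload (the
PNG bytes going up, the UTF-8 text coming back). That framing and the
frame_* names are assumptions made for the example; nothing in this
thread specifies them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FRAME_HEADER_LEN = 4 };  /* assumed: 4-byte big-endian payload length */

/* Store payload_len in hdr, most significant byte first. */
void frame_write_header(uint32_t payload_len, unsigned char hdr[FRAME_HEADER_LEN])
{
    hdr[0] = (unsigned char)(payload_len >> 24);
    hdr[1] = (unsigned char)(payload_len >> 16);
    hdr[2] = (unsigned char)(payload_len >> 8);
    hdr[3] = (unsigned char)(payload_len);
}

/* Recover the payload length from a received header. */
uint32_t frame_read_header(const unsigned char hdr[FRAME_HEADER_LEN])
{
    return ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
           ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
}

/* Write header + payload into out.
   Returns the total number of bytes written, or 0 if out is too small. */
size_t frame_encode(const unsigned char *payload, uint32_t payload_len,
                    unsigned char *out, size_t out_cap)
{
    if (out_cap < FRAME_HEADER_LEN || out_cap - FRAME_HEADER_LEN < payload_len)
        return 0;
    frame_write_header(payload_len, out);
    if (payload_len > 0)
        memcpy(out + FRAME_HEADER_LEN, payload, payload_len);
    return (size_t)FRAME_HEADER_LEN + payload_len;
}
```

With a rule like this the receiver knows how many bytes to wait for,
which matters once a request is multi-megabyte and arrives in pieces.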

--
HLS


From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:%23EX4oeUxKHA.5936(a)TK2MSFTNGP04.phx.gbl...
> Peter Olcott wrote:
>
>> You missed an important comment that I made below.
>> Apparently FastCGI will do what I need. Unless I hear
>> any objections I think that this is the way that I am
>> going to go.
>
>
> FastCGI for PHP, as most LAYMEN would know it today, was
> developed by one of our ex-engineers before he got an
> "offer he couldn't refuse."
>
> Today, FastCGI implies PHP programming, which is not very
> fast itself (compared to native code). FastCGI is simply
> a prestarted port with the huge PHP DLL loaded, which
> speeds up processing by removing the delays in non-cached
> loading of all the necessary DLLs for PHP scripts. If
> you expect a high rate of transactions, the normal
> PHP script mapping would suffice because everything
> (DLLs) will be cached.
>
> If you are thinking FastCGI, you might as well write your
> own native server in C/C++ if you want speed and
> performance. Otherwise, you will be learning PHP
> (which is not very fast compared to native) or you will
> have to incorporate some other web server with its
> external processing.
>
> Once again, you need to work out your "Peter's OCR
> Client/Server Protocol" (the rules, the state machine);
> that will then help define which tools will make this
> easier.
>
> Honestly, this is C/S 101. It's a simple file transfer
> protocol with a response.
>
> You might as well look at FTP as the simplest way; its
> protocol methodology covers everything there is to
> know, including error handling, timeouts, restart,
> authentication, and extensions with QUOTE commands.
>
> If you want to get going, then purchase our product
> (http://www.winserver.com) and it will give you everything
> to get started, including a customer account
> subscription/profile system, plus any way you wish to
> develop your "CGI" or native code in client/server
> fashion.
>
> But you don't have to. Just trying to explain that this is
> all TOO common. There is nothing special here. You can
> roll your own, but you will come across all sorts of
> communications issues above and beyond the simple file
> transfer aspect. Beyond the basic framework, the only
> special part is the application itself: "Peter's OCR
> Client/Server Protocol."
>
> --
> HLS

I would be doing FastCGI in C++, if I do it. I am
considering FastCGI because I need a completely portable way
for a user to browse for a PNG file to send to my server.
HTML has this built in, and it is connected to CGI. As far
as a protocol goes, this would be trivial. The user sends a
24-bit PNG file, the server sends back UTF-8. That's the
whole protocol. Sometimes the UTF-8 is the PNG file
translated into text, and other times it is an error
response.
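
A minimal sketch of the input check this protocol implies, assuming
"24-bit PNG" means 8-bit-per-channel truecolor (bit depth 8, color type
2 in the IHDR chunk). The function name is_24bit_png is hypothetical,
and the check only inspects the file header, not the image data that
follows it:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with the PNG signature followed by an IHDR
   chunk that declares 8-bit truecolor (24 bits per pixel), else 0. */
int is_24bit_png(const unsigned char *buf, size_t len)
{
    static const unsigned char sig[8] =
        { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };

    /* 8 signature + 4 chunk length + 4 chunk type + 13 IHDR data bytes */
    if (buf == NULL || len < 29)
        return 0;
    if (memcmp(buf, sig, sizeof sig) != 0)
        return 0;
    if (memcmp(buf + 12, "IHDR", 4) != 0)
        return 0;

    /* IHDR data starts at offset 16:
       width (4), height (4), bit depth (1), color type (1), ... */
    unsigned char bit_depth  = buf[24];
    unsigned char color_type = buf[25];
    return bit_depth == 8 && color_type == 2;
}
```

A request that fails this check would get the UTF-8 error response
rather than OCR output.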


From: Hector Santos on
Peter Olcott wrote:

>
> I would be doing FastCGI in C++, if I do it. I am
> considering FastCGI because I need a completely portable way
> for a user to browse for a PNG file to send to my server.
> HTML has this built in, and it is connected to CGI. As far
> as a protocol goes, this would be trivial. The user sends a
> 24-bit PNG file, the server sends back UTF-8. That's the
> whole protocol. Sometimes the UTF-8 is the PNG file
> translated into text, and other times it is an error
> response.

It doesn't sound like you understand what FastCGI is, or at least
you aren't showing that you do.

Forget FastCGI; what you need is a dedicated HTTP server with the
built-in processing, or an HTTP server that supports spawning your CGI
EXE application as a special URL.

You will handle standard INPUT as the encoded data. Simple.

Example:

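#include <stdio.h>

int DoCGI(void);

int main(int argc, char *argv[])
{
    (void)argc;   // command-line arguments are not used by this CGI
    (void)argv;
    return DoCGI();
}

where DoCGI() reads stdin and maybe environment strings.

int DoCGI(void)
{
    // get environment strings, if required
    // read stdin as the form-posted, encoded data
    // process, generate response
    const char *szResponse = "";   // placeholder for the generated text

    printf("Status: 200 OK\n");
    printf("Content-Type: text/plain; charset=utf-8\n\n");
    printf("UTF8:%s\n", szResponse);
    return 0;
}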

For FastCGI, you would prepare your own socket listening server for
incoming requests from the web server itself.
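
The "read stdin" step above can be sketched roughly as follows. This
assumes the web server passes the body size in the CONTENT_LENGTH
environment variable, as the CGI convention describes. The function
name read_request_body and the max_len cap are hypothetical, and
decoding the form encoding is left out:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Read exactly the number of bytes announced by content_length from in.
   Returns a malloc'ed buffer (caller frees) and stores its size in
   *out_len, or returns NULL on a missing/invalid length, a length above
   max_len, an allocation failure, or a short read. */
unsigned char *read_request_body(FILE *in, const char *content_length,
                                 size_t max_len, size_t *out_len)
{
    if (in == NULL || content_length == NULL || out_len == NULL)
        return NULL;

    char *end = NULL;
    errno = 0;
    unsigned long n = strtoul(content_length, &end, 10);
    if (errno != 0 || end == content_length || *end != '\0' ||
        n == 0 || n > max_len)
        return NULL;

    unsigned char *buf = (unsigned char *)malloc(n);
    if (buf == NULL)
        return NULL;

    size_t got = 0;
    while (got < n) {                       /* fread may return partial data */
        size_t r = fread(buf + got, 1, n - got, in);
        if (r == 0) {                       /* EOF or error before n bytes */
            free(buf);
            return NULL;
        }
        got += r;
    }
    *out_len = got;
    return buf;
}
```

Inside DoCGI() this would be called along the lines of
read_request_body(stdin, getenv("CONTENT_LENGTH"), some_limit, &len).
On Windows, stdin would also need to be switched to binary mode first,
otherwise the image bytes can be altered while reading.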

--
HLS


From: Geoff on
On Tue, 16 Mar 2010 17:40:54 -0400, Hector Santos
<sant9442(a)nospam.gmail.com> wrote:

>Peter Olcott wrote:
>
>>
>> I would be doing FastCGI in C++, if I do it. I am
>> considering FastCGI because I need a completely portable way
>> for a user to browse for a PNG file to send to my server.
>> HTML has this built in, and it is connected to CGI. As far
>> as a protocol goes, this would be trivial. The user sends a
>> 24-bit PNG file, the server sends back UTF-8. That's the
>> whole protocol. Sometimes the UTF-8 is the PNG file
>> translated into text, and other times it is an error
>> response.
>
>It doesn't sound like you understand what FastCGI is, or at least
>you aren't showing that you do.
>
>Forget FastCGI; what you need is a dedicated HTTP server with the
>built-in processing, or an HTTP server that supports spawning your CGI
>EXE application as a special URL.
>
>You will handle standard INPUT as the encoded data. Simple.
>
>Example:
>
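>#include <stdio.h>
>
>int DoCGI(void);
>
>int main(int argc, char *argv[])
>{
>    (void)argc;   // command-line arguments are not used by this CGI
>    (void)argv;
>    return DoCGI();
>}
>
>where DoCGI() reads stdin and maybe environment strings.
>
>int DoCGI(void)
>{
>    // get environment strings, if required
>    // read stdin as the form-posted, encoded data
>    // process, generate response
>    const char *szResponse = "";   // placeholder for the generated text
>
>    printf("Status: 200 OK\n");
>    printf("Content-Type: text/plain; charset=utf-8\n\n");
>    printf("UTF8:%s\n", szResponse);
>    return 0;
>}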
>
>For FastCGI, you would prepare your own socket listening server for
>incoming requests from the web server itself.

If he's going to write it in C++ as an application, wouldn't it be
better to write it using the ISAPI on an IIS server and make his OCR
application a DLL callable from the server extension? It seems that
this would solve the latency problem as well as the scalability
problem since it would be a single server extension instance with
multiple threads. Microsoft would claim this would be superior to CGI.
This might tend to limit it to the Windows platform. I don't know if
there is an ISAPI implementation on Linux.