From: Hector Santos on
Peter O.

You have changed your design every time you discovered a new "technology" or method.

Look, sometimes the best ideas are the first ones you come up with.

All you need here is to make your 1-request OCR processor CGI ready.
If, as you say, it can work at 100 ms per request, then at worst you
will have 10 CGI processes spawned, which on your Windows 7 quad-core
8GB machine would probably run very smoothly.
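
Roughly speaking, the CGI wrapper is nothing more than the sketch
below. This is only an illustration: ocr_one_request() stands in for
your existing single-request OCR code (stubbed out here so the thing
compiles), and the response is assumed to be plain text.

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>

// Stub standing in for the existing single-request OCR code.
std::string ocr_one_request(const std::string& input)
{
    return "OCR result for " + std::to_string(input.size()) + " input bytes\n";
}

int main()
{
    // CGI hands the POST body to the process on stdin; its length is
    // in the CONTENT_LENGTH environment variable.
    const char* len = std::getenv("CONTENT_LENGTH");
    std::string body(len ? std::atoi(len) : 0, '\0');
    if (!body.empty())
        std::fread(&body[0], 1, body.size(), stdin);

    std::string result = ocr_one_request(body);

    // A CGI response is just headers, a blank line, then the payload.
    std::cout << "Content-Type: text/plain\r\n\r\n" << result;
    return 0;
}

The web server spawns one of these per request; the point of the
shared mapping below is that each instance does not reload the 5 GB
of data.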

Remember, this whole thread started with the 5 GB memory load
requirement, and the solution proposed was to use a shared memory
concept. That is all you need here.

Once you make that a shared read-only memory mapping, which is a
piece of cake, you are DONE!!!!
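
Here is a minimal sketch of that read-only mapping on Win32. The file
name "ocr.dat" and the mapping name are made up, and error handling
is trimmed to the bare minimum; a full 5 GB view of course needs the
64-bit build mentioned below.

#include <windows.h>
#include <cstdio>

int main()
{
    // Open the (hypothetical) data file read-only, with shared access.
    HANDLE hFile = CreateFileA("ocr.dat", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    // PAGE_READONLY mapping: every process that maps this object shares
    // the same physical pages, so the data is loaded once, not per CGI.
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0,
                                     "Local\\OcrDataMap");
    if (hMap == NULL) return 1;

    const unsigned char* data =
        (const unsigned char*)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    if (data == NULL) return 1;

    // ... hand 'data' to the OCR code as if it were an in-memory array ...
    std::printf("first byte: %u\n", data[0]);

    UnmapViewOfFile(data);
    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}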

Now you can use ANY web server that supports CGI, or use a web server
with a CMS in it so you can manage user accounts, etc.

And if that is not good enough, go to the 2nd idea you had: make it
FastCGI, where you now have a worker pool. I said it wasn't necessary
once you had a shared memory file, but that's OK. Do that if that is
what you need.

If that is not good enough, recompile for 64 bit.

And if the hardware is not good enough, get Intel Xeon machines to
help with the "Far Calls" to memory per core.

--

Peter Olcott wrote:

>> Peter D wrote:
>>

>> *Which* design???? You have proposed so many designs and
>> tossed around so many things that I wonder if even *you*
>> know what your design is!
>>
>> -Pete
>>
>
> I have not changed this aspect of the design since it was
> initially proposed.
> Here is the current design:
>
> (1) A transaction log file forms the FIFO queue between
> multiple threads (one per HTTP request) of the web server
> and at least one OCR process.
>
> (2) Some form of IPC (probably Unix/Linux named pipes)
> informs the OCR process of an HTTP request that needs to be
> serviced; it sends the offset within the transaction file.
>
> (3) The OCR process uses the offset within the transaction
> file to get the transaction details, and updates the
> transaction flag from [Available] to [Pending].
>
> (4) When the OCR process is done with processing, it informs
> the web server, in another FIFO using IPC (such as a
> Unix/Linux named pipe), by passing the offset within the
> transaction file.
>
> (5) The web server reads the Thread-ID from this offset of
> the transaction file and informs the thread so that this
> thread can provide the HTTP response.
>
> I was originally having the OCR process update the
> transaction flag from [Pending] to [Complete], but it might
> make more sense for the thread that receives the HTTP
> acknowledgement of the HTTP response to do this.
>
>
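
Just so we are talking about the same thing, steps (2) and (4) of
that design reduce to something like the sketch below on Linux. This
is only an illustration: the /tmp paths are made up and error
handling is stripped.

// Web-server side of step (2): hand the OCR process the byte offset
// of a new transaction record through a named pipe (FIFO).
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdint>

int main()
{
    const char* req_fifo = "/tmp/ocr_requests";   // made-up path
    mkfifo(req_fifo, 0600);                       // no-op if it already exists

    // open() blocks until the OCR process has the FIFO open for reading.
    int fd = open(req_fifo, O_WRONLY);
    if (fd < 0) return 1;

    uint64_t offset = 4096;              // offset of the transaction record
    write(fd, &offset, sizeof offset);   // the whole "message"
    close(fd);

    // Step (4) is the mirror image: the OCR process writes the same
    // offset into a second FIFO (e.g. /tmp/ocr_responses) that the web
    // server reads.
    return 0;
}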



--
HLS
From: Joseph M. Newcomer on
On Wed, 7 Apr 2010 10:07:02 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:isnnr55eirshsintd983glpf4o76bnkm4r(a)4ax.com...
>> See below...
>> On Tue, 6 Apr 2010 16:59:13 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>>>
>>>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>>>message news:2atmr51ml9kn4bb5l5j77h3lpiqtnlq8m3(a)4ax.com...
>>>> See below...
>>>> On Mon, 5 Apr 2010 21:32:44 -0500, "Peter Olcott"
>>>> <NoSpam(a)OCR4Screen.com> wrote:
>>>>
>>>>>Ah but, then you are ignoring the proposed aspect of my
>>>>>design that would handle all those things. What did you
>>>>>call
>>>>>it "mirrored transactions". I called it on-the-fly
>>>>>transaction-by-transaction offsite backup.
>>>> ****
>>>> If you have a "proposed aspect" I presume you have
>>>> examined the budget numbers for actual
>>>> dollars required to achieve this, and the complexity of
>>>> making sure it works right.
>>>>
>>>> I am not ignoring the issue, I'm asking if you have
>>>> ignored the realities involved in
>>>> achieving it!
>>>
>>>I would simply re-implement some of the aspects of my web
>>>application such that there is another web application on
>>>another server that the first server can send its
>>>transactions to.
>> ****
>> Ohh, the Magical Mechanism solution! Of course, this adds
>> time, complexity, and cost, but
>> what do they matter? Maybe you could talk to your ISP
>> about "load balancing" among
>> multiple servers? They've already got this working! At
>> least most ISPs that plan to
>> survive have it working already.
>> *****
>>>
>>>>>I don't want to ever lose any data pertaining to
>>>>>customers
>>>>>adding money to their account. I don't want to have to
>>>>>rely
>>>>>on the payment processor keeping track of this. Maybe
>>>>>there
>>>>>are already mechanisms in place that can be completely
>>>>>relied upon for this.
>>>> ****
>>>> If a customer can add $1 and you spend $5 making sure
>>>> they
>>>> don't lose it, have you won?
>>>
>>>If you don't make sure that you don't lose the customer's
>>>money your reputation will put you out of business. If
>>>you
>>>can't afford to make sure that you won't lose the
>>>customer's
>>>money then you can't afford to go into business.
>> *****
>> Yes, but you have to make sure the mechanisms you create
>> to do this are cost-effective.
>> See my earlier comments about UPS and FedEx not requiring
>> "live signatures" for most
>> deliveries! Sometimes, you let your "insurance" pay this,
>> and sometimes, you become your
>> own insurer (this is called, technically, being
>> "self-insured").
>> joe
>
>Sure, so another way to solve this problem is, in the rare
>cases when you do lose a customer's money, to simply take
>their word for it and provide a refund. This would also hurt
>the reputation, though, because it requires the customer to
>find a mistake that should not have occurred.
****
Incredibly elaborate mechanisms to solve non-problems. Simple mechanisms (e.g., "resubmit
your request") should suffice. Once your requirements state what failure modes are
permitted, it is easier to decide what the implementation would be. Instead, you seem to
be working from implementation proposals back to the requirements documents. While Barry
Boehm did indicate that there was always feedback from the implementation back to the
requirements (the "spiral model" of development), he never suggested that some random
implementation idea should drive the requirements; instead, if the implementation proved
infeasible, revisions of the requirements would ensue.
*****
>
>On-the-fly, transaction-by-transaction offsite backup will be
>implemented for at least the transactions that add money to
>the customer's account.
****
And you guarantee the transactional integrity of the offsite backup exactly how?
****
>
>If I lose the transactions that deduct money, then some
>customers may get some free service. I can afford to give
>the customer more than they paid for. I don't want to ever
>give the customer less than they paid for.
***
That was my point.
joe
****
>
>>
>> *****
>>>
>> Joseph M. Newcomer [MVP]
>> email: newcomer(a)flounder.com
>> Web: http://www.flounder.com
>> MVP Tips: http://www.flounder.com/mvp_tips.htm
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:uv5hodo1KHA.4832(a)TK2MSFTNGP04.phx.gbl...
> Peter Olcott wrote:
>
>>> Unless you get a Named Pipe class that will do all the
>>> work (error checking, like error 5/32 sharing-violation
>>> timings, exceptions, proper full-duplex communications),
>>> you can certainly run into an ugly mess. I don't
>>> recommend it for you. You don't need it.
>>
>> This process has to be event driven rather than a polled
>> interface, so I must have some sort of IPC. None of the
>> Unix/Linux people are bringing up any issues with named
>> pipes. Perhaps my design is simple enough to make many of
>> these issues moot. Two half-duplex pipes instead of one
>> full-duplex pipe, thus forming two very simple FIFO
>> queues.
>
>
> Or they really didn't want to tell you the bad news or
> waste time telling you all the "gotchas." You will have a
> far more complex design than necessary with pipes, and the
> odds are high you will have misfires, blocks that don't
> return, etc.
>
> Remember, this is your bottleneck:
>
> Many Web Threads ---> 1 FIFO/OCR Thread
>
> I'm not saying it can't be done, but you will waste more
> time trying to get that right when the problem doesn't
> call for it. You are putting such high constraints on so
> many other things that this headstrong focus on named
> pipes will be a weak point. See below on your
> overloading.
>
> What about your HTTP request and response model? Does
> the above incorporate a store-and-forward concept?
> Meaning, don't forget that you have a RESPONSE to
> provide. You just can't ignore it. You have to at least
> respond with:
>>>
>>> "This will take a long time, we will email you when
>>> done."
>>>
>>
>> The Response model is outlined above.
>
>
> At some point, your model MUST turn into a store-and-forward
> concept, otherwise it will break down. This is based on your
> stated boundary conditions:
>
> 1 fifo/ocr thread - 100 ms turnaround time.
>
> That means you can only handle 10 requests per second.

No it does not. 100 ms is the real-time limit; actual
processing time will average much less than this, about 10
ms.
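
As a back-of-envelope check, assuming the 10 ms average holds and
the arrivals are reasonably smooth:

    average busy handlers = arrival rate x service time
                          = 100 requests/sec x 0.010 sec
                          = 1

i.e. the offered load is one handler's worth on average, with nothing
to spare for bursts; the 100 ms figure is only the worst-case ceiling
per request.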

>
> But you also stated
>
> 100 requests per second.
>
> Therefore, you need at least 10 fifo/ocr thread handlers
> to handle the load; otherwise the bucket will be filled
> pretty darn fast.
>
> Again remember, this is your bottleneck:
>
> Many Web Threads ---> 1 FIFO/OCR Thread
>
> No matter how you configure it, 10 threads in 1 process,
> 10 processes on 1 machine or across machines, you need at
> least 10 handlers to handle the 100 TPS with 100 ms
> transaction times.

10 ms transaction time

>
> Once it cannot handle the dynamic response model, it
> becomes a store-and-forward response model.
>
> You can do what you want, but your expectations for high
> throughput are unaligned with your proposed implementation
> method. Maybe if you said,
>
> "I expect 10 requests per second PER OCR station"
>
> then at least you would be more realistic with your 1
> fifo/ocr thinking.
>
> --
> HLS


From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:%23OLvwko1KHA.5360(a)TK2MSFTNGP06.phx.gbl...
> Peter O.
>
> You have changed your design every time you discovered a
> new "technology" or method.
>
> Look, sometimes the best ideas are the first ones you
> come up with.
>
> All you need here is to make your 1-request OCR processor
> CGI ready. If, as you say, it can work at 100 ms per
> request, then at worst you will have 10 CGI processes
> spawned, which on your Windows 7 quad-core 8GB machine
> would probably run very smoothly.
>
> Remember, this whole thread started with the 5 GB memory
> load requirement, and the solution proposed was to use a
> shared memory concept. That is all you need here.

That has changed now that I have had a chance to benchmark
the simulation of my redesigned process. Now I often load
the data on the fly.

>
> Once you make that a shared read-only memory mapping,
> which is a piece of cake, you are DONE!!!!
>
> Now you can use ANY web server that supports CGI, or use a
> web server with a CMS in it so you can manage user
> accounts, etc.

I am going to use a web server that has source-code so I
won't need any kind of CGI.

>
> And if that is not good enough, go to the 2nd idea you
> had: make it FastCGI, where you now have a worker pool. I
> said it wasn't necessary once you had a shared memory
> file, but that's OK. Do that if that is what you need.
>
> If that is not good enough, recompile for 64 bit.
>
> And if the hardware is not good enough, get Intel Xeon
> machines to help with the "Far Calls" to memory per core.
>
> --
>
> Peter Olcott wrote:
>
> >> Peter D wrote:
> >>
>
>>> *Which* design???? You have proposed so many designs
>>> and tossed around so many things that I wonder if even
>>> *you* know what your design is!
>>>
>>> -Pete
>>>
>>
>> I have not changed this aspect of the design since it was
>> initially proposed.
>> Here is the current design:
>>
>> (1) A transaction log file forms the FIFO queue between
>> multiple threads (one per HTTP request) of the web server
>> and at least one OCR process.
>>
>> (2) Some form of IPC (probably Unix/Linux named pipes)
>> informs the OCR process of an HTTP request that needs to
>> be serviced; it sends the offset within the transaction
>> file.
>>
>> (3) The OCR process uses the offset within the
>> transaction file to get the transaction details, and
>> updates the transaction flag from [Available] to
>> [Pending].
>>
>> (4) When the OCR process is done with processing, it
>> informs the web server, in another FIFO using IPC (such
>> as a Unix/Linux named pipe), by passing the offset within
>> the transaction file.
>>
>> (5) The web server reads the Thread-ID from this offset
>> of the transaction file and informs the thread so that
>> this thread can provide the HTTP response.
>>
>> I was originally having the OCR process update the
>> transaction flag from [Pending] to [Complete], but it
>> might make more sense for the thread that receives the
>> HTTP acknowledgement of the HTTP response to do this.
>
>
>
> --
> HLS


From: Hector Santos on
Peter Olcott wrote:

>> That means you can only handle 10 requests per second.
>
> No it does not. 100 ms is the real-time limit; actual
> processing time will average much less than this, about 10
> ms.


Now you are even more unrealistic. That means for 100 TPS,
you now need 100 threads.

But I want you to look up the term Thread Quantum.

In short, what you are claiming is that you complete a request and
its processing within a single context-switch cycle. A quantum is
around ~15 ms on multi-core/multi-processor machines.
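
If you want to see that granularity on your own machine, a quick and
rough check is to time how long Sleep(1) really takes; the numbers
depend on whatever timer resolution is currently in effect.

#include <windows.h>
#include <cstdio>

int main()
{
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);

    // With the default ~15.6 ms system timer tick a requested 1 ms sleep
    // typically comes back much later than 1 ms; if another program has
    // raised the timer resolution you will see smaller numbers.
    for (int i = 0; i < 10; ++i)
    {
        QueryPerformanceCounter(&t0);
        Sleep(1);
        QueryPerformanceCounter(&t1);
        double ms = (t1.QuadPart - t0.QuadPart) * 1000.0 / freq.QuadPart;
        std::printf("Sleep(1) actually took %.2f ms\n", ms);
    }
    return 0;
}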

>> No matter how you configure it, 10 threads in 1 process,
>> 10 processes on 1 machine or across machines, you need at
>> least 10 handlers to handle the 100 TPS with 100 ms
>> transaction times.
>
> 10 ms transaction time


Unrealistic. Dreaming.

--
HLS