Can extra processing threads help in this case? [MFC]


From: Hector Santos on 13 Apr 2010 15:26

Jerry Coffin wrote:

> In fact your work load shows exactly *why* most schedulers just run
> the highest priority task that's ready to run: almost no real work
> load is always ready to run. That lets the really simple scheduler
> design actually allocate CPU time quite a bit more fairly (or at
> least evenly) than it initially seems like it would.
>
> The real question is which work load is a more accurate simulation of
> Peter's OCR engine. Mine simulates what Peter has *said* -- that it's
> completely CPU bound. In all honesty, that's probably not correct, so
> yours is probably a more accurate simulation of how it's likely to
> work in reality.

It would be fairly easy to simulate his general, basic layman's
description. What he doesn't wish to relinquish are his target rates,
which a simulation can only come close to when using a rate distribution
per job priority. He can do exactly what he wants, but not at the
rates he thinks, and there will need to be a lot of compensation and
acceptable loss.

He also doesn't seem to know the distinction between a single process with
multiple threads and multiple instances of a single process with one
thread each. While both can be simulated, the design implementations are
quite different. I think this dilemma is due to his Linux vs Windows
conflict.

And mind you, he is having the exact same battle with the Linux group,
with the same answers and issues and the same labeling of him as not listening.

Read this thread and see the exact same issues with him emerging:

http://www.groupsrv.com/linux/post-923554.html

Which makes you wonder what the motivations are. I think this thread
says it all:

http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.javascript/2006-08/msg01993.html

where it gets to a point where he argues he knows more about patents
than the participants of the thread and he doesn't need to write code
or demonstrate that it can work, but just create an abstract prototype
for patentability.

It appears that he has locked himself into certain claims for a new
patent, and cannot allow any engineering statement to deviate from how his
claims are possible using predefined environmental and system constraints.

He needs to be able to say:

A method to guarantee X under Y arrangement and conditions and
any other competitor can not reproduce X without infringing on
the Y arrangement.

What everyone is telling him is that his Y arrangement and conditions
conflict with obtaining X, and he can't believe this because then
his patent filing would be without merit. He must find that 'unique' Y
arrangement. :)

One might be worried if he had shown he was able to actually produce something.

--
HLS

From: Joseph M. Newcomer on 13 Apr 2010 15:32

See below...
On Mon, 12 Apr 2010 22:24:00 -0600, Jerry Coffin <jerryvcoffin(a)yahoo.com> wrote:

>In article <G96dnWTQUsVBfF7WnZ2dnUVZ_sydnZ2d(a)giganews.com>,
>NoSpam(a)OCR4Screen.com says...
>
>[ ... ]
>
>> None. A Single queue might work well with multiple threads
>> of a single process because IPC does not need to occur. The
>> MQ is because multiple processes require IPC.
>
>So if I understand your point correctly, your plan is to run four
>separate web servers, each with an OCR engine linked in, simply so
>you can avoid the single shared queue, is that right?
>
>[ ... ]
>
>> A single queue is much more difficult. Write at whatever
>> location is appropriate at the moment and read from the head
>> of whichever portion of the queue applies to this priority.
>> What do you do, a linear search for the head? If you don't do
>> a linear search and keep track of the different heads
>> separately then this is essentially four queues, simply
>> strung together. How can that be better than four queues?
>
>A priority queue is normally implemented as a binary heap. It
>involves no linear searches and no separate heads.
****
std::map works well for this, because its iterators are ordered iterators!

Gee, I just gave away an answer that has sound reasoning behind it. I have to stop this,
lest I establish a precedent which the OP will have to follow!
****
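
A minimal sketch of the std::map idea above, not code from anyone in this thread. The class name JobQueue, the string payload, and the convention that a lower number means higher priority are all assumptions made for illustration. The key pairs the priority with an arrival counter, so the map's ordered iteration gives priority order first and FIFO order within a priority, and begin() is always the head.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical job queue: a lower priority value means more urgent (assumption).
class JobQueue {
public:
    // O(log n) insert; the sequence number keeps FIFO order within a priority.
    void push(int priority, std::string job) {
        jobs_.emplace(Key{priority, next_seq_++}, std::move(job));
    }

    bool empty() const { return jobs_.empty(); }

    // Remove and return the most urgent job. begin() is always the head,
    // so there is no linear search and no separate head per priority.
    std::string pop() {
        assert(!jobs_.empty());
        auto head = jobs_.begin();
        std::string job = std::move(head->second);
        jobs_.erase(head);
        return job;
    }

private:
    using Key = std::pair<int, std::uint64_t>;  // (priority, arrival order)
    std::map<Key, std::string> jobs_;
    std::uint64_t next_seq_ = 0;
};
```

Pushing jobs at priorities 3, 1, 3, 1 and then popping four times returns both priority-1 jobs first, in arrival order, followed by both priority-3 jobs. The structure is one container with one size limit, which is the memory-sharing point made in the quoted text.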
>
>It has a number of advantages, such as sharing memory between the
>queues, so instead of a separate hard limit for each priority level,
>you get one limit on total tasks in the queue. It's also easy to
>scale. It's easy to have it keep your multiple processing engines
>busy.
*****
Since the design will be perfect, and the first implementation will work absolutely
correctly, no tuning is necessary, so there is no need to allow for scaling,
parameterization, or any of those other features that reality tends to make desirable for
the rest of us, who are incapable of sound reasoning.
****
>
>I know you've said the processor side is fixed in concrete -- let me
>point out that you're just plain wrong. ISPs come and go. Pricing
>changes faster still. You're looking at a great deal simply because
>it's an ancient machine -- but when it dies (and an already hot
>running Pentium IV running OCR will die, and soon) that pricing will
>be gone.
****
Actually, processors are fixed in epoxy. Only MIL-STD parts use ceramic packages, which
admittedly is as close to concrete as we are likely to get...

The rest, of course, is completely true, and since he plans to have "on-site" boxes (much
fuss was made about this) which are rendered untamperable, he has a source (but not at
Goodwill resale shops for dead computers--did you know Goodwill doesn't accept obsolete
computers such as Pentium I or Pentium 2 or Pentium Pro as donations? I know, I tried),
so I guess he will buy from the antiques section of eBay so his computers are sufficiently
obsolete that his perfect design will work as well as it can on them, but will not work
any better on modern machines.
****
>
>You say you've already spent 10 years on this -- but now you're
>basing decisions that should last another 10 years on a price quote
>that may not last through the end of the month. The supply of Pentium
>IV's used by ISPs is small and shrinking fast. Making long term
>decisions based on this factor is just plain foolish.
****
End of the month? He's ALREADY told us he's leasing "into perpetuity". Wired into an
obsolete model of computing no matter what the ISP does; in a year, 24-core CPUs are going
to be standard for servers (Intel has started making the chips in quantity already), but
he's going to be paying a rapacious price for an obsolete computer. We call this "sound
reasoning about business decisions". He will be proven correct when the IPO comes out in
a year.
****
>
>You've also mentioned the possibility of using a cluster to run the
>application -- but that won't work with the design you're talking
>about. If running on a cluster is even a remote possibility, you need
>to allow for the web server and OCR engine to not only be separate
>processes, but run on completely separate machines.
****
No, no, you don't understand! The Magic Morphing Requirements Document says it MUST
include clustering and it MUST NOT include ability to scale! Read it! (Wherever it
is...)

Besides, how could a superb designer make ANY error in the design? It is already perfect,
and we are just being mean and horrid by asserting otherwise!
joe
****
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Peter Olcott on 13 Apr 2010 15:36

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:rog9s55uk4kddce41ia7p8lt7cvooee5a1(a)4ax.com...
> See below...
> On Mon, 12 Apr 2010 22:15:39 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>A single queue is much more difficult. Write at whatever
>>location is appropriate at the moment and read from the head
>>of whichever portion of the queue applies to this priority.
>>What do you do, a linear search for the head? If you don't do
>>a linear search and keep track of the different heads
>>separately then this is essentially four queues, simply
>>strung together. How can that be better than four queues?
> ****
> No, you keep the queue in sorted order. You, of course, have performance numbers which
> (since you won't experiment) you can derive from sound reasoning to demonstrate that what
> is going on there is ultimately a provably correct design decision.
>
> Otherwise, you are just ranting. You can't prove you are right, so you tell us we must be
> wrong because we disagree with you.
>
> I want to see this sound reasoning laid out in detail, with numbers to go with it.
> joe
> ****

http://docs.google.com/viewer?a=v&q=cache:Hb_P22Cj9OAJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.93.429%26rep%3Drep1%26type%3Dpdf+multi+queue+multi+server&hl=en&gl=us&pid=bl&srcid=ADGEESh1kerH3RGqAvIEul4ryHpwxxU5HdWzS3edrtXW764CJUPudOBFnvTmUvl7W3uBXqe046N1tNkirmGqVOkUlmWQWTZQgLLwQHf5LolcXX43mvOEc3k0wR55vXqYAklq8Fu2-qgL&sig=AHIEtbSrBAf6HW8XDtNinTOdsNx5lf9tNQ

Google the title below if the above link does not work:
On the stability of the multi-queue multi-server processor
sharing with limited service

>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Hector Santos on 13 Apr 2010 15:36

Joseph M. Newcomer wrote:

>
> So how is it that your unsubstantiated opinions are so reliable and nobody else's opinions
> can be trusted?
> joe

Joe, he has had the same battles and issues, I mean the SAME, with
the Linux group. Read the thread here:

http://www.groupsrv.com/linux/post-923554.html

Same misunderstandings, same answers; even the "Great" David Schwartz
and practically everyone else have told him the same things. And all of
them are recognizing how "uneducable" he is. Read all six pages
of it and see the deja vu. I loved David's statement:

"Yes, stop fighting yourself. Just code what you want.

If you don't want a thread to be doing the 3.5 minute job,
because there's something more important to do, for the
love of god DON'T CODE IT TO DO THAT JOB.

What you're trying to do is code the thread to do one
thing and then use some clever manipulation to get it to
do something else. Just code the thread to do the work you
want done and you won't have to find some way to pre-empt it
or otherwise "trick" it."
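
One possible reading of that advice, sketched under assumptions and not taken from the linked thread: instead of trying to pre-empt a long job from outside, code the worker so it only ever does one slice of the most urgent task and then looks at the queue again. The names Task, MoreUrgentFirst, and run_sliced, the slice counts, and the "lower number is more urgent" convention are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical task: slices_left units of work; lower priority value = more urgent.
struct Task {
    int priority;
    std::string name;
    int slices_left;
};

// Heap ordering that keeps the most urgent task on top.
struct MoreUrgentFirst {
    bool operator()(const Task& a, const Task& b) const {
        return a.priority > b.priority;
    }
};

// Runs tasks one slice at a time and returns the order in which slices ran.
// arrivals[i] lists tasks that show up just after step i, standing in for
// requests that arrive while work is already in progress.
std::vector<std::string> run_sliced(std::vector<Task> initial,
                                    const std::vector<std::vector<Task>>& arrivals) {
    std::priority_queue<Task, std::vector<Task>, MoreUrgentFirst> pending(
        MoreUrgentFirst{}, std::move(initial));
    std::vector<std::string> trace;
    for (std::size_t step = 0; !pending.empty(); ++step) {
        Task task = pending.top();
        pending.pop();
        trace.push_back(task.name);  // one slice of this task would run here
        if (--task.slices_left > 0) pending.push(task);  // unfinished: requeue
        if (step < arrivals.size())
            for (const Task& t : arrivals[step]) pending.push(t);
    }
    return trace;
}
```

With a three-slice low-priority task running and a one-slice urgent task arriving after the first slice, the urgent task runs next and the long task resumes afterward, with no signaling or suspension mechanism involved. How small a slice the real OCR job could be cut into is not something this sketch addresses.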

"For the Love of God!.." HA! Every answer we provided was provided
in this thread as well! One guy even said:

"Why do you think that you, who doesn't even know about the
operating system you're working under, can implement its
features better than the OS itself?"

and another person adds:

"So you're attempting to optimising the system and
reinvent the scheduler, and yet you've never heard of
poll(2)? Why do I feel that's an unsurprising combination?"

His response?

"Even if I fail at this I will succeed in knowing the
reasoning why I failed."

He really knows when to act dumb.

It reminds me of my recent trip, the first (small) vacation I have
taken with the wife in a long time, when we stopped at a gas station on the
Turnpike and she told me to "Put in $20 of gas." I'm thinking "1/4
tank filled. This is a small car. We will overflow the tank. How about
$10?" She says "Hello?! Gas is 3 bucks a gallon!" I said "I know,
silly! I figured the tanks are smaller. It's a small car!" :)

And mind you, the same type of threads with the same questions with
similar results are peppered throughout the net since 2006.

--
HLS

From: Joseph M. Newcomer on 13 Apr 2010 15:39

See below...
On Mon, 12 Apr 2010 23:44:27 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
>news:MPG.262d9fa7ec108197989873(a)news.sunsite.dk...
>> In article
>> <G96dnWTQUsVBfF7WnZ2dnUVZ_sydnZ2d(a)giganews.com>,
>> NoSpam(a)OCR4Screen.com says...
>>
>> [ ... ]
>>
>>> None. A Single queue might work well with multiple threads
>>> of a single process because IPC does not need to occur. The
>>> MQ is because multiple processes require IPC.
>>
>> So if I understand your point correctly, your plan is to run four
>> separate web servers, each with an OCR engine linked in, simply so
>> you can avoid the single shared queue, is that right?
>
>Not at all, but, when you state your assumptions then I can
>correct them.
>I will have one single web server process that connects to
>four separate OCR processes using some sort of FIFO queue,
>one for each OCR process.
****
Where in the Requirements Document does it require that the OCR computations be done in
separate processes, or that there be a FIFO used for IPC? Actually, those are
IMPLEMENTATION decisions, not REQUIREMENTS; and in an implementation, a FIFO queue for IPC
is NOT required; in fact, you can use a queue whose maximum depth is ONE and still achieve
what you need, because you can maintain any queuing elsewhere, such as in memory, or in a
transacted database. You do not need a FIFO queue of unlimited size between your Web
server and your OCR engine as part of the REQUIREMENTS document, which should merely state
something like "Jobs are accepted in any order, but dispatched in priority order to a
plurality of OCR recognizers" without ONCE stating that a FIFO queue is required, that
separate processes are required, that the FIFO shall be of unlimited size (unlimited == 4K
is an implementation issue!), but since you drive your requirements document by hearsay
about low-level implementation decisions, of course it gets a little distorted, and
generally unrecognizable as what we used to call a "requirements" document.
****
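
To make the "accepted in any order, dispatched in priority order to a plurality of recognizers" wording concrete, here is a rough sketch using threads in one process. It is only one of several implementations that would satisfy that sentence. The names Job, LessUrgent, and Dispatcher are invented for illustration, a lower priority value is assumed to be more urgent, and the OCR work itself is left as a comment. The queue is a std::priority_queue, which is a binary heap, so jobs of equal priority are not guaranteed to come out in arrival order.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical job record; a lower priority value is more urgent (assumption).
struct Job {
    int priority;
    int id;
};

// Orders the heap so the most urgent job (smallest priority value) is on top.
struct LessUrgent {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority > b.priority;
    }
};

class Dispatcher {
public:
    // Jobs may be submitted in any order.
    void submit(Job job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            pending_.push(job);
        }
        ready_.notify_one();
    }

    // Start the recognizer threads, let them drain the queue, and return
    // the job ids in the order they were dispatched.
    std::vector<int> run(unsigned workers) {
        std::vector<std::thread> pool;
        for (unsigned i = 0; i < workers; ++i)
            pool.emplace_back(&Dispatcher::worker, this);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;  // no more jobs will arrive in this sketch
        }
        ready_.notify_all();
        for (auto& t : pool) t.join();
        return dispatched_;
    }

private:
    // One recognizer: take the most urgent pending job, then process it.
    void worker() {
        for (;;) {
            Job job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return closed_ || !pending_.empty(); });
                if (pending_.empty()) return;   // closed and drained
                job = pending_.top();
                pending_.pop();
                dispatched_.push_back(job.id);  // record dispatch order
            }
            // The OCR work for `job` would run here, outside the lock.
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::priority_queue<Job, std::vector<Job>, LessUrgent> pending_;
    bool closed_ = false;
    std::vector<int> dispatched_;
};
```

Because every pop happens under the same mutex, the dispatch order follows priority no matter how many worker threads are running. The same dispatch rule could sit in front of separate OCR processes with a depth-one channel to each; that variant is not shown here.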
>
>>> A single queue is much more difficult. Write at whatever
>>> location is appropriate at the moment and read from the head
>>> of whichever portion of the queue applies to this priority.
>>> What do you do, a linear search for the head? If you don't do
>>> a linear search and keep track of the different heads
>>> separately then this is essentially four queues, simply
>>> strung together. How can that be better than four queues?
>>
>> A priority queue is normally implemented as a binary heap. It
>> involves no linear searches and no separate heads.
>>
>> It has a number of advantages, such as sharing memory between the
>> queues, so instead of a separate hard limit for each priority level,
>> you get one limit on total tasks in the queue. It's also easy to
>> scale. It's easy to have it keep your multiple processing engines
>> busy.
>
>No advantage that I can see that is worth the cost.
****
Oh, and building a complex interprocess "signaling" mechanism that puts lower-priority
jobs to sleep when high-priority jobs come in has ZERO cost to design, ZERO cost to
implement, and of course, because you are such a superb designer, will have no race
conditions and therefore will never fail?
joe
****
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
