Can extra processing threads help in this case? [MFC]

Prev: Improving Pete'r Application Performance
Next: Competitors for Pet'e OCR system

From: Jerry Coffin on 13 Apr 2010 14:46

In article <G9udnbRPubZBOVnWnZ2dnUVZ_vqdnZ2d(a)giganews.com>,
NoSpam(a)OCR4Screen.com says...

[ ... ]

> That is not what I said. I said that it would take at least
> this long if the drive's (average) seek time is 9 ms, and
> all writes are forced to disk immediately.

I could have sworn this had already been pointed out in this thread,
but nearly all modern disk drives include some buffer memory -- 8 to
16 megabytes is a fairly common amount. Unless you disable that
memory (non-trivial and the necessary steps are specific to the make
of drive) writing to disk only results in an (immediate) write to
that buffer memory, not to the disk platter.

Though it depends somewhat on the disk, most drives store enough
power on board (in a capacitor) that if the power dies, they can
still write the data in the buffer out to the platter. As such, you
generally don't have to worry about bypassing it to assure your data
gets written.

Nonetheless, the buffer decouples your writes from head seeking to a
degree that makes it virtually impossible to correlate the two on an
individual basis.

A couple of other minor details: first, several years ago when people
became aware of seek time, some drive manufacturers decided to
inflate their numbers a bit by changing the "average" from seeking
across half the disk to seeking across one third of the disk. You
seem to be assuming the average they supply is what will apply, but
that's rarely the case.

Second, you aren't taking the rotational latency of the disk into
account. The fastest drives I know of today spin at 15,000 RPM. That
equates to 4 ms per rotation. Under normal circumstances, you have to
wait for approximately half a rotation (on average) to get to the
beginning of the track, then it normally writes a full track at a
time, so it takes one more rotation to finish writing. IOW, writing
data to the platters takes ~6ms *after* the disk has seeked to the
right track/cylinder.

Of course, that's with the fastest disks you can get -- with a 10,000
RPM disk, you're looking at 9 ms. With a garden variety 7,200 RPM
disk, it works out to 12.5ms. With a 5,400 RPM disk (e.g., what most
notebooks use) it works out to 16.7 ms.
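
The arithmetic above can be checked with a few lines of C++. This is only a sketch of the model described in the post (half a rotation of average latency plus one full rotation to write the track); the function names are made up for illustration, and seek time is deliberately left out.

```cpp
#include <cassert>

// Time for one full rotation, in milliseconds, at the given spindle speed.
double ms_per_rotation(double rpm) {
    assert(rpm > 0.0);
    return 60000.0 / rpm;  // 60,000 ms per minute divided by rotations per minute
}

// Assumed model from the post: wait half a rotation on average to reach the
// start of the track, then spend one more full rotation writing the track.
// Seek time is NOT included here.
double rotational_write_ms(double rpm) {
    return 1.5 * ms_per_rotation(rpm);
}
```

Under that model, rotational_write_ms(15000) is 6 ms, and 10,000, 7,200 and 5,400 RPM come out at 9, 12.5 and about 16.7 ms, matching the figures above.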

--
Later,
Jerry.

From: Jerry Coffin on 13 Apr 2010 14:55

In article <h_KdnYjzyrIBMFnWnZ2dnUVZ_qCdnZ2d(a)giganews.com>,
NoSpam(a)OCR4Screen.com says...
>
> "Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
> news:MPG.262e563b90cdf67c989875(a)news.sunsite.dk...

[ ... ]

> > The real question is which work load is a more accurate
> > simulation of Peter's OCR engine. Mine simulates what Peter has
> > *said* -- that it's completely CPU bound. In all honesty, that's
> > probably not correct, so yours is probably a more accurate
> > simulation of how it's likely to work in reality.
>
> The Task manager shows a solid 25% on my quad-core, so it
> seems that unless the task manager is lying then it really
> is CPU bound.

It's not really a question of Task Manager lying, but of failing to
provide enough detail to answer questions like this with certainty.

Just for example, I'm pretty sure that if you run Hector's test code
with only a single thread, it'll look pretty similar, also using
right around 25% on your quad core. Nonetheless, as you can see when
you run it with two threads, there's enough blocking time that it
gives the lower priority thread some time to run.

If you run the code I originally posted, it'll also use right at 25%
on a quad core, BUT the higher priority thread gets essentially *all*
the CPU time.

--
Later,
Jerry.

From: Peter Olcott on 13 Apr 2010 15:14

"Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
news:MPG.262e5c39c3902417989876(a)news.sunsite.dk...
> In article
> <G9udnbRPubZBOVnWnZ2dnUVZ_vqdnZ2d(a)giganews.com>,
> NoSpam(a)OCR4Screen.com says...
>
> [ ... ]
>
>> That is not what I said. I said that it would take at
>> least
>> this long if the drive's (average) seek time is 9 ms,
>> and
>> all writes are forced to disk immediately.
>
> I could have sworn this had already been pointed out in
> this thread,
> but nearly all modern disk drives include some buffer
> memory -- 8 to
> 16 megabytes is a fairly common amount. Unless you disable
> that
> memory (non-trivial and the necessary steps are specific
> to the make
> of drive) writing to disk only results in an (immediate)
> write to
> that buffer memory, not to the disk platter.

Because of required fault tolerance they must be immediately
flushed to the actual platters.

>
> Though it depends somewhat on the disk, most drives store
> enough
> power on board (in a capacitor) that if the power dies,
> they can
> still write the data in the buffer out to the platter. As
> such, you
> generally don't have to worry about bypassing it to assure
> your data
> gets written.

When you are dealing with someone else's money (transactions
are dollars) this is not recommended.

>
> Nonetheless, the buffer decouples your writes from head
> seeking to a
> degree that makes it virtually impossible to correlate the
> two on an
> individual basis.
>

Buffer must be shut off, that is exactly and precisely what
I meant by [all writes are forced to disk immediately].
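
For reference, the usual application-level way to ask for a write to be pushed out on a POSIX system is fsync(). The sketch below assumes a POSIX platform (it is not MFC/Win32 code), and append_durably is a hypothetical helper name. Note that fsync() only asks the kernel to flush the file's data to the device; whether the drive's own buffer is also flushed or bypassed depends on the OS, the file system and the drive settings, so this by itself does not establish the guarantee being argued about here.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: append `len` bytes to `path`, then ask the OS to flush
// the file to the storage device before returning. Returns true on success.
bool append_durably(const char* path, const void* data, std::size_t len) {
    assert(path != nullptr && data != nullptr);
    int fd = ::open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) return false;

    const char* p = static_cast<const char*>(data);
    std::size_t left = len;
    bool ok = true;
    while (left > 0) {
        ssize_t n = ::write(fd, p, left);   // may write fewer bytes than asked
        if (n <= 0) { ok = false; break; }  // EINTR retry omitted for brevity
        p += n;
        left -= static_cast<std::size_t>(n);
    }
    if (ok && ::fsync(fd) != 0) ok = false;  // request flush to the device
    if (::close(fd) != 0) ok = false;
    return ok;
}
```

Each call pays the full flush cost, which is exactly the per-transaction latency this part of the thread is about.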

> A couple of other minor details: first, several years ago
> when people
> became aware of seek time, some drive manufacturers
> decided to
> inflate their numbers a bit by changing the "average" from
> seeking
> across half the disk to seeking across one third of the
> disk. You
> seem to be assuming the average they supply is what will
> apply, but
> that's rarely the case.

Then they are liars and should be sued.

> Second, you aren't taking the rotational latency of the
> disk into
> account. The fastest drives I know of today spin at 15,000
> RPM. That
> equates to 4 ms per rotation. Under normal circumstances,
> you have to
> wait for approximately half a rotation (on average) to get
> to the

I think that the figure that I quoted may have already
included that; it might really be access time rather than
seek time. I am so unused to the C/C++ library lseek and
fseek meaning that, that I may have related the incorrect
term.

> beginning of the track, then it normally writes a full
> track at a
> time, so it takes one more rotation to finish writing. IOW,
> writing
> data to the platters takes ~6ms *after* the disk has
> seeked to the
> right track/cylinder.
>
> Of course, that's with the fastest disks you can get --
> with a 10,000
> RPM disk, you're looking at 9 ms. With a garden variety
> 7,200 RPM
> disk, it works out to 12.5ms. With a 5,400 RPM disk (e.g.,
> what most
> notebooks use) it works out to 16.7 ms.
>
> --
> Later,
> Jerry.

In any case access time still looks like it is the binding
constraint on my TPS.
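
As a rough illustration of why access time would bound TPS: if, as assumed in this thread, every transaction must wait for one synchronous disk access and transactions are strictly serialized, throughput cannot exceed 1000 / access_ms. The helper name below is hypothetical, and the model ignores everything except that single access.

```cpp
#include <cassert>

// Upper bound on transactions per second when each transaction waits for
// exactly one disk access of `access_ms` milliseconds and nothing overlaps.
double max_tps(double access_ms) {
    assert(access_ms > 0.0);
    return 1000.0 / access_ms;
}
```

With the 9 ms figure this gives about 111 TPS; if the roughly 6 ms of rotational time for a 15,000 RPM drive has to be added on top (15 ms in total), the bound drops to about 67 TPS.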

From: Joseph M. Newcomer on 13 Apr 2010 15:15

See below...
On Mon, 12 Apr 2010 19:29:17 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
>message news:11s6s51ufidbrigq8g1u0nl1jkj6hhfkmf(a)4ax.com...
>> See below....
>> On Sun, 11 Apr 2010 21:33:33 -0500, "Peter Olcott"
>> <NoSpam(a)OCR4Screen.com> wrote:
>>
>>>
>
>>>What is the exact scenario that produces this massive
>>>delay?
>> ***
>> Sorry, I already explained this, complete with the
>> arithmetic to prove it. I don't feel
>> like repeating the obvious.
>
>I don't believe you, this is merely a ruse. If you are going
>to go on and on and on without ever getting to the point
>then you are a harasser and not at all a helper.
****
What don't you believe? That I gave you the computations to prove this? (You said you
didn't read all my messages, and that's not my problem). Or that the computations are
faulty? Will your "sound reasoning" disprove my figures?
>
>> *****
>>>
>>>> whereas a SQMS model with priority-inversion prevention
>>>> can MINIMIZE end-to-end delays in
>>>> the server.
>>>
>>>On a single core processor? If it does I don't see how.
>>>You
>>>have to explain these details. On a quad core it is almost
>>>obvious how it could help.
>> ****
>> Yep, on a single-core processor! One of the little
>> details is one of the most
>> carefully-guarded secrets of modern operating systems, so
>> I'm not surprised you haven't
>> heard about it. It is called "time slicing", and I could
>> tell you more about it, but then
>> I'd have to kill you, because this top-secret technique is
>> known only to very few
>> initiates. I may have violated my sacred oaths even by
>> hinting at its existence, and I
>> will have to watch out for the high priests of operating
>> systems, who may declare me
>> excommunicated for revealing it.
>> ****
>
>Yes yet another ruse. You are merely flatly wrong again and
>hiding it behind rhetoric.
****
Sorry, I gave real numbers. I don't need to repeat all that detailed logic again.
****
>
>>>
>>>>
>>>> But then, performance clearly is not an important
>>>> consideration, or you would want a
>>>> design that minimizes end-to-end transaction time in the
>>>> server under high load
>>>> conditions. And you would not be so insistent that we
>>>> acknowledge your design must be
>>>
>>>No it is just you ignoring design constraints again.
>>>Single
>>>core not quad core.
>> ****
>> I am curious where you are finding these single-core
>> machines? The antique sales on eBay?
>> Is your ISP really willing to support these for you?
>>
>> But SQMS works better on single-core machines (see
>> reference to that secret technique
>> called "time slicing")
>
>So I am already doing that with my MQMS, and because I have
>been told that Linux does not do a very good job with
>threads, my implementation is likely superior to yours. I
>have seen no reasoning to the contrary.
****
You seem to think that when I refer to "threads" I mean "threads inside a single process",
while you meant "pseudo-threads library". Assume that when I say threads, I mean
"independently schedulable preemptible operating system-managed computations" and you
repackage those in whatever form your obsolete operating system requires.
****
>
>I equate the absence of reasoning with the lack of truth
>because the ONLY correct measure of truth is sound
>reasoning. Credibility is a pretty crappy stand-in for sound
>reasoning (valid reasoning based on true premises).
****
Well, there has never been any truth in anything you have said, because there has never
been any trace of "sound reasoning" in any decision you have made. Instead, you use
conjecture, hearsay, and psychic methods to create models that are not recognizable to
actual practitioners of the profession. "Memory-mapped files increase paging traffic"
(whereas we, who know something about them, reasoning from sound principles, know that
they DECREASE paging traffic), "I will turn off virtual memory" (I am not sure what "sound
reasoning" led to this). "I must have contiguous physical memory allocated" (sound
reasoning running at full force here), "TCP/IP can throw away packets" (REALLY sound
reasoning on this one! Apparently this requires an implementation of the TCP/IP protocol
that differs from the standard implementation by such a distance that no one has ever seen
an instance of it!) "pwrite does an atomic append, and therefore preserves transactional
integrity" (proven wrong) "My database update will require one seek and a maximum of 9ms"
(inconsistent with implementations of real file systems; apparently you have confused the
fseek operation with the actual movement of the disk arm) "Variable-length database
records require a second index that maps record numbers to file offsets" (not how ISAM is
implemented) How many times does your "sound reasoning" have to be disproved? That's all
we've been doing for the last month or so. Your credibility is so close to zero that
large negative exponents are required to characterize it. Or simple real numbers in the
range of 0.0..-1.0. I would hate to have to make design decisions based on your
understanding of how systems work. I'd get most of them wrong if I did that.

So how is it that your unsubstantiated opinions are so reliable and nobody else's opinions
can be trusted?
joe

>
>> joe
>> ****
>>>
>>>> right, when I was able to demonstrate, with third-grade
>>>> arithmetic, that it isn't very
>>>> good.
>>>> joe
>>>>
>>>
>> Joseph M. Newcomer [MVP]
>> email: newcomer(a)flounder.com
>> Web: http://www.flounder.com
>> MVP Tips: http://www.flounder.com/mvp_tips.htm
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Joseph M. Newcomer on 13 Apr 2010 15:22

See below...
On Mon, 12 Apr 2010 22:15:39 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>"Jerry Coffin" <jerryvcoffin(a)yahoo.com> wrote in message
>news:MPG.262d77475858ed16989871(a)news.sunsite.dk...
>> In article
>> <k_adnZQ1so5DJ17WnZ2dnUVZ_rednZ2d(a)giganews.com>,
>> NoSpam(a)OCR4Screen.com says...
>>
>> [ ... ]
>>
>>> > But SQMS works better on single-core machines (see
>>> > reference to that secret technique
>>> > called "time slicing")
>>>
>>> So I am already doing that with my MQMS, and because I
>>> have
>>> been told that Linux does not do a very good job with
>>> threads, my implementation is likely superior to yours. I
>>> have seen no reasoning to the contrary.
>>
>> What relationship do you see between multiple threads of
>> execution,
>> and multiple queues?
>
>None. A Single queue might work well with multiple threads
>of a single process because IPC does not need to occur. The
>MQ is because multiple processes require IPC.
****
OMG! IT IS WORSE THAN I IMAGINED!!!! I want to understand the "sound reasoning" that
means that one form of interthread communication dictates that no other form of
interthread communication can possibly work, and this one-and-only option dictates SQMS
cannot work!

You have probably made another error of sound reasoning here, but you have not
demonstrated why you think "IPC" (which is one form of interthread communication) changes
the scheduling picture. Please lay out the details, because we are obviously too stupid
to breathe on our own, and have missed the flash of brilliant insight that led to this
conclusion.
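
To make the single-queue/multiple-server (SQMS) idea concrete, here is a minimal sketch in standard C++ of several worker threads fed from one shared priority queue inside a single process. All names (Job, SharedQueue, run_sqms) are invented for illustration, std::thread stands in for whatever schedulable unit the platform provides, and this is not anyone's actual design from the thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Job { int priority; int id; };  // hypothetical; larger priority runs first

struct ByPriority {
    bool operator()(const Job& a, const Job& b) const { return a.priority < b.priority; }
};

class SharedQueue {
public:
    void push(Job j) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(j); }
        cv_.notify_one();
    }
    void close() {  // no more jobs will arrive
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until a job is available; returns false once closed and drained.
    bool pop(Job& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = q_.top();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::priority_queue<Job, std::vector<Job>, ByPriority> q_;
    bool closed_ = false;
};

// Feed every job through ONE queue to `workers` threads; returns the ids in
// the order they were taken off the queue.
std::vector<int> run_sqms(const std::vector<Job>& jobs, int workers) {
    assert(workers > 0);
    SharedQueue q;
    for (const Job& j : jobs) q.push(j);
    q.close();

    std::vector<int> done;
    std::mutex done_m;
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i)
        pool.emplace_back([&] {
            Job j{0, 0};
            while (q.pop(j)) {
                std::lock_guard<std::mutex> lock(done_m);
                done.push_back(j.id);
            }
        });
    for (auto& t : pool) t.join();
    return done;
}
```

Whichever worker becomes free simply takes the most urgent job available, so no worker sits idle behind its own private queue while higher-priority work waits elsewhere. With one worker the ids come back strictly by priority; with several, completion order can interleave.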
****
>
>>
>> You actually have things exactly backwards: with enough
>> processor
>> cores, separate queues start to gain an advantage due to
>> lower
>> contention over a shared resource -- though contention has
>> to be
>> quite high for that to apply. Even then, you basically
>> have to take
>> pretty careful (and frankly, somewhat tricky) steps to get
>> the
>> multiple queues to act like a single priority queue before
>> you get
>> any good from it.
>>
>> For a single processor, there's no room for question that
>> a single
>> priority queue is the way to go though -- a single
>> hyperthreaded core
>> simply won't even come close to producing the level of
>> contention
>> necessary to make multiple queues superior (something on
>> the order of
>> quad hyperthreaded cores might start to get close).
>
>I just don't see any scenarios with my current design where
>this would be true. Multiple queues are easy: simply write to
>the back and read from the front.
****
And your point is...?

Please present the sound reasoning that suggests that other alternatives are not viable.
Produce actual computations by which you have derived this reasoning, with real numbers in
it. If you cannot demonstrate it with real numerical values, then you do not have "sound
reasoning" by any stretch of the definition of the term.
****
>
>A single queue is much more difficult. Write at whatever
>location is appropriate at the moment and read from the head
>of whichever portion of the queue applies to this priority.
>What do you do, a linear search for the head? If you don't do
>a linear search and keep track of the different heads
>separately then this is essentially four queues, simply
>strung together. How can that be better than four queues?
****
No, you keep the queue in sorted order. You, of course, have performance numbers which
(since you won't experiment) you can derive from sound reasoning to demonstrate that what
is going on there is ultimately a provably correct design decision.
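
One common way to get "sorted order" without any linear search for the head is a binary heap keyed on (priority, arrival number). The sketch below swaps in std::priority_queue for that purpose; it is an illustration under that assumption, not necessarily the structure meant above, and the type names are invented.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct Request {
    int priority;       // larger value = more urgent (assumption)
    std::uint64_t seq;  // arrival number, keeps FIFO order within a priority
    std::string name;
};

struct ServedLater {
    // True when `a` should be served after `b`.
    bool operator()(const Request& a, const Request& b) const {
        if (a.priority != b.priority) return a.priority < b.priority;
        return a.seq > b.seq;  // same priority: earlier arrival goes first
    }
};

class PriorityFifo {
public:
    void push(int priority, std::string name) {  // O(log n)
        heap_.push(Request{priority, next_seq_++, std::move(name)});
    }
    bool empty() const { return heap_.empty(); }
    std::string pop() {  // O(log n), always the most urgent, oldest request
        assert(!heap_.empty());
        std::string name = heap_.top().name;
        heap_.pop();
        return name;
    }
private:
    std::priority_queue<Request, std::vector<Request>, ServedLater> heap_;
    std::uint64_t next_seq_ = 0;
};
```

Pushing low-a, high-a, low-b, high-b and mid (priorities 1, 3, 1, 3, 2) pops high-a, high-b, mid, low-a, low-b: one structure, no per-priority heads to track, and insertion and removal both cost O(log n).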

Otherwise, you are just ranting. You can't prove you are right, so you tell us we must be
wrong because we disagree with you.

I want to see this sound reasoning laid out in detail, with numbers to go with it.
joe
****
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
