Can extra processing threads help in this case? [MFC]


From: Peter Olcott on 22 Mar 2010 16:25

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:ea3d0gfyKHA.5040(a)TK2MSFTNGP02.phx.gbl...
> Peter Olcott wrote:
>
>> Try and explain exactly how cache can possibly help when
>> there is most often essentially no spatial or temporal
>> locality of reference.
>
>
> It's called WINDOWS Virtual Memory Caching technology.
>
> This is not DOS. You are not dealing directly with the
> CHIP here.

I know that. I also know the inherent memory access patterns
of my algorithm.

Joe keeps bringing up how complex the actual underlying
memory access patterns are when one also considers cache.

I keep bringing up that there can be no complex underlying
memory access patterns if, for lack of spatial and temporal
locality of reference, cache mostly cannot be used.

I am beginning to think you two guys are stuck in "refute
mode", yet I remain open to the possibility that it may be
me and not either of you.

>
> You need to stop reading stuff, finding a new "buzz
> word", thinking you got an "AH HA" and believing it proves
> your erroneous understanding of Windows programming.
>
> --
> HLS
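
One way to put numbers on the cache question above is to simulate it. The sketch below is an illustration only, not code from anyone in this thread: it models a fully associative LRU cache (real CPU caches are set-associative and may use other replacement policies) fed uniformly random references, which is the "no spatial or temporal locality" case. The name simulateLruHitRate and all of its parameters are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <random>
#include <unordered_map>

// Hypothetical helper: simulates a fully associative LRU cache that holds
// `cacheLines` lines and receives `accesses` uniformly random references
// to `totalLines` distinct lines. Returns the fraction of hits.
double simulateLruHitRate(std::size_t cacheLines, std::size_t totalLines,
                          std::size_t accesses, std::uint32_t seed = 12345) {
    assert(totalLines > 0 && accesses > 0);
    if (cacheLines == 0) return 0.0;  // nothing can ever hit

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, totalLines - 1);

    std::list<std::size_t> lru;  // front = most recently used line
    std::unordered_map<std::size_t, std::list<std::size_t>::iterator> where;

    std::size_t hits = 0;
    for (std::size_t i = 0; i < accesses; ++i) {
        const std::size_t line = pick(rng);
        auto it = where.find(line);
        if (it != where.end()) {
            ++hits;
            // Move the line to the front; list iterators stay valid.
            lru.splice(lru.begin(), lru, it->second);
        } else {
            if (lru.size() == cacheLines) {
                // Evict the least recently used line.
                where.erase(lru.back());
                lru.pop_back();
            }
            lru.push_front(line);
            where[line] = lru.begin();
        }
    }
    return static_cast<double>(hits) / static_cast<double>(accesses);
}
```

Under these assumptions, simulateLruHitRate(1000, 4000, 1000000) should come out near 1000/4000, because a uniformly random reference finds its line among the cached ones about cacheLines/totalLines of the time. So the hit rate without locality is small when the data is much larger than the cache, but it is not zero. How this carries over to a real machine still has to be measured.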

From: Joseph M. Newcomer on 22 Mar 2010 16:28

See below...
On Mon, 22 Mar 2010 14:40:57 -0500, "Peter Olcott" <NoSpam(a)OCR4Screen.com> wrote:

>
>Try and explain exactly how cache can possibly help when
>there is most often essentially no spatial or temporal
>locality of reference.
>
****
While caches work well with locality of reference, that is just a heuristic for predicting
cache effects. Locality of reference is not the point; maximizing cache hits is the
point. And this can happen, particularly on a shared L3 cache, based solely on the cache
replacement algorithm. We use locality of reference as the "easy" approach to determining
the likelihood of cache hits, because it is easy to analyze in applications that process
regular data like matrices and arrays. But it is not the theoretical optimum approach. If
you took the time to understand how caches work this would be obvious to you.

Try to explain why you believe this when you have run no experiments that have any
meaning. The difference is that I am saying YOU HAVE NO DATA, and you are saying I KNOW
WHAT IS GOING TO HAPPEN, I DON'T NEED NO STINKIN' FACTS. I don't believe you really know
what is going to happen; you are just guessing. I know what I would do: as an engineer
(there's that nasty word again) I'd go out and GET the facts. Then I could say "But I
have run this experiment, and it substantiates my theory" and that would be useful
knowledge. But you just blindly claim you "know" what is going to happen. I'm supposed
to take this seriously from someone who doesn't even understand why a Memory Mapped File
is going to give superior performance? Given your demonstrated lack of understanding of
operating systems, why should I believe ANY assertion you make, unless you have the data
to back it up? Hell, I wouldn't believe MY OWN theories about performance without data,
and 15 years of performance measurement have convinced me of one absolute fact: "Ask a
programmer where the performance bottleneck is in their code, and you will get a wrong
answer". That rule NEVER failed me in 15 years of real performance measurement of real
programs on real machines, and I believe it today. Bottom line: only actual performance
data proves anything. Theories about where performance is going are universally wrong
unless supported by actual measurements. You have a p-baked theory, for p considerably
less than 0.5 (p==0.5 is half-baked), and you refuse to test your theory. Not a
robust approach to building systems. You may be absolutely correct, but you cannot PROVE
it without data.
joe
****

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
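
The kind of experiment being asked for above can be sketched in portable C++. This is a minimal sketch under stated assumptions, not Peter's OCR code and not Hector's simulator: makeTable, randomWalk and timeWalkers are hypothetical names, the table stands in for "far more memory than fits in cache", and each read depends on the previous one so the hardware cannot easily prefetch. It uses threads rather than separate processes, which is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Hypothetical: a table of random indices into itself.
std::vector<std::uint32_t> makeTable(std::size_t entries, std::uint32_t seed) {
    assert(entries > 0 && entries <= 0xFFFFFFFFu);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::uint32_t> pick(
        0, static_cast<std::uint32_t>(entries - 1));
    std::vector<std::uint32_t> table(entries);
    for (auto& v : table) v = pick(rng);
    return table;
}

// Performs `steps` dependent reads. Adding the step counter keeps the walk
// from settling into a short cycle that would fit in cache.
std::uint64_t randomWalk(const std::vector<std::uint32_t>& table,
                         std::size_t steps, std::uint32_t start) {
    const std::size_t n = table.size();
    std::size_t idx = start % n;
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < steps; ++i) {
        idx = (table[idx] + i) % n;
        sum += idx;
    }
    return sum;
}

// Runs `threads` walkers at once over the same read-only table and returns
// the wall-clock seconds taken. `checksum` keeps the work from being
// optimized away.
double timeWalkers(const std::vector<std::uint32_t>& table, unsigned threads,
                   std::size_t steps, std::uint64_t& checksum) {
    std::vector<std::uint64_t> sums(threads, 0);
    std::vector<std::thread> pool;
    const auto t0 = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < threads; ++i) {
        pool.emplace_back([&, i] {
            sums[i] = randomWalk(table, steps, i * 7919u);
        });
    }
    for (auto& t : pool) t.join();
    const auto t1 = std::chrono::steady_clock::now();
    checksum = 0;
    for (auto s : sums) checksum += s;
    return std::chrono::duration<double>(t1 - t0).count();
}
```

To use it, build a table much larger than the last-level cache, call timeWalkers with 1, 2 and 4 threads and the same steps value, and compare the times. If two walkers take about as long as one, the memory system is keeping up; if they take about twice as long or more, something is being contended for. The sketch deliberately does not say which result to expect. On some toolchains the thread library must be linked explicitly (for example -pthread with GCC).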

From: Joseph M. Newcomer on 22 Mar 2010 16:33

I have a NUMA machine, an AMD dual-chip dual-core (4-core) system running Win32 (Vista),
so if you need some tests run, email the code to me.

Remember when he wanted his data allocated in CONTIGUOUS PHYSICAL memory? He is really
clueless about how operating systems work, but won't listen to ANYONE whose ideas don't
match his preconceived notions about how the world should work to maximize his
convenience. Even if what we're trying to do is explain how reality works.

I wonder if he knows what TLB thrashing is?
joe
*****
On Mon, 22 Mar 2010 14:57:13 -0400, Hector Santos <sant9442(a)nospam.gmail.com> wrote:

>Joseph M. Newcomer wrote:
>
>>>>
>>>> He has been told that MMF can help him.
>>>>
>>>> --
>>>> HLS
>>> Since my process (currently) requires unpredictable access
>>> to far more memory than can fit into the largest cache, I
>>> see no possible way that adding 1000-fold slower disk access
>>> could possibly speed things up. This seems absurd to me.
>> ****
>> He has NO CLUE as to what a "memory-mapped file" actually is. This last comment indicates
>> total and complete cluelessness, plus a startling inability to understand that we are
>> making USEFUL suggestions because WE KNOW what is going on and he has no idea.
>
>
>What he doesn't realize is that his 4GB loading is already
>virtualized. He believes that all of that is in pure RAM. The page
>faults prove that point but he doesn't understand what that means.
>
>He doesn't realize that his PC is technically a VIRTUAL MACHINE! He
>doesn't understand the INTEL memory segmentation framework. Maybe he
>thinks it's DOS? That is why I said if he wants PURE RAM operations, he
>might be better off with a 16 bit DPMI DOS program or moving over to a
>MOTOROLA chip that will offer a linear memory model - if that is
>still true today.
>
>> Like you, I'm giving up.
>
>
>There are two parts:
>
>First, I'm actually exploring scaling methods with the simulator I
>wrote for him. I have a version where I am exploring NUMA that will
>leverage 2003+ Windows technology. I am going to pencil in getting a
>test computer with an Intel XEON that offers NUMA.
>
>Second, get some good will out of this if I can convince this guy that
>he needs to change his application to perform better. Or at least
>understand that his old memory usage paradigm for processes does not
>apply under Windows. The only reason I can suspect for his ignorance
>is that he is not a programmer or, at the very least, has a very
>primitive knowledge of programming. A real Windows programmer would
>understand these basic principles or at least explore what experts are
>saying. He is not even exploring anything!
>
>> I'm dropping out of this discussion.
>
>
>I should too.
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Peter Olcott on 22 Mar 2010 16:47

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:rnjfq5ls8fpma0kvrc6odhuvqfignso8m5(a)4ax.com...
> See below...
> On Mon, 22 Mar 2010 14:40:57 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>
>>
>>Try and explain exactly how cache can possibly help when
>>there is most often essentially no spatial or temporal
>>locality of reference.
>>
> ****
> While caches work well with locality of reference, that is
> just a heuristic for predicting
> cache effects. Locality of reference is not the point;
> maximizing cache hits is the
> point. And this can happen, particularly on a shared L3
> cache, based solely on the cache
> replacement algorithm. We use locality of reference as
> the "easy" approach to determining
> the likelihood of cache hits, because it is easy to
> analyze in applications that process
> regular data like matrices and arrays. But it is not the
> theoretical optimum approach. If
> you took the time to understand how caches work this would
> be obvious to you.
>
> Try to explain why you believe this when you have run no
> experiments that have any
> meaning. The difference is that I am saying YOU HAVE NO
> DATA, and you are saying I KNOW
> WHAT IS GOING TO HAPPEN, I DON'T NEED NO STINKIN' FACTS.
> I don't believe you really know

I do have data, and have presented this data many times, and you two
simply blow it off.
Two processes take 2.75 times as long as one process. What
could this mean besides resource contention?

You tell me all about page faults, yet the process monitor
reports zero page faults, and you continue to claim that it's
all about page faults and virtual memory. Page faults
indicate virtual memory usage, right? A lack of page faults
indicates a lack of virtual memory usage, right?

> what is going to happen, you are just guessing. I know
> what I would do: as an engineer
> (there's that nasty word again) I'd go out and GET the
> facts. Then, I could say "But I
> have run this experiment, and it substantiates my theory"
> and that would be useful
> knowledge. But you just blindly claim you "know" what is
> going to happen. I'm supposed
> to take this seriously from someone who doesn't even
> understand why a Memory Mapped File
> is going to give superior performance? Given your
> demonstrated lack of understanding of
> operating systems, why should I believe ANY assertion you
> make, unless you have the data
> to back it up? Hell, I wouldn't believe MY OWN theories
> about performance without data,
> and 15 years of performance measurement have convinced me
> of one absolute fact: "Ask a
> programmer where the performance bottleneck is in their
> code, and you will get a wrong
> answer". That rule NEVER failed me in 15 years of real
> performance measurement of real
> programs on real machines, and I believe it today. Bottom
> line: only actual performance
> data proves anything. Theories about where performance is
> going are universally wrong
> unless supported by actual measurements. You have a
> p-baked theory, for p considerably
> less than 0.5 (p==0.5 is half-baked), and you refuse to
> test your theory. Not a
> robust approach to building systems. You may be
> absolutely correct, but you cannot PROVE
> it without data.
> joe
> ****
>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm

From: Peter Olcott on 22 Mar 2010 16:52

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:ecdfq5lb57qrou47d1ppaupsi6t2guu7nv(a)4ax.com...
> See below...
>
> On Mon, 22 Mar 2010 10:31:17 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>
>>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in
>>message
>>news:%23Q4$1KdyKHA.404(a)TK2MSFTNGP02.phx.gbl...
>>> Joseph M. Newcomer wrote:
>>>
>>>
>>>> Note also if you use a memory-mapped file and two
>>>> processes share the same mapping object
>>>> there is only one copy of the data in memory! This has
>>>> not previously come up in discussions, but could be
>>>> critical to your performance of multiple processes.
>>>> joe
>>>
>>>
>>> He has been told that MMF can help him.
>>>
>>> --
>>> HLS
>>
>>Since my process (currently) requires unpredictable access
>>to far more memory than can fit into the largest cache, I
>>see no possible way that adding 1000-fold slower disk
>>access
>>could possibly speed things up. This seems absurd to me.
> ****
> He has NO CLUE as to what a "memory-mapped file" actually
> is. This last comment indicates

http://en.wikipedia.org/wiki/Memory-mapped_file
Apparently I do.

> total and complete cluelessness, plus a startling
> inability to understand that we are
> making USEFUL suggestions because WE KNOW what is going on
> and he has no idea.
>
> Like you, I'm giving up. There is only so long you can
> beat someone over the head with
> good ideas which they reject because they have no idea
> what you are talking about, but
> won't expend any energy to learn about, or ask questions
> about. Since he doesn't
> understand what shared sections are, or what they buy, and
> that a MMF is the way to get
> shared sections, I'm dropping out of this discussion. He
> has found a set of "experts" who
> agree with him (your example apparently doesn't convey the
> problem correctly), thinks
> memory-mapped files limit access to disk speed (not even
> understanding they are FASTER
> than ReadFile!) and has failed utterly to understand even
> the most basic concepts of an
> operating system (thinking it is like an automatic
> transmission, where you can use it
> without knowing or caring about how it works, when what he
> is really doing is trying to
> build a competition racing machine and saying "all that
> stuff about the engine is
> irrelevant", whereas anyone who does competition racing
> (like my next-door neighbor did
> for years) knows why all this stuff is critical). If he
> were a racer, and we told him
> about power-shifting (shifting a manual transmission
> without involving the clutch), he'd
> tell us he didn't need to understand that.
>
> Sad, really.
> joe
> ***
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm
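
For readers following the memory-mapped file argument, here is roughly what mapping a data file read-only looks like. This is a sketch under assumptions, not code from the thread: MappedFile is a hypothetical wrapper name, the Windows branch uses the real Win32 calls CreateFileA, GetFileSizeEx, CreateFileMappingA and MapViewOfFile, and the POSIX branch is there only so the example also builds on other systems. Once mapped, the data is read through an ordinary pointer and the OS pages it in on demand; processes that map the same file this way are generally backed by the same physical pages rather than each holding a private copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#ifdef _WIN32
#  include <windows.h>
#else
#  include <fcntl.h>
#  include <sys/mman.h>
#  include <sys/stat.h>
#  include <unistd.h>
#endif

// Hypothetical RAII wrapper: maps an existing, non-empty file read-only.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
#ifdef _WIN32
        file_ = CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ,
                            nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
                            nullptr);
        if (file_ == INVALID_HANDLE_VALUE) return;
        LARGE_INTEGER sz;
        if (!GetFileSizeEx(file_, &sz) || sz.QuadPart == 0) return;
        // Unnamed mapping object covering the whole file.
        mapping_ = CreateFileMappingA(file_, nullptr, PAGE_READONLY,
                                      0, 0, nullptr);
        if (!mapping_) return;
        data_ = MapViewOfFile(mapping_, FILE_MAP_READ, 0, 0, 0);
        if (data_) size_ = static_cast<std::size_t>(sz.QuadPart);
#else
        fd_ = ::open(path.c_str(), O_RDONLY);
        if (fd_ < 0) return;
        struct stat st;
        if (::fstat(fd_, &st) != 0 || st.st_size == 0) return;
        void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                         PROT_READ, MAP_SHARED, fd_, 0);
        if (p != MAP_FAILED) {
            data_ = p;
            size_ = static_cast<std::size_t>(st.st_size);
        }
#endif
    }

    ~MappedFile() {
#ifdef _WIN32
        if (data_) UnmapViewOfFile(data_);
        if (mapping_) CloseHandle(mapping_);
        if (file_ != INVALID_HANDLE_VALUE) CloseHandle(file_);
#else
        if (data_) ::munmap(data_, size_);
        if (fd_ >= 0) ::close(fd_);
#endif
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    bool ok() const { return data_ != nullptr; }
    std::size_t size() const { return size_; }
    const unsigned char* data() const {
        return static_cast<const unsigned char*>(data_);
    }

private:
    void* data_ = nullptr;
    std::size_t size_ = 0;
#ifdef _WIN32
    HANDLE file_ = INVALID_HANDLE_VALUE;
    HANDLE mapping_ = nullptr;
#else
    int fd_ = -1;
#endif
};
```

Typical use is to construct a MappedFile on the data file, check ok(), and then index data() like an in-memory array of size() bytes. Whether this beats loading the file into a private buffer for a given workload is exactly the kind of claim that needs a timing run to settle.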
