From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:OXYJ1RiyKHA.928(a)TK2MSFTNGP05.phx.gbl...
>
> Peter Olcott wrote:
>
>>>> Which is it (1) or (2) ??? Any hem hawing will be
>>>> taken as intentional deceit
>>>
>>> None of the above:
>>>
>>> Your process AT THAT MOMENT does not need to PAGE
>>> anything because it was already in your WORKING SET.
>>>
>>> Look, YOUR SIMPLE PROGRAM IS ALWAYS USING VIRTUALIZED
>>> MEMORY! ALWAYS!
>>
>> You just contradicted yourself, but, there was no
>> intentional deceit.
>> (1) It is ALWAYS using virtual memory
>> (2) There was an instance where it was not using virtual
>> memory
>
>
> I don't see that as a contradiction at all. Your process
> gets 4GB Virtual Memory. There was a moment in space and
> time that your program did not need the OS to page data
> into your WORKING SET which is the virtual memory in
> active use by your program.

It was not a moment in time; it was a twelve-hour period.
The only reason it ended was that I was convinced that
twelve hours was enough.

>
> For example, your program has allocated 2GB. It may look
> like your program has direct access to 2GB, and that's the
> overall idea, but it's virtualized. When you reference
> something in the 2GB that is not in the WORKING SET, you get
> a PAGE FAULT, which tells the system to fetch the data you
> need from the pagefile (pagefile.sys).
>
> Look it says it right here in MSDN:
>
> http://support.microsoft.com/kb/555223
>
> In modern operating systems, including Windows, application
> programs and many system processes *ALWAYS* reference memory
> using virtual memory addresses which are automatically
> translated to real (RAM) addresses by the hardware. Only
> core parts of the operating system kernel bypass this
> address translation and use real memory addresses directly.
>
> Virtual Memory is always in use, *EVEN* when the memory
> required by all running processes does not exceed the
> amount of RAM installed on the system.
>
> REPEAT THE FIRST SENTENCE IN EACH PARAGRAPH 1000 TIMES!
>
> How ignorant can you be? When did you start using
> Windows, or programming for it?
>
> --
> HLS

For all practical purposes virtual memory is not being used
(meaning that its use is not impacting performance) whenever
zero or very few page faults are occurring.


From: Hector Santos on
Peter Olcott wrote:

>> I don't see that as a contradiction at all. Your process
>> gets 4GB Virtual Memory. There was a moment in space and
>> time that your program did not need the OS to page data
>> into your WORKING SET which is the virtual memory in
>> active use by your program.
>
> It was not a moment in time it was a 12 hour time period.
> The only reason that It ended was that I was convinced that
> twelve hours was enough.


You know, you really need to stop this horse stuff of yours. It's
right there, straight from the horse's mouth. Why did you choose to
ignore this? I'll show it again:

http://support.microsoft.com/kb/555223

In modern operating systems, including Windows, application
programs and many system processes *ALWAYS* reference memory using
virtual memory addresses which are automatically translated to real
(RAM) addresses by the hardware. Only core parts of the operating
system kernel bypass this address translation and use real memory
addresses directly.

Virtual Memory is always in use, *EVEN* when the memory required
by all running processes does not exceed the amount of RAM
installed on the system.

WHY ARE YOU IGNORING THIS?

You got 8GB on your box and your process wants 4GB - it is STILL
virtualized no matter what you do or say.

The whole point of this thread is that you SAID you cannot run a 2nd
process because it kills your system.

Well of course, because now you need 8GB for 2 processes!

If you don't single-source the data as sharable memory, then YOU
WILL NEVER be able to run more than 1 process on your machine.

That's not because of the physical limitations of the machine. It's
because your PROGRAM is FLAWED and NOT designed for any kind of
scalability or usage beyond a single-process application - n'est-ce pas!

I illustrated and proved to you with posted code how to optimize the
process so that YOU can scale and leverage the power of your machine.

Right now you are utilizing it incorrectly! You really wasted your
money, and if you think the solution is to scale out, well, that's
your problem, because you are dumping money down the drain for
nothing.

--
HLS
From: Peter Olcott on

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:u0ggbTiyKHA.928(a)TK2MSFTNGP05.phx.gbl...
> Peter Olcott wrote:
>
>> And of course you know that a second thread would work
>> just fine because you know that my process is not memory
>> bandwidth intensive.
>
>
> yes, we know that. The simulator, real code with shared
> memory and multiple threads, proved this and if you took
> the time to explore it, you will see for yourself.
>
> --
> HLS

void Process()
{
    KIND num;
    for (int r = 0; r < repeat; r++)
        for (WORD i = 0; i < size; i++)
            num = data[i];
}

Not at all representative of my process, thus proves nothing
about my process. Your process could derive pure spatial
locality of reference whereas mine would not. I do not move
to the next sequential memory location, my memory access
(from the cache point of view) is nearly purely random. If
you had a list of 10,000 memory locations that are all very
far from each other, then your process would approximate
mine.

You might also look at the generated code; the optimizer
tends to eliminate code such as your test case. Also, you
can't simply disable the optimizer, because that could skew
the test from memory-intensive to CPU-intensive. Maybe you
could proceed through the list and swap the value of the
current item with the value of the preceding item, then loop
through again and again.


From: Peter Olcott on

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in
message news:l45gq55hlc3sn35e2q6vq1ur6dbvsqvqr5(a)4ax.com...
> See below...
>
> On Mon, 22 Mar 2010 16:59:34 -0500, "Peter Olcott"
> <NoSpam(a)OCR4Screen.com> wrote:
>
>>
>>"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in
>>message
>>news:%23F2oLmgyKHA.5360(a)TK2MSFTNGP06.phx.gbl...
>>> Peter Olcott wrote:
>>>
>>>> Joe kept insisting and continues to insist that my data
>>>> is not resident in memory.
>>>
>>>
>>> If you have a 32 bit Windows OS, you are limited to just
>>> 2GB RAW ACCESS and 4GB of VIRTUAL MEMORY.
>>
>>Yes, and that is another thing. I kept saying that I have
>>a
>>64bit OS, and Joe kept forming his replies in terms of a
>>32-bit OS.
> ****
> And how long did I keep saying "Unless you are running a
> Win32 process in Win64" but you
> did not clarify that you were running on Win64. So in the
> absence of any explicit
> statement I had to assume you were running in Win32.
> ****
>>
>>>
>>> If your process is loading 4GB, you are using virtual
>>> memory.
>>>
>>>> After loading my data and waiting twelve hours the
>>>> process monitor reports zero page faults, when I
>>>> execute
>>>> my process and run it to completion.
>>>
>>>
>>> You're lying, you told me you have PAGE FAULTS but it
> settled down to zero, which is NORMAL. But start a 2nd
>>> process and you will get page faults.
>>
>>I only get the page faults until the data is loaded. After
>>the data is loaded I get essentially no more page faults,
>>even after waiting twelve hours before running my process
>>to
>>completion. After proving that my data is resident in RAM
>>Joe continues to chide me for claiming that my data is
>>resident in RAM.
> ****
> If you used a memory-mapped file correctly, you would have
> very low-cost page faults
> because you would be mapping to existing pages. But you
> seem to not want to hear that
> memory-mapped files will improve performance, particularly
> in a multiple-process
> environment.
> joe
> ****

I don't want to hear about memory-mapped files because I
don't want to hear about optimizing virtual memory usage,
because I don't want to hear about virtual memory at all
until it is proven beyond all possible doubt that my process
is not (and cannot be made to be) resident in actual RAM all
the time.

Since a test showed that my process did remain in actual RAM
for at least twelve hours, this is sufficient evidence to
show that all of these lines of reasoning have, at least for
the moment, become completely moot. The only thing that
could make them less than completely moot would be proof
that my process cannot remain resident in RAM all the time.

>>
>>You guys just playing head games with me?
> ****
> We are trying to help you, in spite of your best efforts
> to tell us we are wrong. You
> insist that simplistic experiments which gave you a single
> data point give you a basis for
> extrapolating an entire family of performance information,
> and we are saying "You don't
> KNOW until you've MEASURED" and you insist that
> measurement is not relevant because you
> MUST be right. All I'm saying is that you MIGHT be right,
> and once you do the
> measurements, you might find out that you are completely
> WRONG, which works to your
> advantage. So run the damn experiment, already!
> joe
>
> ****
>>
>>>
>>> I also asked, now 5 times, to provide the MEMORY LOAD
>>> percentage which I even provided with a simple C program
>>> that you can compile, and you did not:
>>>
>>> // File: V:\bin\memload.cpp
>>>
>>> #include <stdio.h>
>>> #include <windows.h>
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>     MEMORYSTATUS ms;
>>>     ms.dwLength = sizeof(ms);
>>>     GlobalMemoryStatus(&ms);
>>>     printf("Memory Load: %d%%\n", ms.dwMemoryLoad);
>>>     return 0;
>>> }
>>>
>>> Why can't you even do that?
>>>
>>>> How does this not prove Joe is wrong (At least in the
>>>> specific instance of one execution of my process)?
>>>> (1) The process monitor is lying.
>>>> (2) Page faults do not measure virtual memory usage.
>>>
>>> There are now 4-5 participants in this thread who are
>>> telling you your thinking is wrong and lacks an
>>> understanding of Windows and the Intel hardware.
>>>
>>> Let's get a few more, like this guy with a somewhat
>>> layman description:
>>>
>>> http://blogs.sepago.de/helge/2008/01/09/windows-x64-all-the-same-yet-very-different-part-1/
>>>
>>> and the #1 guy at Microsoft today!
>>>
>>> http://blogs.technet.com/markrussinovich/archive/2008/07/21/3092070.aspx
>>>
>>> If you DEFY what Mark Russinovich is saying here, you
>>> are
>>> CRAZY!
>>>
>>> --
>>> HLS
>>
> Joseph M. Newcomer [MVP]
> email: newcomer(a)flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm


From: Hector Santos on
Peter Olcott wrote:

>>
>>> And of course you know that a second thread would work
>>> just fine because you know that my process is not memory
>>> bandwidth intensive.
>>
>> yes, we know that. The simulator, real code with shared
>> memory and multiple threads, proved this and if you took
>> the time to explore it, you will see for yourself.
>>
>
> void Process()
> {
> KIND num;
> for(int r = 0; r < repeat; r++)
> for (WORD i=0; i < size; i++)
> num = data[i];
> }
>
> Not at all representative of my process, thus proves nothing
> about my process.


This is the MAXIMUM MEMORY ACCESS rate you can ever reach. Your
application's memory access will be less stressful.

> Your process could derive pure spatial
> locality of reference whereas mine would not.


and I followed up with a RANDOM access memory access:

void Process()
{
    KIND num;
    for (int r = 0; r < repeat; r++)
        for (WORD i = 0; i < size; i++)
        {
            DWORD j = (rand() % size);
            num = data[j];
        }
}

and provided all the results on that to SHOW that randomness, which is
closer to your unpredictable memory access theory, produced better
results. I even gave you some tips on using Pareto's principle,
because I don't believe YOUR application is as unpredictable as YOU
seem to think it is.

> I do not move
> to the next sequential memory location, my memory access
> (from the cache point of view) is nearly purely random.


See above. Again, the serialized access simulation represents the
worst-case scenario, and it contradicts your theory that there is a
major bottleneck from memory access contention with multiple threads.

> If you had a list of 10,000 memory locations that are all very
> far from each other, then your process would approximate
> mine.


The simulator used an array of MAXULONG/6 DWORD (4-byte) items,
~1.4GB on a 2GB machine, which is ~75% of memory capacity; you only
have a 50% memory need - FOR 1 PROCESS. So this simulator is a FAR
worse case than yours for MEMORY ACCESS.

> You might also look at the generated code, the optimizer
> tends to eliminate code such as your test case.


Not the case here, and EVEN THEN, there are still 10 loops over
MAXULONG/6 items accessed.

The FACT is, it is being read, because the MEMORY LOAD and the
working set increase.

The bottom line is that the code shows your process is scalable when
coded properly to leverage the technology in the Windows OS with
multi-core hardware.

Your design presumption that it is memory-bound for multi-threaded
processing was incorrect.

--
HLS