From: Stephen Fuld on
On 4/5/2010 9:19 AM, nmm1(a)cam.ac.uk wrote:
> In article<hpd162$n2s$1(a)news.eternal-september.org>,
> Stephen Fuld<SFuld(a)Alumni.cmu.edu.invalid> wrote:
>>
>> To your list of advantages of paging, I would add the elimination of
>> memory fragmentation.
>
> Eh? Not merely does it not eliminate it, it can make it worse!
> The dogma that page tables must not be visible to the programmer
> has made a lot of memory tuning nigh-on impossible for as long as
> demand paging has been around.
>
> As the Atlas/Titan showed, virtual memory without demand paging
> eliminates memory fragmentation just as well as demand paging does.

I didn't know anything about the Atlas/Titan, so I looked it up in
Wikipedia. The article at

http://en.wikipedia.org/wiki/Titan_%28computer%29

clearly states that the Titan used real memory, as opposed to
virtual or paged memory.

Did they get it wrong?

Let me be clear about what I mean by memory fragmentation. If you break
a program into 4K-byte chunks, then as long as the physical memory is a
multiple of 4K in size, you can always fit X full pages in that memory,
where X is, of course, the memory size divided by 4K. If you have
variable-sized chunks, then you can get into the situation where the
total amount of memory available is sufficient to load a program, but
it is broken up into multiple non-contiguous pieces, and thus there is
insufficient contiguous space to load another program. This leaves you
with the alternatives of not fully utilizing the memory or relocating
programs by some means to make the available space contiguous. It is in
this sense that fixed-size pages eliminate the potential fragmentation
of variable-sized segments.
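
To make this concrete, here is a toy sketch of my own (not any real
allocator's code; all sizes are made up for illustration, in KB). 48K
of a 64K memory is free, but split into two holes, so a 32K program
cannot be loaded contiguously, while eight 4K page frames are found
trivially:

    # Two 24K holes remain after two 24K programs exited around a
    # surviving 16K one; all sizes in KB.
    free_holes = [24, 24]
    want = 32                  # the next program needs 32K

    # Variable-sized (contiguous) loading needs one hole >= 32K.
    contiguous_ok = any(hole >= want for hole in free_holes)

    # Fixed 4K pages: only the total count of free frames matters.
    free_frames = sum(hole // 4 for hole in free_holes)
    paged_ok = free_frames >= want // 4

    print(contiguous_ok)       # False: 48K free, no 32K contiguous hole
    print(paged_ok)            # True: 12 free frames >= 8 needed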



--
- Stephen Fuld
(e-mail address disguised to prevent spam)
From: nmm1 on
In article <hpdgmn$d7j$1(a)news.eternal-september.org>,
Stephen Fuld <SFuld(a)Alumni.cmu.edu.invalid> wrote:
>>>
>>> To your list of advantages of paging, I would add the elimination of
>>> memory fragmentation.
>>
>> Eh? Not merely does it not eliminate it, it can make it worse!
>> The dogma that page tables must not be visible to the programmer
>> has made a lot of memory tuning nigh-on impossible for as long as
>> demand paging has been around.
>>
>> As the Atlas/Titan showed, virtual memory without demand paging
>> eliminates memory fragmentation just as well as demand paging does.
>
>I didn't know anything about the Atlas/Titan, so I looked it up in
>Wikipedia. The article at
>
>http://en.wikipedia.org/wiki/Titan_%28computer%29
>
>clearly states that the Titan used real memory, as opposed to
>virtual or paged memory.
>
>Did they get it wrong?

Yes. I was merely a user of it, so am repeating what I was told.

>Let me be clear about what I mean by memory fragmentation. If you break
>a program into 4K-byte chunks, then as long as the physical memory is a
>multiple of 4K in size, you can always fit X full pages in that memory,
>where X is, of course, the memory size divided by 4K. If you have
>variable-sized chunks, then you can get into the situation where the
>total amount of memory available is sufficient to load a program, but
>it is broken up into multiple non-contiguous pieces, and thus there is
>insufficient contiguous space to load another program. This leaves you
>with the alternatives of not fully utilizing the memory or relocating
>programs by some means to make the available space contiguous. It is in
>this sense that fixed-size pages eliminate the potential fragmentation
>of variable-sized segments.

Yes, that's what I understood you to mean.

The approach taken was that each program was given a contiguous
section of real memory, but could access it only using the virtual
address. That obviously eliminates memory fragmentation just as
effectively as demand paging.
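
The mechanism can be sketched in a few lines (my reconstruction of the
idea, not actual Atlas/Titan behaviour): the program addresses from
zero, the hardware adds a per-program base, and compaction is just a
copy plus a change of base, invisible to the program:

    # Base-and-limit relocation sketch; all addresses are illustrative.
    class Region:
        def __init__(self, base, size):
            self.base = base          # start of contiguous real memory
            self.size = size          # length of the program's region

        def translate(self, vaddr):
            if vaddr >= self.size:
                raise MemoryError("address outside allocated region")
            return self.base + vaddr

    r = Region(base=0x4000, size=0x1000)
    assert r.translate(0x10) == 0x4010
    # Compaction: the OS copies the region, updates the base, and every
    # virtual address the program uses keeps working unchanged.
    r.base = 0x8000
    assert r.translate(0x10) == 0x8010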

As with compacting garbage collectors, and arbitrary-sized vectors
versus lists, you are exchanging the infrequent need for expensive
compaction for the frequent use of non-local and non-systematic
memory locations. And the latter form of fragmentation can be very,
very bad news for performance.


Regards,
Nick Maclaren.
From: "Andy "Krazy" Glew" on
On 4/4/2010 6:08 PM, Robert Myers wrote:
> Andy "Krazy" Glew wrote:
>
>> ... you must remember that the guys doing the hardware (and, more
>> importantly, microcode and firmware, which is just another form of
>> software) for such multilevel memory systems have the same ideas and
>> read the same papers.
>
> So is the lesson: If you're not IBM or Intel, why waste time talking
> about it? AMD is too busy playing catch-up with two-for-one deals (no
> mention of it not being a *real* twelve core die). License Intel's Atom IP?
>
> Robert.

No lesson. Just musing.

If you are Microsoft, or some other company that controls much of the OS and software (e.g. Oracle/Sun, IBM?), you may
consider doing paging/swapping inside the OS/SW stack. As now. You will have the benefit of a software-aware
paging/swapping algorithm; you will have the deficits, in performance and organizational coupling, mentioned earlier.

If you are a hardware company, paging is attractive. Sure, maybe you want to talk about hints to promote more efficient
SW/OS use.

Which approach will win? I don't know. I just know what approach I would work on if I were employed by a HW company,
and what I would work on if I were employed by a SW/OS company.

And my meta-observation has been that in the last 25 years, approaches that lead to organizational decoupling - that
allow the hardware team to innovate independently of the SW team - seem to have predominated.
From: Anne & Lynn Wheeler on

anton(a)mips.complang.tuwien.ac.at (Anton Ertl) writes:
> The first machine with virtual memory was the Atlas, where a
> fixed-point register add took 1.59 microseconds according to
> <http://en.wikipedia.org/wiki/Atlas_Computer>. The page does not tell
> how fast the drum store (used for secondary storage) was, but the
> drums were bigger and rotated slower than today's disk drives. E.g.,
> <http://en.wikipedia.org/wiki/IBM_System/360> mentions the IBM 2301,
> which had 3500rpm. If the (earlier) Atlas drum rotated at the same
> speed, we get a rotational latency of 8571 microseconds, i.e., 5391
> instructions. To that add the time to read in the page (a full
> rotation would be 10782 instructions, but maybe they could store more
> than one page per track, and thus have faster read times). Later
> mainframe paging probably had similar speed ratios.

as recently mentioned in a different mailing list/thread ....
http://www.garlic.com/~lynn/2010g.html#22 Mainframe Executive article on the death of tape

original cp67 just did single page transfer per i/o ... so 2301 would
saturate at about 80 page/sec. i redid several things in cp67
.... including chaining multiple page transfers in single i/o
.... chaining in rotational order. this resulted in still half rotational
delay per i/o ... but tended to be amortized over several page
transfers. this resulted in being able to drive 2301 up to nearly 300
page transfers per second (each i/o took longer ... but the queue delay
was significantly reduced under heavy load ... since it had almost four
times the peak thruput).

as an aside ... 2301 drum and 2303 drum were very similar ... except
2301 would read/write four heads in parallel ... getting four times the
data transfer rate (and track sizes were four times larger ... but only
1/4 the number of tracks).

so the old/original service time/latency was essentially 1/80th second
(about 13ms). with chained requests ... the per-i/o service time
increases because each i/o moves multiple pages ... but avg. service
time per page becomes approx. 1/300th second (about 3ms).
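
for anyone checking the arithmetic, a quick back-of-envelope sketch
(my own numbers: 3500rpm comes from the quoted post, and 5 pages per
track is an assumption based on the roughly 20K 2301 track size):

    # Back-of-envelope check of the 2301 paging figures above.
    rpm = 3500
    rotation_ms = 60000.0 / rpm          # ~17.1 ms per revolution
    pages_per_track = 5                  # assumed: ~20K track / 4K pages

    # Single page per i/o: average half-rotation latency plus one
    # page's transfer time.
    per_page_ms = rotation_ms / 2 + rotation_ms / pages_per_track
    print(1000 / per_page_ms)            # ~83 pages/sec -- "about 80"

    # Chained in rotational order, under heavy load the drum transfers
    # almost continuously, so throughput approaches a full track per
    # revolution.
    print(pages_per_track * 1000 / rotation_ms)  # ~292 -- "nearly 300"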

--
42yrs virtualization experience (since Jan68), online at home since Mar1970
From: EricP on
Anne & Lynn Wheeler wrote:
>
> original cp67 just did single page transfer per i/o ... so 2301 would
> saturate at about 80 page/sec. i redid several things in cp67
> .... including chaining multiple page transfers in single i/o
> .... chaining in rotational order. this resulted in still half rotational
> delay per i/o ... but tended to be amortized over several page
> transfers. this resulted in being able to drive 2301 up to nearly 300
> page transfers per second (each i/o took longer ... but the queue delay
> was significantly reduced under heavy load ... since it had almost four
> times the peak thruput).

I vaguely recall someone telling me that 370 VM had a page file
defragger process that would coalesce pages for long-running
processes so they were contiguous in the page file,
to facilitate multi-page read-ahead without multiple seeks.

Does any of that sound familiar to you?

eric