From: Andy Glew "newsgroup at on
On 8/5/2010 1:12 AM, Nick Maclaren wrote:
> In article<DcmdnTlSne9be8TRnZ2dnUVZ_hGdnZ2d(a)giganews.com>,
> Andy Glew<"newsgroup at comp-arch.net"> wrote:
>>
>> I'm sure that somebody has beaten me to this, but, let me point out that
>> this is NOT a performance problem caused by page faults.
>>
>> It is a performance problem caused by TLB misses.
>>
>> Page faults should be a much smaller performance problem. To a first
>> order, paging from disk almost never happens, except as part of program
>> startup or cold misses to a DLL. Probably the more common form of page
>> fault occurs with OS mechanisms such as COW, Copy-On-Write.
>
> I have said this before, but I use the term in its traditional sense,
> and TLB misses ARE page faults - just not ones that need data reading
> from disk! They are handled by the same interrupt mechanism, for a
> start. I agree with the facts of what you say, of course.

No they aren't.

On Intel P6-family machines - i.e. the majority of computers >= laptops
and desktops and servers (can't say majority of machines) - TLB misses
are handled by a hardware state machine that walks the page tables
during OOO execution. They don't drain the pipeline. They have OOO
semantics.

Page faults drain the pipeline, change flow of control.

TLB misses are cheap. Page faults are expensive.

(At least, TLB misses can be cheaply overlapped, like some cache misses
can be.)
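
To make the cost contrast concrete, here is a rough C sketch of the work
such a walker does on a TLB miss (an x86-64-style 4-level walk; the
read_phys() and raise_page_fault() helpers are illustrative stand-ins,
and large pages, permission bits, and accessed/dirty updates are omitted):

#include <stdint.h>

#define ADDR_BITS 0x000ffffffffff000ULL           /* PTE bits 51:12 */

extern uint64_t read_phys(uint64_t paddr);        /* assumed 8-byte physical read */
extern void     raise_page_fault(uint64_t vaddr); /* the expensive, draining path */

uint64_t tlb_miss_walk(uint64_t cr3, uint64_t vaddr)
{
    uint64_t entry = cr3;
    static const int shift[4] = { 39, 30, 21, 12 };

    for (int level = 0; level < 4; level++) {
        uint64_t index = (vaddr >> shift[level]) & 0x1ff;
        entry = read_phys((entry & ADDR_BITS) + 8 * index);
        if (!(entry & 1)) {              /* present bit clear: only here does   */
            raise_page_fault(vaddr);     /* it escalate into a real page fault  */
            return 0;
        }
    }
    /* translation found: this is what gets filled into the TLB */
    return (entry & ADDR_BITS) | (vaddr & 0xfff);
}

It is only a handful of dependent loads, which is why it can proceed
under the out-of-order machinery much like other cache-missing loads.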



(Of course, I have thought about ways to make page faults, even
involving "disk" I/O (more like SSD) fast, using the same mechanisms as
we used for TLB misses.)

--

Oh, and by the way: page faults usually do not involve disk I/O. Page
faults are much more often used for permissions manipulation, COW.


> But there is another, and more fundamental, reason not to separate
> them. It leads people to think incorrectly about the designs where
> there are multiple levels of such things. And, if I read the tea-
> leaves correctly, they may be coming back.

This, I agree with.

You *can* separate TLB misses from page faults. You can use different
mechanisms, with different performance.

You can implement TLB misses with page fault like mechanisms.

You can also implement page faults using TLB-miss like mechanisms. Like
having the page fault be handled by a different thread, without blocking
the thread that is taking the page fault.
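
As a present-day software analogy of that last idea, Linux's userfaultfd
has this shape: the fault is delivered to a different user-level thread,
which supplies the page while the faulting thread waits in the kernel.
A pared-down sketch, with all error handling omitted and the region size
and fill pattern chosen arbitrarily for the example:

#define _GNU_SOURCE
#include <linux/userfaultfd.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

static long page;
static int  uffd;

/* Runs in a separate thread: it is handed the faults and supplies pages,
   while the faulting thread simply waits inside the kernel. */
static void *fault_server(void *arg)
{
    (void)arg;
    char *fill = mmap(NULL, page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    memset(fill, 'x', page);
    for (;;) {
        struct uffd_msg msg;
        if (read(uffd, &msg, sizeof msg) <= 0)     /* block until a fault arrives */
            break;
        if (msg.event != UFFD_EVENT_PAGEFAULT)
            continue;
        struct uffdio_copy copy = {
            .dst = msg.arg.pagefault.address & ~(page - 1),
            .src = (unsigned long)fill, .len = page, .mode = 0,
        };
        ioctl(uffd, UFFDIO_COPY, &copy);   /* resolves the other thread's fault */
    }
    return NULL;
}

int main(void)
{
    page = sysconf(_SC_PAGESIZE);
    uffd = syscall(SYS_userfaultfd, O_CLOEXEC);
    struct uffdio_api api = { .api = UFFD_API };
    ioctl(uffd, UFFDIO_API, &api);

    char *region = mmap(NULL, 16 * page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct uffdio_register reg = {
        .range = { .start = (unsigned long)region, .len = 16 * page },
        .mode  = UFFDIO_REGISTER_MODE_MISSING,
    };
    ioctl(uffd, UFFDIO_REGISTER, &reg);

    pthread_t t;
    pthread_create(&t, NULL, fault_server, NULL);

    for (int i = 0; i < 16; i++)          /* first touches fault; the server */
        printf("%c", region[i * page]);   /* thread fills each page with 'x' */
    printf("\n");
    return 0;
}

The point is only the shape: the fault is resolved by a thread other than
the one that took it.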


> Let's ignore the issue that there were once systems with page tables
> and no TLBs - and the original TLBs were nothing but caches for the
> page tables. The horse and cart has now been replaced by a motorised
> horse-box, of course.

Actually, this isn't necessarily true, as I learned to my surprise.

On the American side of the Atlantic, in particular on the IBM side, it
appears that TLBs, specifically CAMs, may have come before page tables.

A historical trajectory went something like this:

* people realized that you could build CAMs, and realized that this
could allow them to remap memory

* but then they realized that CAMs had finite size, and that they needed
to build data structures that were larger than could fit in CAMs

* so they first built segment tables (Brit style), or page tables

* and the CAMs were hidden as TLBs and caches.

I was at a BOF at OScon (okay, I organized a 5 person BOF on
capabilities at OScon) where I mentioned CAMs in passing, and a famous
software researcher said that he had not realized that hardware CAMs
were still in use.



>
> But there were a fair number of systems like this: (a) a page could be
> in the TLB; or (b) it could be in main memory with a page table entry,
> so all that was needed was the TLB reloading; (c) or it could be in
> secondary memory, which needed a copy, the page table updating, and
> the TLB reloading; or (d) it could be on disk, which needed a call to
> the swapper/pager.

This is actually the majority of modern systems, at least letters (a),
(b), and (d). (I added the letters to refer to Nick's post.)

Secondary memory (c) is not that common, but there are several examples.
Sometimes the copy involves decompression, sometimes remapping of page
tables (hopefully without an expensive TLB shootdown of invalid PTEs).

(c) and (d) are really the same, except
1) sometimes the action is fast enough to be done synchronously,
sometimes slow enough to require blocking the current process and
switching to others
2) sometimes the action is done by hardware or microcode, without
giving control to the OS.
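
To keep those cases straight, the whole decision flow can be written out
as a C-flavored sketch (every helper name below is invented for the
sketch, not any particular OS's or CPU's interface):

#include <stdint.h>
#include <stdbool.h>

typedef uint64_t pte_t;

extern bool   tlb_lookup(uint64_t vaddr, uint64_t *paddr);
extern pte_t  walk_page_tables(uint64_t vaddr);
extern bool   pte_present(pte_t pte);
extern bool   pte_in_secondary(pte_t pte);          /* e.g. a compressed pool   */
extern void   tlb_refill(uint64_t vaddr, pte_t pte);
extern void   copy_in_and_map(uint64_t vaddr);      /* (c): copy/decompress     */
extern void   read_from_disk_and_block(uint64_t vaddr);  /* (d): swapper/pager  */

uint64_t translate(uint64_t vaddr)
{
    for (;;) {
        uint64_t paddr;
        if (tlb_lookup(vaddr, &paddr))       /* (a) TLB hit: nothing to do      */
            return paddr;

        pte_t pte = walk_page_tables(vaddr);
        if (pte_present(pte)) {              /* (b) in memory: just reload TLB  */
            tlb_refill(vaddr, pte);
            continue;
        }
        if (pte_in_secondary(pte))           /* (c) fast enough to do inline    */
            copy_in_and_map(vaddr);
        else                                 /* (d) slow: block this process,   */
            read_from_disk_and_block(vaddr); /*     switch to another           */
        /* loop and retry once the page is resident */
    }
}

Which pieces of that loop end up in hardware, microcode, or OS software
is exactly the design choice at issue.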

From: Nick Maclaren on

In article <yKqdnS-sCu9VIsfRnZ2dnUVZ_qadnZ2d(a)giganews.com>,
Andy Glew <"newsgroup at comp-arch.net"> wrote:
>
>> I have said this before, but I use the term in its traditional sense,
>> and TLB misses ARE page faults - just not ones that need data reading
>> from disk! They are handled by the same interrupt mechanism, for a
>> start. I agree with the facts of what you say, of course.
>
>No they aren't.

It's a matter of terminology.

>On Intel P6-family machines - i.e. the majority of computers >= laptops
>and desktops and servers (can't say majority of machines) - TLB misses
>are handled by a hardware state machine that walks the page tables
>during OOO execution. They don't drain the pipeline. They have OOO
>semantics.

Interesting. Intel is following IBM, then. Many of the other
architectures I have used have not done that.

>(Of course, I have thought about ways to make page faults, even
>involving "disk" I/O (more like SSD) fast, using the same mechanisms as
>we used for TLB misses.)

It's certainly been done.

>Oh, and by the way: page faults usually do not involve disk I/O. Page
>faults are much more often used for permissions manipulation, COW.

A fair cop. Things have changed.

>You *can* separate TLB misses from page faults. You can use different
>mechanisms, with different performance.

Agreed. And it seems that Intel now does.

>On the American side of the Atlantic, in particular IBM side, it appears
>that TLBs, specifically CAMs, may have come before page tables.

I would need to do further research to find out what the
order was; certainly, most of the early action was this side of the
pond. The Atlas used something that was neither TLBs nor page
tables.

>> But there were a fair number of systems like this: (a) a page could be
>> in the TLB; or (b) it could be in main memory with a page table entry,
>> so all that was needed was the TLB reloading; (c) or it could be in
>> secondary memory, which needed a copy, the page table updating, and
>> the TLB reloading; or (d) it could be on disk, which needed a call to
>> the swapper/pager.
>
>This is actually the majority of modern systems, at least letters (a),
>(b), and (d). (I added the letters to refer to Nick's post.)

Yes.

>(c) and (d) are really the same, except
> 1) sometimes the action is fast enough to be done synchronously,
>sometimes slow enough to require blocking the current process and
>switching to others
> 2) sometimes the action is done by hardware or microcode, without
>giving control to the OS.

Yes and no. For reading, I agree in principle that those are the
distinguishing characteristics, but there is another and even more
important one for writing. Paging onto disk very often needs the
resource limits checking, often with extra signalling, and even with
callbacks into user code.


Regards,
Nick Maclaren.
From: Andy Glew "newsgroup at on
On 8/5/2010 7:51 AM, Nick Maclaren wrote:
> In article<yKqdnS-sCu9VIsfRnZ2dnUVZ_qadnZ2d(a)giganews.com>,
> Andy Glew<"newsgroup at comp-arch.net"> wrote:
>>
>>> I have said this before, but I use the term in its traditional sense,
>>> and TLB misses ARE page faults - just not ones that need data reading
>>> from disk! They are handled by the same interrupt mechanism, for a
>>> start. I agree with the facts of what you say, of course.
>>
>> No they aren't.
>
> It's a matter of terminology.
>
>> On Intel P6-family machines - i.e. the majority of computers >= laptops
>> and desktops and servers (can't say majority of machines) - TLB misses
>> are handled by a hardware state machine that walks the page tables
>> during OOO execution. They don't drain the pipeline. They have OOO
>> semantics.
>
> Interesting. Intel is following IBM, then. Many of the other
> architectures I have used have not done that.

Since P6, circa 1996.

From: MitchAlsup on
On Aug 5, 9:51 am, n...(a)gosset.csi.cam.ac.uk (Nick Maclaren) wrote:
> In article <yKqdnS-sCu9VIsfRnZ2dnUVZ_qadn...(a)giganews.com>,
> Andy Glew  <"newsgroup at comp-arch.net"> wrote:

> >On Intel P6-family machines - i.e. the majority of computers >= laptops
> >and desktops and servers (can't say majority of machines) - TLB misses
> >are handled by a hardware state machine that walks the page tables
> >during OOO execution.  They don't drain the pipeline.  They have OOO
> >semantics.
>
> Interesting.  Intel is following IBM, then.  Many of the other
> architectures I have used have not done that.

Do ANY OoO machines with hardware TLB reloaders DRAIN the pipeline to
refill a TLB?

{None of mine ever did, nor did I see any reason to do so. In fact, I
typically used the L1 Cache refill mechanism to cause the data path to
spit up the same logical address so I could finish reloading the TLB.
A drained pipe would not have left this address in a convenient
place.}

Mitch
From: Nick Maclaren on
In article <94ad8368-5fbd-4c8a-a016-492a2a6ab11f(a)f42g2000yqn.googlegroups.com>,
MitchAlsup <MitchAlsup(a)aol.com> wrote:
>
>> >On Intel P6-family machines - i.e. the majority of computers >= laptops
>> >and desktops and servers (can't say majority of machines) - TLB misses
>> >are handled by a hardware state machine that walks the page tables
>> >during OOO execution.  They don't drain the pipeline.  They have OOO
>> >semantics.
>>
>> Interesting.  Intel is following IBM, then.  Many of the other
>> architectures I have used have not done that.
>
>Do ANY OoO machines with hardware TLB reloaders DRAIN the pipeline to
>refill a TLB?

I doubt it - that would be plain bonkers! But quite a few machines
simply raised an interrupt on a TLB miss and left the rest to the
software. As you will remember, that was dogma in the early days
of the new RISC architectures!
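
The refill handler on such a machine is tiny - roughly this, rendered in
C (the flat page table, the tlb_write() primitive, and the escalation
path are illustrative assumptions, not any specific machine's interface):

#include <stdint.h>

#define PAGE_SHIFT 12
#define VPN(va)    ((va) >> PAGE_SHIFT)
#define PTE_VALID  0x1u

extern uint32_t page_table[];                       /* assumed flat, one PTE per page */
extern void tlb_write(uint32_t vpn, uint32_t pte);  /* assumed CPU primitive          */
extern void handle_page_fault(uint32_t vaddr);      /* the real (slow) fault path     */

void tlb_refill_trap(uint32_t bad_vaddr)
{
    uint32_t pte = page_table[VPN(bad_vaddr)];
    if (!(pte & PTE_VALID)) {
        handle_page_fault(bad_vaddr);    /* only now does it become a page fault */
        return;
    }
    tlb_write(VPN(bad_vaddr), pte);      /* drop the mapping into the TLB        */
    /* return from exception; the faulting access is retried */
}

Cheap per miss, but each miss still costs a trap into and out of the
kernel, which is where the timing concerns below come from.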

I can no longer remember the results of my timing tests, but I had
some intended to check that systems weren't TOO catastrophic on
such things. I can remember quite a lot being dire, including, if
I recall, post-P6 Intel systems. But it's a while ago now, and I
may well be misremembering - my tests analysed a couple of dozen
aspects, and a lot I just noted as "we can live with that".
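
The general shape of such a test is easy to sketch (this is only an
illustration, not the actual tests referred to above; page size, buffer
size, and repetition count are arbitrary): stride through a large buffer
touching one byte per page, so nearly every access needs a translation
that is unlikely still to be in the TLB.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define PAGE   4096
#define NPAGES 65536                     /* 256 MB of address space */
#define REPS   20

int main(void)
{
    volatile char *buf = malloc((size_t)NPAGES * PAGE);
    if (!buf)
        return 1;
    for (size_t i = 0; i < (size_t)NPAGES * PAGE; i += PAGE)
        buf[i] = 1;                      /* fault every page in up front */

    struct timespec t0, t1;
    unsigned long sum = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int r = 0; r < REPS; r++)
        for (size_t i = 0; i < (size_t)NPAGES * PAGE; i += PAGE)
            sum += buf[i];               /* one access per page: mostly TLB misses */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%.1f ns per page-stride access (sum=%lu)\n",
           ns / ((double)REPS * NPAGES), sum);
    return 0;
}

Running the same loop over a buffer small enough for its translations to
stay resident in the TLB gives a baseline, and the difference is a crude
per-miss cost.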


Regards,
Nick Maclaren.