From: Terje Mathisen "terje.mathisen at on
Eric Northup wrote:
> On Aug 10, 3:07 pm, EricP<ThatWouldBeTell...(a)thevillage.com> wrote:
>> Since an x86 can touch 4 pages, 2 instruction, 2 data, in a single
>> instruction, that is the lower limit for a unified set associative TLB.
>> Or you can split it into ITLB and DTLB and use 2 way assoc. in each.
>
> The worst case x86 instruction I know of actually takes 6 or 9 pages
> mapped simultaneously to make forward progress. A movsd instruction
> (using some prefixes, to bulk the instruction to be multi-byte) can be
> arranged with each of the instruction pointer, ESI (source) and EDI
> (destination) misaligned such that all three span page boundaries.

OK.

> That takes you to 6 pages; I think you have to abuse segmentation from
> 16 bit protected mode to go to 9 - use segments which have limit=64KB,
> and base addresses which are one byte below the beginning of a 4KB
> page, and arrange for CS:EIP, DS:ESI, and ES:EDI to all point two
> bytes below their segment limits. This way, the $(segment):0xFFFE and
> $(segment):0xFFFF bytes live on distinct pages, and you get a 3rd
> page / pointer when you wrap around to $(segment):0x0000.

That one is simple: "Don't do that!"

Intentionally misaligning segments so as to start on (very) odd
addresses seems like something nobody would ever do.

The only "mainstream" 16-bit protected mode OS for x86 I know about
would be early OS/2 versions, and they always (afair) maintained at
least 16-byte alignment, so as to stay compatible with the realmode
4-bit shift of segment addresses.
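
As a rough check of the page counts discussed above, here is a minimal Python sketch. It only does address arithmetic: it assumes offsets wrap modulo 64KB, as the quoted scenario describes, and it does not model segment limit checks or whether a real CPU would let such an access complete. The base addresses, the 4-byte instruction length, and the helper name pages_touched are all made up for illustration.

```python
PAGE_SIZE = 4096      # 4KB pages, as in the thread
SEG_WRAP = 0x10000    # assumed: offsets wrap modulo 64KB inside a 16-bit segment

def pages_touched(base, offset, length, wrap=None):
    """Return the set of page numbers covered by `length` bytes at base+offset.

    If `wrap` is given, the offset wraps modulo `wrap` before the base is
    added, which is the behaviour the 9-page scenario relies on.
    """
    pages = set()
    for i in range(length):
        off = offset + i
        if wrap is not None:
            off %= wrap
        pages.add((base + off) // PAGE_SIZE)
    return pages

# Flat case: instruction bytes, source dword and destination dword each
# start two bytes before a page boundary (hypothetical addresses).
flat_pages = (pages_touched(0x1FFE, 0, 4)
              | pages_touched(0x5FFE, 0, 4)
              | pages_touched(0x9FFE, 0, 4))

# Segmented case: three segments whose bases sit one byte below a page
# boundary, each accessed at offset 0xFFFE (hypothetical bases).
odd_pages = set()
for base in (0x10FFF, 0x30FFF, 0x50FFF):
    odd_pages |= pages_touched(base, 0xFFFE, 4, SEG_WRAP)

# Same access pattern, but with every 16-byte-aligned base in one 4KB range.
worst_aligned = max(len(pages_touched(base, 0xFFFE, 4, SEG_WRAP))
                    for base in range(0x10000, 0x11000, 16))

print(len(flat_pages), len(odd_pages), worst_aligned)
```

Under these assumptions the odd bases reach 9 pages, while a 16-byte-aligned base never needs more than 2 pages per pointer, because offsets 0xFFFE/0xFFFF and 0x0000/0x0001 then each stay within a single page.
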

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: robertwessel2 on
On Aug 11, 3:24 am, Terje Mathisen <"terje.mathisen at tmsw.no">
wrote:
> Eric Northup wrote:
> > On Aug 10, 3:07 pm, EricP<ThatWouldBeTell...(a)thevillage.com>  wrote:
> >> Since an x86 can touch 4 pages, 2 instruction, 2 data, in a single
> >> instruction, that is the lower limit for a unified set associative TLB.
> >> Or you can split it into ITLB and DTLB and use 2 way assoc. in each.
>
> > The worst case x86 instruction I know of actually takes 6 or 9 pages
> > mapped simultaneously to make forward progress.   A movsd instruction
> > (using some prefixes, to bulk the instruction to be multi-byte) can be
> > arranged with each of the instruction pointer, ESI (source) and EDI
> > (destination) misaligned such that all three span page boundaries.
>
> OK.
>
> > That takes you to 6 pages; I think you have to abuse segmentation from
> > 16 bit protected mode to go to 9 - use segments which have limit=64KB,
> > and base addresses which are one byte below the beginning of a 4KB
> > page, and arrange for CS:EIP, DS:ESI, and ES:EDI to all point two
> > bytes below their segment limits.  This way, the $(segment):0xFFFE and
> > $(segment):0xFFFF bytes live on distinct pages, and you get a 3rd
> > page / pointer when you wrap around to $(segment):0x0000.
>
> That one is simple: "Don't do that!"
>
> Intentionally misaligning segments so as to start on (very) odd
> addresses seems like something nobody would ever do.
>
> The only "mainstream" 16-bit protected mode OS for x86 I know about
> would be early OS/2 versions, and they always (afair) maintained at
> least 16-byte alignment, so as to stay compatible with the realmode
> 4-bit shift of segment addresses.


Err... Windows 3.x? And Xenix, and a couple of other Unix ports to
the 286 existed. NetWare 2.x also ran 16 bit segmented mode
(excepting some of the earliest versions which ran in real mode - 2.0
only, IIRC).

At least Windows and Xenix also maintained 16 byte alignment. I'd be
surprised if the others didn't as well.

Of course the pure 16 bit versions of those didn't support paging,
which makes the TLB miss issue rather moot. Windows 3.x running in
"386 Enhanced" mode did support paging in the underlying 386 DOS
extender, and ran the (basically unchanged) 16 bit (p-mode) Windows
kernel on top of that.

For the life of me I can't remember if Windows 3.x in "Standard" mode
(286 protected mode) supported virtual memory (via 286 segment
swapping) or not, but OS/2 1.x and Xenix certainly did.
From: Tim McCaffrey on
In article
<ea66678a-39a0-43ec-9499-d07607452b44(a)v15g2000yqe.googlegroups.com>,
digitaleric(a)gmail.com says...
>
>On Aug 10, 3:07 pm, EricP <ThatWouldBeTell...(a)thevillage.com> wrote:
>> Since an x86 can touch 4 pages, 2 instruction, 2 data, in a single
>> instruction, that is the lower limit for a unified set associative TLB.
>> Or you can split it into ITLB and DTLB and use 2 way assoc. in each.
>
>The worst case x86 instruction I know of actually takes 6 or 9 pages
>mapped simultaneously to make forward progress. A movsd instruction
>(using some prefixes, to bulk the instruction to be multi-byte) can be
>arranged with each of the instruction pointer, ESI (source) and EDI
>(destination) misaligned such that all three span page boundaries.
>That takes you to 6 pages; I think you have to abuse segmentation from
>16 bit protected mode to go to 9 - use segments which have limit=64KB,
>and base addresses which are one byte below the beginning of a 4KB
>page, and arrange for CS:EIP, DS:ESI, and ES:EDI to all point two
>bytes below their segment limits. This way, the $(segment):0xFFFE and
>$(segment):0xFFFF bytes live on distinct pages, and you get a 3rd
>page / pointer when you wrap around to $(segment):0x0000.


Or there is PUSH/POP mem: 2 possible Code TLB misses, 4 possible for the
data (if the stack is misaligned).
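
A small Python sketch of that count, purely as address arithmetic. The addresses, the 6-byte instruction length, and the helper name page_span are assumptions chosen for illustration, not taken from any real program.

```python
PAGE_SIZE = 4096  # 4KB pages

def page_span(addr, length):
    """Return the set of page numbers covered by `length` bytes at `addr`."""
    first = addr // PAGE_SIZE
    last = (addr + length - 1) // PAGE_SIZE
    return set(range(first, last + 1))

# Hypothetical PUSH dword [mem] with every pointer straddling a page boundary.
insn_addr, insn_len = 0x1FFD, 6   # assumed 6-byte encoding crossing a boundary
operand_addr = 0x5FFE             # memory operand, 4 bytes, misaligned
esp_before = 0x9002               # misaligned stack pointer before the push

code_pages = page_span(insn_addr, insn_len)
stack_slot = esp_before - 4       # PUSH decrements ESP by 4, then stores
data_pages = page_span(operand_addr, 4) | page_span(stack_slot, 4)

print(len(code_pages), len(data_pages))
```

With these made-up addresses the instruction bytes cover 2 pages and the operand plus the stack slot cover 4, matching the 2 + 4 split.
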

- Tim