From: Alexei A. Frounze on
On Aug 19, 7:26 am, "Wolfgang Kern" <nowh...(a)never.at> wrote:
> Alexei A. Frounze wrote:
>
> [about VM86 speed ...]
>
> >> Let's see if I can explain it in a few words, mmh, may not work ..:)
> >> so better a timing example:
> >> My link routines reside together with the stack in the HMA region,
> >> so both RM and PM32 can access data and call everything there by
> >> sharing one single stack with CS (BASE = 1.MB; TOP = end of HMA).
> >> I don't use paging, but if I did, the HMA area would be physically
> >> mapped anyway.
> >> I call BIOS services from within PM32sys this way:
>
> [...the example...]
>
> >> I measured just 90..135 in live tests.
> > And this is also your interrupt latency.
>
> ?? Don't know what it's got to do with INT
> But right, if I'd use SW-interrupts they would take ~90 cycles(naked).

You disable the interrupts while switching between RM and PM. That's
what it's got to do with it.

> > Most importantly, since in the above picture I see no reference to the
> > PIC, I assume that you don't reprogram it (at least, across RM<->PM
> > switches), which probably means that in PM you share the same
> > interrupt vectors between exceptions and IRQs:
> > int 8: IRQ0 (timer), #DF
> > int 9: IRQ1 (keyboard), FPU overrun 386-
> > int 10: IRQ2 (8259 PIC's cascade IRQ), #TS
> > int 11: IRQ3 (COM2/4), #NP
> > int 12: IRQ4 (COM1/3), #SS
> > int 13: IRQ5 (LPT/SB/?), #GP
> > int 14: IRQ6 (FDD), #PF
> > int 15: IRQ7 (LPT/SB/?), reserved
>
> My TRUE realmode IDT is still at 0000:0000 and I have all IRQs
> reprogrammed (to 50...5f),

So, the ROM BIOS' IRQ ISRs are still at int 8...0xf, 0x70...0x77, you
route IRQs in the PIC to ints 0x50...0x5f. Then your new RM ISRs at
ints 0x50...0x5f far-jump to ISRs at ints 8...0xf, 0x70...0x77. OK, if
there's only one IRQ whose ISR the BIOS invokes using int instruction,
int 0x71 for IRQ9, then it could work that way w/o mixing IRQs and
exceptions.
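
A minimal C sketch of the remapping discussed here, not Wolfgang's actual code. It only builds the list of port writes that would move the master 8259 to vector base 0x50 and the slave to 0x58, and computes the BIOS vector that a stub at 0x50+n would chain to. The names pic_remap_sequence and bios_vector_for_irq are made up for this example, and saving/restoring the IRQ masks is left out.

```c
#include <stddef.h>
#include <stdint.h>

struct port_write { uint16_t port; uint8_t value; };

/* Fill out[8] with the ICW1..ICW4 writes for both 8259 PICs and return the
 * number of entries. A real kernel would issue each entry with OUT. */
size_t pic_remap_sequence(uint8_t master_base, uint8_t slave_base,
                          struct port_write out[8])
{
    const struct port_write seq[8] = {
        { 0x20, 0x11 },         /* ICW1 master: edge, cascade, ICW4 follows */
        { 0xA0, 0x11 },         /* ICW1 slave */
        { 0x21, master_base },  /* ICW2 master: vector base for IRQ0..7 */
        { 0xA1, slave_base },   /* ICW2 slave: vector base for IRQ8..15 */
        { 0x21, 0x04 },         /* ICW3 master: slave attached to IRQ2 */
        { 0xA1, 0x02 },         /* ICW3 slave: cascade identity 2 */
        { 0x21, 0x01 },         /* ICW4 master: 8086 mode */
        { 0xA1, 0x01 },         /* ICW4 slave: 8086 mode */
    };
    for (size_t i = 0; i < 8; i++)
        out[i] = seq[i];
    return 8;
}

/* Vector where the ROM BIOS expects IRQ n: 0x08..0x0F for IRQ0..7 and
 * 0x70..0x77 for IRQ8..15. A stub at 0x50+n would far-jump to this one. */
uint8_t bios_vector_for_irq(unsigned irq)
{
    return (uint8_t)(irq < 8 ? 0x08 + irq : 0x70 + (irq - 8));
}
```

With bases 0x50 and 0x58, IRQ9 arrives at int 0x59 and the stub there would jump to the handler stored at int 0x71.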

> and handle all INT 00..7f (includes all
> exceptions) by the system for both PM32 and true Real mode anyway
> [real mode exceptions differ from PM quite a bit].

True, there's no exception error code in RM.

> But this won't/mustn't be altered to call BIOS functions because I
> use INT6D instead of INT10 (even though there is no exception 10h in RM).

It should be OK doing int 0x10.

> > If this is so, then either handling of the two things is complicated
> > by distinguishing between IRQs and exceptions (which worsens interrupt
> > latency further) or exception handling is generally non-functional in
> > PM. I'd still want to catch #DF, #NP, #SS and #GP even when TSS and
> > paging aren't used.
>
> I catch all exceptions for RM, PM16/32 and even Big Real and enter or
> just display the debugger box if it can't be handled by the system.
>
> > Do you not use all of these: timer, COMs, LPT/SB/?
>
> All of them can be used if present and desired.
>
> > Or do you hope to never get any exception in this range?
>
> NO! I'm not at all a believer :)

I see :)

> >> The BIOS can then access data and I/O unrestricted, without being
> >> delayed by the I/O permission and privilege checks and the segment/paging
> >> translation that occur within VM86.
> >> So these 'lost' 190 cycles (vs. 2*>240 for PL3<->VM86 task switches :)
> >> are easily gained back.
> > Now, what exactly do you mean by "PL3<->VM86-task switches"?
>
> Swap stack from PL3 to PL0, swap again to enter VM86, grant pages ...
> and the whole back to return.
>
> > Do you actually perform a task switch (jump/call/int/iret to/from TSS)?
>
> Not necessarily a full hardware task switch, but it needs at least some
> privilege-checked reads and writes to/from the current task segment.
>
> Perhaps I missed a faster way, so how long will it take you to call
> any 16-bit BIOS function from PM32 using VM86?
> (I may need it one day because of long mode..)

If you go to 64-bit mode, then the only ways to execute the BIOS code
would be to do what you're doing now (switch all the way back to RM
and then back to 64-bit mode, which will be much, much slower) or to use
the new AMD/Intel virtualization technologies to set up a VM to run in
v86.

But in legacy 32-bit PM you would only need (if the rest is set up) to
do:
- get to ring 0 if not there already
- push some stuff onto the stack as if this was pushed by an interrupt/
exception occurring in a v86 task
- IRET to v86 mode w/o changing tasks (unless you use tasks in the
system for something else but just transitioning between levels)
- handle #GPs occurring in v86 (into/int3/int xx instructions will
cause this)
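
The "push some stuff" step can be sketched in C as follows. This is only an illustration of the stack image a ring-0 IRET consumes when the popped EFLAGS has VM=1; the struct and function names are invented for the example, and IOPL=0 and zeroed data segments are assumptions, not something stated in this thread.

```c
#include <stdint.h>

#define EFLAGS_RESERVED1 0x00000002u  /* bit 1 always reads as 1 */
#define EFLAGS_IF        0x00000200u  /* interrupts enabled in the v86 code */
#define EFLAGS_VM        0x00020000u  /* bit 17: IRET enters v86 mode */

/* Image popped by IRET, lowest address first. Segment values live in the
 * low 16 bits of each dword. The pushes happen in reverse order: GS first,
 * EIP last. */
struct v86_iret_frame {
    uint32_t eip, cs, eflags, esp, ss, es, ds, fs, gs;
};

/* Build a frame that starts v86 execution at cs:ip with stack ss:sp. */
struct v86_iret_frame v86_make_frame(uint16_t cs, uint16_t ip,
                                     uint16_t ss, uint16_t sp)
{
    struct v86_iret_frame f = {0};   /* es/ds/fs/gs stay 0 (assumption) */
    f.eip = ip;
    f.cs = cs;
    /* IOPL is left at 0 here, so INT n, CLI, STI, PUSHF, POPF and IRET
     * executed in v86 all raise #GP and must be handled by the monitor. */
    f.eflags = EFLAGS_VM | EFLAGS_IF | EFLAGS_RESERVED1;
    f.esp = sp;
    f.ss = ss;
    return f;
}
```

The nine dwords (36 bytes) are what distinguishes this from an ordinary inter-privilege IRET, which pops only EIP, CS, EFLAGS, ESP and SS.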

And, of course, you must put there a bit of code to make sure upon
successful execution of the ISR you return from v86. I use an
instruction at known location that always causes an exception and
forces an exit from v86. In the exception handler I check that I got
the exception at that location. It could be MOV EAX, CR0 or UD2 or INT
or something similar.
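
A small C sketch of the "check that I got the exception at that location" test, under the assumption that the handler has the saved EIP, CS and EFLAGS at hand. fault_ctx and v86_is_exit_trap are hypothetical names. Linear addresses are compared because different seg:off pairs can name the same byte in v86.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_VM 0x00020000u

/* The part of the saved state the exception handler needs here. */
struct fault_ctx { uint32_t eip, cs, eflags; };

/* Real-mode style address calculation: segment * 16 + offset. */
static uint32_t rm_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* True if the fault came from v86 code at the known trap location. */
bool v86_is_exit_trap(const struct fault_ctx *c,
                      uint16_t trap_seg, uint16_t trap_off)
{
    if (!(c->eflags & EFLAGS_VM))
        return false;               /* fault was raised outside v86 */
    return rm_linear((uint16_t)c->cs, (uint16_t)c->eip)
        == rm_linear(trap_seg, trap_off);
}
```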

So, in the case there're no INT instructions executed in v86 mode and
this all is done from ring 0 it would be the cost of:
- pushes
- IRET to v86
- handling of the exception that gets me out of v86

I haven't measured cycles since I intend to use paging. Setting paging
up and tearing it down if I were to switch between RM and PM in my
opinion would kill any benefit of not using v86.

Alex

From: Bx.C / x87asm on
> I haven't measured cycles since I intend to use paging. Setting paging
> up and tearing it down if I were to switch between RM and PM in my
> opinion would kill any benefit of not using v86.

just curious...

when dealing with paging,... has anyone ever tried keeping only the 1st MB
(and the page that the PM-to-RM code is on) identity-mapped and skipping the
step of identity-mapping the rest of memory when jumping back into RM
temporarily?
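
For reference, a minimal C sketch of what such a reduced mapping could look like with 4 KiB pages: one page table whose first 256 entries identity-map the 1st MB, plus one extra page for the switch code. map_low_identity is a made-up name, and the sketch assumes the switch code lives below 4 MiB so that a single page table covers it.

```c
#include <stdint.h>

#define PTE_PRESENT 0x1u
#define PTE_WRITE   0x2u

/* Fill one 1024-entry page table (covers linear 0..4 MiB). Only the first
 * 1 MiB and the page containing extra_page_addr are mapped, each to the
 * identical physical address. Everything else stays not-present. */
void map_low_identity(uint32_t pt[1024], uint32_t extra_page_addr)
{
    for (uint32_t i = 0; i < 1024; i++)
        pt[i] = 0;
    for (uint32_t i = 0; i < 256; i++)          /* 256 * 4 KiB = 1 MiB */
        pt[i] = (i << 12) | PTE_PRESENT | PTE_WRITE;

    uint32_t idx = extra_page_addr >> 12;
    if (idx < 1024)                             /* assumption: below 4 MiB */
        pt[idx] = (idx << 12) | PTE_PRESENT | PTE_WRITE;
}
```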
--
Bx.C



From: Alexei A. Frounze on
On Aug 19, 2:13 pm, "Bx.C / x87asm" <email.a...(a)is.invalid> wrote:
> > I haven't measured cycles since I intend to use paging. Setting paging
> > up and tearing it down if I were to switch between RM and PM in my
> > opinion would kill any benefit of not using v86.
>
> just curious...
>
> when dealing with paging,... has anyone ever tried keeping only the 1st MB
> (and the page that the PM-to-RM code is on) identity-mapped and skipping the
> step of identity-mapping the rest of memory when jumping back into RM
> temporarily?
> --
> Bx.C

I remember a long time ago (around 2000) having a hang (or an
exception but the machine was hung anyway) because of not zeroing CR3
before switching to RM. And 1st meg was identity mapped. I don't
remember what CPU that was.

Alex

From: Wolfgang Kern on

Alexei A. Frounze wrote:

[about VM86 speed ...]

>> [...the example...]

>>>> I measured just 90..135 in live tests.
>>> And this is also your interrupt latency.

>> ?? Don't know what it's got to do with INT
>> But right, if I'd use SW-interrupts they would take ~90 cycles(naked).

> You disable the interrupts while switching between RM and PM.
> That's what it's got to do with it.

I see, but this IRQ-disabled period is quite short
(<80 raw cycles, measured: 35..max.50 cycles)

[...]
>> Perhaps I missed a faster way, so how long will it take you to call
>> any 16-bit BIOS function from PM32 using VM86?
>> (I may need it one day because of long mode..)

> If you go to 64-bit mode, then the only way to execute the BIOS code
> would be to do what you're doing now (switch all the way back to RM
> and then back to 64-bit mode which will be much much slower) or use
> the new AMD/intel virtualization technologies to set up a VM to run in
> v86.

Yes, I'm already aware of major changes in my OS when going the long way.

> But in legacy 32-bit PM you would only need (if the rest is set up)
> to do:
> - get to ring 0 if not there already
> - push some stuff onto the stack as if this was pushed by an interrupt/
> exception occurring in a v86 task
> - IRET to v86 mode w/o changing tasks (unless you use tasks in the
> system for something else but just transitioning between levels)
> - handle #GPs occurring in v86 (into/int3/int xx instructions will
> cause this)

Interesting, how do you distinguish between a bug-#GP and a VM86-#GP?
By the offending selector only?

> And, of course, you must put there a bit of code to make sure upon
> successful execution of the ISR you return from v86. I use an
> instruction at known location that always causes an exception and
> forces an exit from v86. In the exception handler I check that I got
> the exception at that location. It could be MOV EAX, CR0 or UD2 or INT
> or something similar.

Ah yes, a forced exception to return.
But wouldn't the same method used to enter VM86 work for the return as well?
PUSH.. |
IRET

> So, in the case there're no INT instructions executed in v86 mode and
> this all is done from ring 0 it would be the cost of:
> - pushes
> - IRET to v86
> - handling of the exception that gets me out of v86

As far as I understand, you push the INT10 vector and don't use INT10h.
Well, this could be faster than what I last tried with VM86.

> I haven't measured cycles since I intend to use paging. Setting paging
> up and tearing it down if I were to switch between RM and PM in my
> opinion would kill any benefit of not using v86.

Yes, even paging could cover the BIOS-ROMs area with physical mapping.

It would be really interesting to compare the overall performance
on a trivial thing like a palette load in 256-color VGA mode,
as this is 772 I/O accesses and 256 dword RAM reads.

__
wolfgang



From: Alexei A. Frounze on
On Aug 20, 7:54 am, "Wolfgang Kern" <nowh...(a)never.at> wrote:
> Alexei A. Frounze wrote:
>
> [about VM86 speed ...]
>
> >> [...the example...]
> >>>> I measured just 90..135 in live tests.
> >>> And this is also your interrupt latency.
> >> ?? Don't know what it's got to do with INT
> >> But right, if I'd use SW-interrupts they would take ~90 cycles(naked).
> > You disable the interrupts while switching between RM and PM.
> > That's what it's got to do with it.
>
> I see, but this IRQ-disabled period is quite short
> (<80 raw cycles, measured: 35..max.50 cycles)
>
> [...]
>
> >> Perhaps I missed a faster way, so how long will it take you to call
> >> any 16 bit BIOS function from PM32 using VM86 ?
> >> (I may need it one day because of long mode..)
> > If you go to 64-bit mode, then the only way to execute the BIOS code
> > would be to do what you're doing now (switch all the way back to RM
> > and then back to 64-bit mode which will be much much slower) or use
> > the new AMD/intel virtualization technologies to set up a VM to run in
> > v86.
>
> Yes, I'm already aware of major changes in my OS when going the long way.
>
> > But in legacy 32-bit PM you would only need (if the rest is set up)
> > to do:
> > - get to ring 0 if not there already
> > - push some stuff onto the stack as if this was pushed by an interrupt/
> > exception occurring in a v86 task
> > - IRET to v86 mode w/o changing tasks (unless you use tasks in the
> > system for something else but just transitioning between levels)
> > - handle #GPs occurring in v86 (into/int3/int xx instructions will
> > cause this)
>
> Interesting, how do you distinguish between a bug-#GP and a VM86-#GP?
> By the offending selector only?

Forgot about EFLAGS on the stack? :) Analyze its VM bit.
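
In C terms the check could look like the sketch below. It assumes the handler is handed the stack pointer as it was at entry to the #GP handler, where the CPU has pushed the error code, EIP, CS and EFLAGS in that order from the lowest address up. classify_gp and the enum are illustrative names only.

```c
#include <stdint.h>

#define EFLAGS_VM 0x00020000u   /* bit 17 of EFLAGS */

enum gp_origin { GP_FROM_PM, GP_FROM_V86 };

/* stack_at_entry[0] = error code, [1] = EIP, [2] = CS, [3] = EFLAGS.
 * The saved EFLAGS image tells whether the faulting code ran in v86. */
enum gp_origin classify_gp(const uint32_t *stack_at_entry)
{
    uint32_t saved_eflags = stack_at_entry[3];
    return (saved_eflags & EFLAGS_VM) ? GP_FROM_V86 : GP_FROM_PM;
}
```

A GP_FROM_V86 result would then be routed to the v86 monitor (INT emulation, exit trap and so on), while GP_FROM_PM is a genuine fault in protected-mode code.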

> > And, of course, you must put there a bit of code to make sure upon
> > successful execution of the ISR you return from v86. I use an
> > instruction at known location that always causes an exception and
> > forces an exit from v86. In the exception handler I check that I got
> > the exception at that location. It could be MOV EAX, CR0 or UD2 or INT
> > or something similar.
>
> Ah yes, a forced exception to return.
> But wouldn't the same method used to enter VM86 work for the return as well?
> PUSH.. |
> IRET

If you load SP with 1 and then do PUSH or if you load SP with 0xffff
and then do IRET, an exception is guaranteed and in principle you
could use that to get out of V86. ;) But no, other than an interrupt
or exception there's no way out, and neither PUSH nor IRET under normal
conditions would result in an exception.

> > So, in the case there're no INT instructions executed in v86 mode and
> > this all is done from ring 0 it would be the cost of:
> > - pushes
> > - IRET to v86
> > - handling of the exception that gets me out of v86
>
> As far as I understand, you push the INT10 vector and don't use INT10h.
> Well, this could be faster than what I last tried with VM86.

Yes, if you want the code behind int N to be executed, you can just
emulate the real-mode effect of INT N (PUSHF+FAR CALL) w/o having to
handle the #GP resulting from execution of INT N. Of course, if that
code in turn does INT X, then you still have to handle #GP for that
INT X. So you can save on exception handling a bit.
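
A rough C model of that PUSHF+FAR CALL emulation, operating on a byte array that stands in for the v86 address space. All names (v86_regs, v86_emulate_int, V86_MEM_SIZE) are invented for the sketch, and only the low 16 bits of the flags are modeled.

```c
#include <stdint.h>

#define V86_MEM_SIZE 0x110000u  /* 1 MiB + 64 KiB so seg:off stays in range */
#define FLAGS_TF 0x0100u
#define FLAGS_IF 0x0200u

struct v86_regs { uint16_t cs, ip, ss, sp, flags; };

/* Push a 16-bit value onto the v86 stack at ss:sp (little endian). */
static void push16(uint8_t *mem, struct v86_regs *r, uint16_t v)
{
    r->sp = (uint16_t)(r->sp - 2);
    uint32_t a = ((uint32_t)r->ss << 4) + r->sp;
    mem[a] = (uint8_t)(v & 0xFF);
    mem[a + 1] = (uint8_t)(v >> 8);
}

/* Do what real-mode INT n does: push FLAGS, CS, IP, clear IF and TF, then
 * load CS:IP from the interrupt vector table entry at linear n * 4. */
void v86_emulate_int(uint8_t *mem, struct v86_regs *r, uint8_t n)
{
    uint32_t v = (uint32_t)n * 4;
    uint16_t off = (uint16_t)(mem[v] | (mem[v + 1] << 8));
    uint16_t seg = (uint16_t)(mem[v + 2] | (mem[v + 3] << 8));

    push16(mem, r, r->flags);
    push16(mem, r, r->cs);
    push16(mem, r, r->ip);
    r->flags = (uint16_t)(r->flags & ~(FLAGS_IF | FLAGS_TF));
    r->cs = seg;
    r->ip = off;
}
```

If cs:ip is set to the exit-trap location before the call, the ISR's final IRET lands on the trap, and the resulting registers are what would go into the frame used for the IRET into v86.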

> > I haven't measured cycles since I intend to use paging. Setting paging
> > up and tearing it down if I were to switch between RM and PM in my
> > opinion would kill any benefit of not using v86.
>
> Yes, even paging could cover the BIOS-ROMs area with physical mapping.
>
> It would be really interesting to compare the overall performance
> on a trivial thing like a palette load in 256-color VGA mode,
> as this is 772 I/O accesses and 256 dword RAM reads.

I remember only one number about VGA: 1.6 GHz Pentium CPU doing 30 FPS
maximum in 640x480 4bpp mode. Paging effect with such a slow VGA
buffer access at the hardware level is just a joke. :) Non-planar
modes w/o those sliding VESA windows/banks should be fast, but I have
no numbers.

Alex
