From: Wolfgang Kern on

Wannabee skrev:

[about time]

< http://szmyggenpv.com/downloads/ >

If the PeekMessage/dispatch-driven BitBlt is the one you mean here,
then I'm not surprised that it is slow ...
Calling the API for every single dot?

I get a numeric figure of 970 (+/-2) on this test.
btw: I needed a three-finger salute to end it.


>>> Where is yours?
>> To access a flat VideoRAM directly, it needs a 32-bit OS which allows
>> writing to this memory range (best without paging issues).

> ok. But I still don't understand why you cannot just extract that code,
> insert it in the DOS file, go to 32-bit flat mode and just do the blits
> and write the numbers. After 26 years building an OS I imagine you
> could do that inside of 10 minutes?

It is possible in a plain DOS 6.00 session, but rarely within a windoze DOS box.
First it needs to scan the VESA BIOS for supported modes and create
a list with mode-specific data and capabilities.
Then it must have:
full PM32 support with forward/backward links
any kind of memory manager which allows access to PCI ranges,
which also asks for a PCI device detector ...
So what you ask me here would be a tiny OS on top of DOS.
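
As a rough illustration of the mode-list step only (this is not KESYS code), here is a small C sketch that pulls the fields a blitter needs out of a 256-byte VBE ModeInfoBlock, as returned by INT 10h, AX=4F01h. The names mode_entry, rd16, rd32 and parse_mode_info are made up for this example; the byte offsets assume the VBE 2.0 layout.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical record for one entry of the mode list. */
typedef struct {
    uint16_t width;           /* XResolution, offset 12h */
    uint16_t height;          /* YResolution, offset 14h */
    uint16_t bytes_per_line;  /* BytesPerScanLine, offset 10h */
    uint8_t  bits_per_pixel;  /* BitsPerPixel, offset 19h */
    uint32_t lfb_phys;        /* PhysBasePtr, offset 28h */
    bool     has_lfb;         /* ModeAttributes bit 7 */
} mode_entry;

/* Little-endian reads, independent of host alignment rules. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16);
}

/* Decode the fields of interest from a raw ModeInfoBlock. */
mode_entry parse_mode_info(const uint8_t info[256])
{
    mode_entry m;
    m.has_lfb        = (rd16(info + 0x00) & 0x0080) != 0;
    m.bytes_per_line = rd16(info + 0x10);
    m.width          = rd16(info + 0x12);
    m.height         = rd16(info + 0x14);
    m.bits_per_pixel = info[0x19];
    m.lfb_phys       = rd32(info + 0x28);
    return m;
}
```

A mode would only go into the list if has_lfb is set; lfb_phys is then the physical address that the memory manager has to make reachable.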

> I am pretty sure I could do that in a couple of hours, or a day,
> if I had the info.
> (even though I never did any DOS programming)

You can try ... ;)


> then how can I verify your findings?
> The difference for my app between an AMD64 and a 1500 MHz Athlon XP
> is 4900 copies per second versus just 460+ per second,

> using the OS BitBlt which I have considered fast, and to which I have few
> alternatives unless using hardware acceleration.

> Yours runs on a much slower computer, but achieves 1/5 of the AMD64 running
> at >2 GHz.

> I can hardly believe it. Your code is 6+ times faster than the Athlon XP.

>> ok, the 32-bit colour part looks like:

>> usage:
>> MOV eax,00001019h |INT 7F ;set VESAmode to 1024*768,32
>> ecx= 0100 Xsize
>> ebx= 0100 Ysize
>> edx= 0 X+Yposition (Y in hw)
>> eax= 0 colour mask
>> esi= source ;btw: KESYS.bitmaps aren't stored upside down!
>> AND [vflag],0f0 ;clear all options
>> CALL draw_bmp
>> MOV eax,00001009h |INT 7F ;set VESAmode to 1024*768,8 again
>> _________
>> draw_bmp:
>> OR edi,ebx
>> OR edi,ecx

> what the heck is this? (above)

Just initialising regs and setting the Vmode;
or if you mean the two ORs, they check whether both the X and Y sizes are zero.

>> |JZ ret ;just in case
>> PUSH ebx
>> PUSH edx ;[esp]=Xpos [esp+2]=Ypos

>> ;clip_it:
>> MOVZX eax,w[esp+2] ;eax= Ypos

> Stack abuse? :D
Yes, classical 'LOCALs' in here ;)
I could replace it with MOV eax,edx |SHR eax,010
but the value is needed later on too.
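
For readers who do not think in registers, the packed position (X in the low word, Y in the high word of edx) can be sketched in C as below. The helper names pack_pos, pos_x and pos_y are invented for this illustration and do not appear in the posted code.

```c
#include <stdint.h>

/* Pack X into the low word and Y into the high word, as in the usage notes. */
static inline uint32_t pack_pos(uint16_t x, uint16_t y)
{
    return ((uint32_t)y << 16) | x;
}

/* Equivalent of MOVZX eax,w[esp] on the pushed edx. */
static inline uint16_t pos_x(uint32_t pos)
{
    return (uint16_t)(pos & 0xFFFFu);
}

/* Equivalent of MOVZX eax,w[esp+2], or of MOV eax,edx |SHR eax,010. */
static inline uint16_t pos_y(uint32_t pos)
{
    return (uint16_t)(pos >> 16);
}
```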


>> MOV edx,0300 ;max lines (altered by Vmode)
>> ADD eax,ebx
>> CMP eax,edx |Jc L1>
>> SUB eax,ebx |MOV ebx,edx |SUB ebx,eax |JS L9>
>> L1:
>> MOVZX eax,w[esp] ;Xpos
>> MOV edx,01000 ;scan line size (altered by Vmode)

> the Vmode change recode this one? SMC

Yes, these immediate constants are patched whenever the Vmode changes.

>> ADD eax,ecx
>> CMP eax,edx |Jc L2>
>> SUB eax,ecx |MOV ecx,edx |SUB ecx,eax |JS L9>
>> L2:
>> MOVZX eax,w[esp+2]
>> IMUL eax,edx ;y*line size
>> LEA edi,[eax+screen_start] ;from VESA-info,(altered by Vmode)

> nice.

>> MOVZX eax,w[esp] ;+x for 8-bit, +4*x for 32bit
>> TEST [Vflag],40h ;indicates 8/32 bit colours
>> JZ draw8 ;not shown yet
>> TEST [Vmode],04h ;indicates colour mask active
>> JNZ draw_32_eax ;not shown yet
>> LEA edi,[edi+eax*4];
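
The address arithmetic of the lines above (y * scan line size, plus x scaled by the pixel size) can be sketched in C as follows. The function name lfb_offset is invented here, and the result is an offset from screen_start rather than an absolute address.

```c
#include <stdint.h>

/* Byte offset of pixel (x, y) from the start of a linear framebuffer.
   pitch is the scan line size in bytes (01000h in the 1024*768,32 case). */
static inline uint32_t lfb_offset(uint32_t x, uint32_t y,
                                  uint32_t pitch, uint32_t bytes_per_pixel)
{
    return y * pitch + x * bytes_per_pixel;
}
```

With a scan line size of 01000h (4096) bytes and 32-bit colour, pixel (3,2) lands at 2*4096 + 3*4 = 8204 bytes after screen_start.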

>> ;draw it:
>> PUSH ecx
>> L3:MOV eax,edi ;keep the line start
>> REP MOVSD
>> ADD eax,edx ;add scan line size
>> DEC ebx |MOV ecx,[esp]
>> MOV edi,eax |JNZ L3<
>> POP ecx
>> L9: POP edx |POP ebx
>> ret: RET
>> ___________
>> You see it's not optimised at all,

> ? :D Looks very nice to me. short and excellent code I gather.

>> I could try to improve the loop
>> with MOVD/MOVNTQ or SSE 128-bit moves, though even then any unaligned
>> parts may eat up the gain.
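
For comparison, a minimal C sketch of the same idea: copy a block of 32-bit pixels line by line into a linear framebuffer, clipped at the right and bottom edges. This is not a translation of the posted routine. The name blit32 and its parameters are assumptions made for the example, and unlike the REP MOVSD loop it steps the source by the full unclipped width after each line, which is a choice made here.

```c
#include <stdint.h>
#include <string.h>

/* Copy a w*h block of 32-bit pixels from src to the framebuffer fb,
   clipped against the right and bottom screen edges.
   pitch is the scan line size in bytes.
   Returns the number of lines actually drawn. */
int blit32(uint8_t *fb, uint32_t pitch, uint32_t scr_w, uint32_t scr_h,
           const uint32_t *src, uint32_t x, uint32_t y,
           uint32_t w, uint32_t h)
{
    if (w == 0 || h == 0 || x >= scr_w || y >= scr_h)
        return 0;                         /* nothing visible */

    uint32_t cw = (w > scr_w - x) ? scr_w - x : w;   /* clipped width  */
    uint32_t ch = (h > scr_h - y) ? scr_h - y : h;   /* clipped height */

    uint8_t *dst = fb + (size_t)y * pitch + (size_t)x * 4;
    for (uint32_t line = 0; line < ch; line++) {
        memcpy(dst, src, (size_t)cw * 4); /* one scan line of the block */
        dst += pitch;                     /* next screen line */
        src += w;                         /* next source line, unclipped stride */
    }
    return (int)ch;
}
```

On real hardware fb would point at the mapped VRAM; for a test, an ordinary array does the job.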

>>> If you want your OS out of the picture. why dont you just write
>>> it as a dos image instead?

>> It won't work in plain DOS because it must use 32-bit code to
>> access a flat VRAM (usually above 2GB).
>> EMM and XMS won't do well here, because IRQs stay disabled for too
>> long and may lock up some hardware then.

> But it shouldn't be all that hard still? Run a .com and break the barrier
> with your own code? Or am I speaking out of ignorance here?
> I can't imagine it would be much of a job for you?

As said above, that would mean writing a tiny OS on top of DOS.
I've planned to release a new DOS6-based DEMO soon anyway.

>>> btw, I still have the copy of your demo. Will it run on that?

>> I think this DEMO was a version .000 or .001, so it won't contain
>> the bitmap draw nor any 32-bit colour support.

> ok. I like your code, but would very much like to see it running
> with printed numbers (fps), as fast as it can run. Since we can't do that
> I would just have to trust you ... (I am not really hardwired for that)
> :)

> So you're pushing 1/5 of an AMD64 400 MHz FSB performance on a 500 MHz
> antique AMD?
> And 6 times that of a 266 MHz FSB Athlon?
> hmmm.....(teeth gnashing sounds) ..... Get out of here!

I don't know how to interpret your '970' message.
My estimate was about three times faster than windoze.

What did you expect when you compare a HLL-driven peek&pokeOS solution
with one written in machine code running in 'un'-protected mode without
paging?

> Rewrite it as a DOS image that sets up the flat mode and VESA,
> and runs the app.
> I know you can do that easily. And I promise you, if you do that I'll read
> the code in hex.

Again as above, perhaps I'll do it one day.

> And I also will then restart the testing of the demo, if you want to.
> (I now have enough hardware for dedicating a machine to testing).

This old demo is almost obsolete by now; I'd wait for the new version.

> You want to prove a point, you have the means (easily), so what's
> stopping you?

I don't need to sell my solution in this NG, and for me it's enough
to know that my code performs much faster than windoze/L'unix/or whatever else.

__
wolfgang



From: Wolfgang Kern on

Dirk Wolfgang Glomp wrote:

>> It won't work in plain DOS because it must use 32-bit code to
>> access a flat VRAM (usually above 2GB).

> 32-bit code?
> I use unreal/big-real mode with the 16-bit address mode to access
> the linear framebuffer.

It's possible, but IRQs must be disabled then.

>> EMM and XMS won't do well here, because IRQs stay disabled for too
>> long and may lock up some hardware then.

> When I use the EMM registers, do the IRQs become disabled?

I don't know what you mean by EMM registers, but AFAIK EMM does not
support IRQs while in PM, and XMS won't for sure.
This may be supported starting with DOS6.22 and in winDOS, but not in DOS6.00
or earlier versions.

__
wolfgang



From: Dirk Wolfgang Glomp on
On Mon, 14 Jan 2008 15:15:09 +0100, Wolfgang Kern wrote:

> Dirk Wolfgang Glomp wrote:
>
>>> It won't work in plain DOS because it must use 32-bit code to
>>> access a flat VRAM (usually above 2GB).
>
>> 32-bit code?
>> I use unreal/big-real mode with the 16-bit address mode to access
>> the linear framebuffer.
>
> It's possible, but IRQs must be disabled then.

...or replaced with a 32-bit IRQ table for RM?

>>> EMM and XMS won't do well here, because IRQs stay disabled for too
>>> long and may lock up some hardware then.
>
>> When I use the EMM registers, do the IRQs become disabled?
>
> I don't know what you mean by EMM registers, but AFAIK EMM does not
> support IRQs while in PM, and XMS won't for sure.

Oh, I was confused. Forget it.

> This may be supported starting with DOS6.22 and in winDOS, but not in DOS6.00
> or earlier versions.

Is it possible to use FreeDOS with unreal mode?

Dirk



From: Wolfgang Kern on

Dirk Wolfgang Glomp wrote:

>>>> It won't work in plain DOS because it must use 32-bit code to
>>>> access a flat VRAM (usually above 2GB).
>>> 32-bit code?
>>> I use unreal/big-real mode with the 16-bit address mode to access
>>> the linear framebuffer.
>> It's possible, but IRQs must be disabled then.

> ...or replaced with a 32-bit IRQ table for RM?

The Big Real (or unreal) trick relies on not altering CS by a far jump.
As CS:IP is reloaded on every IRQ, the CPU will then act as if it had
entered PM by a far jump, and will usually crash on IRET.

I've heard that one could write big-real interrupt routines besides
RM and PM routines, but this trick must keep all code within
one 64 KB range and map PM16 to the RM segment.

>>>> EMM and XMS won't do well here, because IRQs stay disabled for too
>>>> long and may lock up some hardware then.

>>> When I use the EMM registers, do the IRQs become disabled?
>> I don't know what you mean by EMM registers, but AFAIK EMM does not
>> support IRQs while in PM, and XMS won't for sure.
> Oh, I was confused. Forget it.

OK. ;) I wasn't sure whether I had missed the existence of some registers...

>> This may be supported starting with DOS6.22 and in winDOS, but not in DOS6.00
>> or earlier versions.

> Is it possible to use FreeDOS with unreal mode?

Never tried it myself, but I think it should be possible with IRQs disabled.
Unreal mode may show more problems than a true RM<->PM link or VM86.

__
wolfgang



From: Dirk Wolfgang Glomp on
On Tue, 15 Jan 2008 03:17:30 +0100, Wolfgang Kern wrote:

> Dirk Wolfgang Glomp wrote:
>
>>>>> It won't work in plain DOS because it must use 32-bit code to
>>>>> access a flat VRAM (usually above 2GB).
>>>> 32-bit code?
>>>> I use unreal/big-real mode with the 16-bit address mode to access
>>>> the linear framebuffer.
>>> It's possible, but IRQs must be disabled then.
>
>> ...or replaced with a 32-bit IRQ table for RM?
>
> The Big Real (or unreal) trick relies on not altering CS by a far jump.
> As CS:IP is reloaded on every IRQ, the CPU will then act as if it had
> entered PM by a far jump, and will usually crash on IRET.
>
> I've heard that one could write big-real interrupt routines besides
> RM and PM routines, but this trick must keep all code within
> one 64 KB range and map PM16 to the RM segment.
>
>>>>> EMM and XMS won't do well here, because IRQs stay disabled for too
>>>>> long and may lock up some hardware then.
>
>>>> When I use the EMM registers, do the IRQs become disabled?
>>> I don't know what you mean by EMM registers, but AFAIK EMM does not
>>> support IRQs while in PM, and XMS won't for sure.
>> Oh, I was confused. Forget it.
>
> OK. ;) I wasn't sure whether I had missed the existence of some registers...
>
>>> This may be supported starting with DOS6.22 and in winDOS, but not in DOS6.00
>>> or earlier versions.
>
>> Is it possible to use FreeDOS with unreal mode?
>
> Never tried it myself, but I think it should be possible with IRQs disabled.
> Unreal mode may show more problems than a true RM<->PM link or VM86.

Thanks.

Dirk