From: Chuck Crayne on
On Thu, 28 Aug 2008 20:31:07 +0200
Herbert Kleebauer <klee(a)unibwm.de> wrote:

> So, int80/eax=45 is brk and not sbrk. Therefore ebx has to be loaded
> with the absolute value of the data segment end and not with an
> increase value (like for sbrk).

As Frank pointed out, when cutting and pasting from my working code to
my message, I accidentally left out a line. The full example of
allocating and clearing an additional 65K is:

;allocate dynamic segment
mov eax,__NR_brk ;brk
xor ebx,ebx ;get current end
int 80h ;query kernel
mov [dydata],eax ;start of new allocation
mov ebx,eax ;previous end of data
add ebx,10000h ;65k bytes
mov eax,__NR_brk ;brk
int 80h ;kernel call
or eax,eax ;test return
jns clrseg ;clear allocation
mov ecx,memmsg ;no memory message
mov edx,memmsgl
jmp errexit ;display error and quit
;clear extra segment
clrseg: mov eax,0
mov ecx,4000h
mov edi,[dydata] ;start of allocation
rep stosd

This code is functionally equivalent to sbrk(0x10000).
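
For readers who prefer C, the same sequence can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names raw_brk and grow_and_clear_demo are made up, the glibc syscall() function stands in for the int 80h trap, and success is judged by comparing the returned break with the requested one rather than by the sign test used above. The sketch puts the break back afterwards so that the C library's own allocator is not disturbed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: raw brk system call.
 * Returns the (possibly unchanged) end of the data segment. */
static uintptr_t raw_brk(uintptr_t new_end)
{
    return (uintptr_t)syscall(SYS_brk, new_end);
}

/* Grow the break by 0x10000 bytes, clear the new region, then restore
 * the old break. Returns 0 on success, -1 if the kernel refused. */
static int grow_and_clear_demo(void)
{
    const uintptr_t size = 0x10000;
    uintptr_t old_end = raw_brk(0);      /* query current end */
    uintptr_t want = old_end + size;     /* previous end + increment */

    if (raw_brk(want) != want)           /* break did not move as asked */
        return -1;

    unsigned char *p = (unsigned char *)old_end;
    memset(p, 0, size);                  /* plays the role of rep stosd */
    int ok = (p[0] == 0 && p[size - 1] == 0);

    uintptr_t restored = raw_brk(old_end);
    assert(restored == old_end);         /* shrinking back is expected to work */
    return ok ? 0 : -1;
}
```

Calling grow_and_clear_demo() from main() should return 0 on an ordinary Linux system.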

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html

From: Rod Pemberton on
"Frank Kotler" <fbkotler(a)verizon.net> wrote in message
news:g96snp$fot$1(a)aioe.org...
> Herbert Kleebauer wrote:
> > But how could they write the C library if there is no documentation
> > of the int80 interface.
>
> AFAIK, the C library doesn't use the int 80h interface...

At some point, the C library has to call the int 80h or sysenter interface
1) because C libraries require OS functionality in order to be implemented
and 2) because modern OSes don't forfeit OS functionality to applications
like DOS does, i.e., applications can't bypass the OS's central API. E.g.,
the libc C code for POSIX's fork() declared in unistd.h must eventually call
the Linux system call __NR_fork declared in asm-i386/unistd.h or
asm-x86_64/unistd.h. Since you guys run Linux, you should be able to
explain to me in depth how this works... yes?
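
A small C sketch of that layering, under some assumptions: it uses getpid rather than fork so that it has no side effects, it relies on the glibc syscall() function (which itself issues the trap instruction), and the function name wrapper_matches_raw_syscall is invented for the illustration. SYS_getpid is the <sys/syscall.h> spelling of __NR_getpid.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask for the process ID twice: once through the libc wrapper and once
 * through the raw system call number. Returns 1 if both agree. */
static int wrapper_matches_raw_syscall(void)
{
    pid_t via_libc = getpid();            /* POSIX API from unistd.h */
    long via_raw = syscall(SYS_getpid);   /* kernel interface, by number */

    assert(via_raw > 0);                  /* getpid cannot fail */
    return via_raw == (long)via_libc;
}
```

Both paths are expected to give the same answer, which is the point being made: the wrapper ends up at the same kernel entry.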

A recent Linux kernel has about 290 syscalls. The original Linux had about
40 completed syscalls. P.J. Plauger's Standard C library uses 18 OS
functions. Red Hat's newlib uses 19 OS functions. PDP-7 and 11
implementations used at least 11 OS functions. The following post of mine
was FORTH related, but also has a section comparing the number of
"interface" functions used by C libraries, CP/M, Linux, LLVA, LISP, DOS C
compilers, etc:

http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526


Rod Pemberton

From: Herbert Kleebauer on
Chuck Crayne wrote:
> Herbert Kleebauer <klee(a)unibwm.de> wrote:

> As Frank pointed out, when cutting and pasting from my working code to
> my message, I accidentally left out a line. The full example of
> allocating and clearing an additional 65K is:
>
> ;allocate dynamic segment
> mov eax,__NR_brk ;brk
> xor ebx,ebx ;get current end
> int 80h ;query kernel
> mov [dydata],eax ;start of new allocation
> mov ebx,eax ;previous end of data
> add ebx,10000h ;65k bytes
> mov eax,__NR_brk ;brk
> int 80h ;kernel call
> or eax,eax ;test return
> jns clrseg ;clear allocation
> mov ecx,memmsg ;no memory message
> mov edx,memmsgl
> jmp errexit ;display error and quit
> ;clear extra segment
> clrseg: mov eax,0
> mov ecx,4000h
> mov edi,[dydata] ;start of allocation
> rep stosd
>
> This code is functionally equivalent to sbrk(0x10000).


Where did you get this information? It seems my Linux version
didn't read that document and therefore behaves differently.

You can call int80/eax=45 with 2^32 different values in ebx. If
Linux "likes" (from the man page: "when that value is reasonable")
the value in ebx it sets the end of the data segment to this value.
If it doesn't "like" it, it doesn't modify the end of the data segment.
In any case it returns the new end of data segment and never ENOMEM.

I wrote a simple program which tests all 2^32 values in ebx. Here is some
information about the program:

Address of first instruction byte in the code segment: $08048000
Address of last instruction byte in the code segment: $0804829c
Address of first data byte in the data segment: $080492a0
Address of last data byte in the data segment: $080492a3

int80/eax=45 ebx= $00000000 - $0804829c: fail    eax=$0804a000
             ebx= $0804829d - $bf970000: success eax=ebx
             ebx= $bf970001 - $ffffffff: fail    eax=$0804a000

This means, the initial value for the end of data segment is set
to $0804a000 (the last byte in the data segment rounded to the
next 4k boundary). Accepted values for the new end of data segment
are the address of the first byte after the last instruction byte
(which is before the start of the data segment) up to the stack
(this value is slightly different with each run because somehow
Linux uses a random value for the stack pointer).
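
The "fail" rows can be reproduced from C with a much smaller probe than a full 2^32 scan. The sketch below is an assumption-laden illustration: raw_brk and probe_break_demo are made-up names, the glibc syscall() function replaces int 80h, and it only tries the two values 0 and 1, both of which lie below any data segment and so should leave the break untouched.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: raw brk system call, returns the resulting break. */
static uintptr_t raw_brk(uintptr_t new_end)
{
    return (uintptr_t)syscall(SYS_brk, new_end);
}

/* Returns 1 if out-of-range requests behave as described above:
 * the kernel ignores them and hands back the current break. */
static int probe_break_demo(void)
{
    uintptr_t current = raw_brk(0);       /* too low, so it acts as a query */
    uintptr_t after_low = raw_brk(1);     /* also far below the data segment */
    uintptr_t after_query = raw_brk(0);   /* break should not have moved */

    assert(current != 0);
    return after_low == current && after_query == current;
}
```

Neither call reports an error code; the only sign that the request was refused is that the returned value differs from the one asked for.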

Now, without documentation (a specification) of the correct
behaviour, one cannot decide whether this is the intended behaviour
or an implementation bug. If it is a bug, then maybe the next
version really will return ENOMEM rather than the old value for the
invalid input ebx=0, and then your code will not work anymore.

You should also not test for negative values in eax as an error
indicator (because values up to $bf...... are success) but
check for ENOMEM. But as long as the implementation never returns
ENOMEM, this also wouldn't make much sense.
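
Given the observed behaviour (success means eax=ebx), one check that does not depend on the sign of the result is to compare the returned break with the requested one. The following C sketch shows that idea; checked_brk, checked_brk_demo and raw_brk are hypothetical names, setting errno to ENOMEM is a choice made here to resemble a libc-style wrapper, and the "absurd" address near the top of the address space is merely assumed to be refused.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: raw brk system call, returns the resulting break. */
static uintptr_t raw_brk(uintptr_t new_end)
{
    return (uintptr_t)syscall(SYS_brk, new_end);
}

/* Move the break to target. Returns 0 on success; on refusal returns -1
 * and sets errno to ENOMEM, since the kernel itself reports no error. */
static int checked_brk(uintptr_t target)
{
    if (raw_brk(target) != target) {
        errno = ENOMEM;
        return -1;
    }
    return 0;
}

/* Returns 1 if a refused request is detected and a sane one succeeds. */
static int checked_brk_demo(void)
{
    uintptr_t start = raw_brk(0);
    assert(start != 0);

    /* Assumption: an address this high is never accepted. */
    uintptr_t absurd = UINTPTR_MAX & ~(uintptr_t)0xFFFF;

    int refused = (checked_brk(absurd) == -1 && errno == ENOMEM);
    int unchanged = (raw_brk(0) == start);
    int grew = (checked_brk(start + 4096) == 0);
    int restored = (checked_brk(start) == 0);   /* put the break back */
    return refused && unchanged && grew && restored;
}
```

With this check a successful result such as $bf...... on a 32-bit system is not mistaken for an error, which a sign test would do.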


> This particular call cannot fail. But if we followed it with
>
> add ebx,10000h ;65k bytes
> mov eax,__NR_brk ;brk
> int 80h ;kernel call
>
> and the requested memory was not available, then it would return the
> negative value of ENOMEM in EAX.

This has nothing to do with the available memory but only with the
available virtual address space (we use paging).

From: H. Peter Anvin on
Frank Kotler wrote:
> Herbert Kleebauer wrote:
>
> ...
>> But how could they write the C library if there is no documentation
>> of the int80 interface.
>
> AFAIK, the C library doesn't use the int 80h interface...
>

Yes it does. If you have a very recent glibc, then it probably uses the
vdso interface, but that's functionally identical.
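
Whether a process has a vDSO mapped at all can be checked from C. The sketch below is an illustration only: it assumes a C library that provides getauxval() in <sys/auxv.h> (glibc 2.16 or later does), the function names are invented, and it does nothing more than look for the ELF magic at the address the kernel passes in the auxiliary vector under AT_SYSINFO_EHDR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>

/* Address of the vDSO's ELF header as passed by the kernel, or 0 if none. */
static uintptr_t vdso_base(void)
{
    return (uintptr_t)getauxval(AT_SYSINFO_EHDR);
}

/* Returns 1 if a vDSO is mapped and starts with the ELF magic, else 0. */
static int vdso_looks_like_elf(void)
{
    const unsigned char *p = (const unsigned char *)vdso_base();
    if (p == NULL)
        return 0;
    return p[0] == 0x7f && p[1] == 'E' && p[2] == 'L' && p[3] == 'F';
}

/* Sanity check: if the kernel advertises a vDSO, it should be an ELF image. */
static void vdso_selfcheck(void)
{
    assert(vdso_base() == 0 || vdso_looks_like_elf());
}
```

A result of 0 from vdso_base() simply means that no vDSO was advertised to the process, in which case a C library has to fall back to a direct trap.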

You might find my klibc -- an extremely minimal Linux libc --
interesting for a more streamlined interface from the C API layer to the
kernel system calls.

http://git.kernel.org/?p=libs/klibc/klibc.git;a=summary

-hpa

From: NathanCBaker on
On Aug 29, 8:17 pm, "H. Peter Anvin" <h...(a)zytor.com> wrote:
> Frank Kotler wrote:
> > Herbert Kleebauer wrote:
>
> > ...
> >> But how could they write the C library if there is no documentation
> >> of the int80 interface.
>
> > AFAIK, the C library doesn't use the int 80h interface...
>
> Yes it does.

I believe Frank is alluding to a recent discussion with Randall Hyde
on the aoaprogramming list about his use of the C library as a "back
end" for newer versions of the HLA Stdlib. IIRC, Randy maintained
that the C library is more tightly integrated with the kernel and not
just a simple "wrapper" for the int80 interface.

Nathan.