From: BruceMcF on 13 Apr 2008 19:18
On Apr 12, 8:26 pm, Harry Potter <maspethro...(a)aol.com> wrote:
> Do you include *every* function in a library, or just the necessary
> ones? cc65 seems to include just the necessary ones, although some
> fat (i.e. the initmainargs assembler function, which processes the
> command line, shortened to pushing two NULLs on the stack) could still
> be cut.

No, just the ones you reference, directly or indirectly. The killer with the modern desktops was calling a function, which itself calls five functions, which themselves call twenty functions, yadda yadda yadda, and the program is massive. That was fixed with dynamic link libraries, which can share the same library code. But then one dynamic link library (.dll or .so) calls another one, and you are off to the races again.

One of the ongoing developments in support of floppy-disk and other tiny Linuxen is developing C libraries that are more fine-grained in terms of what is dragged in with the functions that you use.

The main thing for your situation is, if you call a function that can also cook dinner and wash dishes afterward, like printf, then *even if you don't ask it to do that*, the capability comes along for the ride. So as a rule of thumb, use the function call that does the least in addition to what you need, and it is likely to pull less additional stuff in with it.
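To make the rule of thumb concrete, here is a minimal sketch in standard C that prints a labelled number using only fputs plus a small hand-written decimal conversion. The helper names u_to_dec and print_labelled are made up for this illustration and are not part of cc65 or any library. How many bytes this actually saves depends on the compiler and its runtime library, so treat it as something to measure rather than a guarantee.

```c
#include <assert.h>
#include <stdio.h>

/* Three decimal digits per byte is always enough, plus one for the NUL. */
#define DEC_BUF_SIZE (sizeof(unsigned int) * 3 + 1)

/* Hypothetical helper: writes v in decimal at the tail of buf and
   returns a pointer to the first digit. */
static const char *u_to_dec(unsigned int v, char buf[DEC_BUF_SIZE])
{
    char *p = buf + DEC_BUF_SIZE - 1;

    *p = '\0';
    do {
        *--p = (char)('0' + (int)(v % 10u));
        v /= 10u;
    } while (v != 0u);
    return p;
}

/* Prints "label: value" and a newline using only fputs, so this one
   job does not need a format-string interpreter. */
static void print_labelled(const char *label, unsigned int value)
{
    char buf[DEC_BUF_SIZE];

    assert(label != NULL);
    fputs(label, stdout);
    fputs(": ", stdout);
    fputs(u_to_dec(value, buf), stdout);
    fputs("\n", stdout);
}
```

A call such as print_labelled("score", 1234) covers the common "print one number" case. The trade-off is that every format you need has to be written by hand, which only pays off when the program uses very few of them.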
From: Harry Potter on 18 Apr 2008 18:47
On Apr 13, 3:44 pm, Harry Potter <maspethro...(a)aol.com> wrote:
> On Apr 13, 3:00 pm, BruceMcF <agil...(a)netscape.net> wrote:
>
> > Rather than try to replace cc65, improve it. Then each feature on your
> > list is added to what cc65 already provides.
>
> I understand. Thank you.

Let me add to that. I have some ideas that I think would improve compiled C code:

* Calculation optimizations: All work would be done at compile time, and only actual data modifications would be written to code.
* Stack optimizations: Instead of using the stack to save intermediate results, I could use, on the inner-most ones, a zeropage environment variable.
* Pointer/index: If possible, zeropage pointers would be referenced directly or through an index.
* __fastcall and register declarations: I don't know if other 6502 C compilers do this, but I plan to allocate three words in zp for register and __fastcall declarations and use unused environment space for excess register declarations.

What do you think?
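As an illustration of the first idea in the list, the sketch below shows the kind of arithmetic a compiler can finish at compile time. The screen geometry is the C64 default text screen (40 by 25 cells at $0400); the status-line layout and the names STATUS_ROW, STATUS_ADDR, status_shadow and status_cell_addr are assumptions made up for this example.

```c
#include <assert.h>

/* C64 default text screen: 40 columns by 25 rows starting at $0400. */
enum {
    SCREEN_BASE = 0x0400,
    SCREEN_COLS = 40,
    SCREEN_ROWS = 25
};

/* Hypothetical layout choice: the bottom row is a status line. */
enum {
    STATUS_ROW  = SCREEN_ROWS - 1,
    STATUS_ADDR = SCREEN_BASE + STATUS_ROW * SCREEN_COLS
};

/* An array size at file scope must be a constant expression, so this
   line only compiles if the compiler does the arithmetic itself. */
static unsigned char status_shadow[SCREEN_COLS * (SCREEN_ROWS - STATUS_ROW)];

/* Only the column varies at run time, so a folding compiler needs a
   single addition against the precomputed STATUS_ADDR. */
static unsigned int status_cell_addr(unsigned int col)
{
    assert(col < (unsigned int)SCREEN_COLS);
    return (unsigned int)STATUS_ADDR + col;
}
```

The named constants cost nothing in the generated code: STATUS_ADDR is simply the number 0x07C0 by the time code generation starts.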
From: BruceMcF on 18 Apr 2008 23:24
On Apr 18, 6:47 pm, Harry Potter <maspethro...(a)aol.com> wrote:
> On Apr 13, 3:44 pm, Harry Potter <maspethro...(a)aol.com> wrote:
>
> > On Apr 13, 3:00 pm, BruceMcF <agil...(a)netscape.net> wrote:
>
> > > Rather than try to replace cc65, improve it. Then each feature on your
> > > list is added to what cc65 already provides.
>
> > I understand. Thank you.
>
> Let me add to that. I have some ideas that I think would improve
> compiled C code:
>
> * Calculation optimizations: All work would be done at compile
> time, and only actual data modifications would be written to code.

Yes, that would be a normal optimization ... the first optimizations are the easiest, and then the more aggressively you pursue it, the smaller the returns, but given that the target is going to fit into a 64K address space it should be possible to be fairly aggressive.

> * Stack optimizations: Instead of using the stack to save
> intermediate results, I could use, on the inner-most ones, a zeropage
> environment variable.

OTOH, if the programmer has the best idea which inner routines are going to be called the most often, a register local variable allocation might be higher priority. It's a matter of balancing the zp used by the compiler and the zp locations left free to assembly language routines of the user ... given, of course, large chunks are in use by the Kernal.

> * Pointer/index: If possible, zeropage pointers would be referenced
> directly or through an index.

With the 65816, this is easier, because it has the richer set of stack-indexed addressing modes ... with a 6502 C, often the stack pointer is copied to the X index through TSX and the X-indexed address mode is used for local variables ... $101,X; $102,X and so on ... which means that zero page pointers through an index may involve register juggling.

OTOH, a small register set means that it's probably a bounded problem with an unambiguous solution, and you use whichever way is the best ... but there may have to be a speed/size priority setting.

> * __fastcall and register declarations: I don't know if other 6502
> C compilers do this, but I plan to allocate three words in zp for
> register and __fastcall declarations and use unused environment space
> for excess register declarations.

This seems like the biggest bang for the buck ... straightforward and much easier to manage, including direct use as pointers rather than copying from the stack to a work location in the zero page for indirect addressing. Where is the environment in cc65?

> What do you think?

I think it's going more deeply into C compiler programming than I'm of a mind to go, but especially the last one looks compelling.
From: Michael J. Mahon on 19 Apr 2008 02:32
BruceMcF wrote:
> On Apr 18, 6:47 pm, Harry Potter <maspethro...(a)aol.com> wrote:
>
>>On Apr 13, 3:44 pm, Harry Potter <maspethro...(a)aol.com> wrote:
>>
>>>On Apr 13, 3:00 pm, BruceMcF <agil...(a)netscape.net> wrote:
>>
>>>>Rather than try to replace cc65, improve it. Then each feature on your
>>>>list is added to what cc65 already provides.
>>
>>>I understand. Thank you.
>>
>>Let me add to that. I have some ideas that I think would improve
>>compiled C code:
>>
>>* Calculation optimizations: All work would be done at compile
>>time, and only actual data modifications would be written to code.
>
> Yes, that would be a normal optimization ... the first optimizations
> are the easiest, and then the more aggressively you pursue it, the
> smaller the returns, but given that the target is going to fit into a
> 64K address space it should be possible to be fairly aggressive.

Certainly doing constant computations at compile time is a good start--and it permits the use of well-named constants in expressions without compromising run-time efficiency.

>>* Stack optimizations: Instead of using the stack to save
>>intermediate results, I could use, on the inner-most ones, a zeropage
>>environment variable.
>
> OTOH, if the programmer has the best idea which inner routines are
> going to be called the most often, a register local variable
> allocation might be higher priority. It's a matter of balancing the zp
> used by the compiler and the zp locations left free to assembly
> language routines of the user ... given, of course, large chunks are
> in use by the Kernal.

For a 6502, zero page *is* the "registers". Nothing of interest to a compiler--like pointers or integers--will fit in the processor's registers.

>>* Pointer/index: If possible, zeropage pointers would be referenced
>>directly or through an index.
>
> With the 65816, this is easier, because it has the richer set of stack
> indexed addressing ... with a 6502 C, often the stack pointer is
> copied to the X index through TSX and the X-indexed address mode is
> used for local variables ... $101,X; $102,X and so on ... which means
> that zero page pointers through an index may involve register
> juggling.

Using a 256-byte stack for both return addresses and parameters is really pushing it. ;-)

The parameter & local variable stack should almost certainly be a memory structure pointed to by a zero-page pointer. Having it grow "against" the heap is pretty standard.

> OTOH, a small register set means that it's probably a bounded problem
> with an unambiguous solution, and you use whichever way is the
> best ... but there may have to be a speed/size priority setting.

Zero page has room for several "general registers" that could hold interesting data types.

>>* __fastcall and register declarations: I don't know if other 6502
>>C compilers do this, but I plan to allocate three words in zp for
>>register and __fastcall declarations and use unused environment space
>>for excess register declarations.
>
> This seems like the biggest bang for the buck ... straightforward to
> manage and much easier to manage, including direct use as pointers
> rather than copying from the stack to a work location in the zero page
> for indirect addressing.

And an option for inlining small functions can also be a big winner, and a very effective time-space tradeoff.

-michael

NadaPong: Network game demo for Apple II computers!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design tool--and it's seriously underused."
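The "stack grows against the heap" layout mentioned above can be modelled on any host in plain C. This is only a sketch under stated assumptions: arena stands in for free RAM, sp for the zero-page stack pointer, and the names heap_alloc, frame_push and frame_pop are made up for the illustration rather than taken from cc65 or any other runtime.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 256u

/* Stand-in for free RAM. On a real 6502 target, sp would be a
   two-byte pointer kept in zero page. */
static unsigned char arena[ARENA_SIZE];
static size_t heap_top = 0;          /* heap grows upward from 0      */
static size_t sp = ARENA_SIZE;       /* stack grows downward from top */

/* Reserve n bytes of heap; NULL if that would run into the stack. */
static void *heap_alloc(size_t n)
{
    void *p;

    if (n > sp - heap_top) {
        return NULL;
    }
    p = &arena[heap_top];
    heap_top += n;
    return p;
}

/* Push a frame of n bytes for parameters and locals; NULL if the
   frame would run into the heap. */
static unsigned char *frame_push(size_t n)
{
    if (n > sp - heap_top) {
        return NULL;
    }
    sp -= n;
    return &arena[sp];
}

/* Drop the most recent frame of n bytes. */
static void frame_pop(size_t n)
{
    assert(n <= ARENA_SIZE - sp);
    sp += n;
}
```

Because both regions draw on the same gap, neither needs a fixed size up front; the only failure case is the two ends meeting, which both allocators check for.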
From: BruceMcF on 19 Apr 2008 11:54
On Apr 19, 2:32 am, "Michael J. Mahon" <mjma...(a)aol.com> wrote:
> For a 6502, zero page *is* the "registers". Nothing of interest to
> a compiler--like pointers or integers--will fit in the processor's
> registers.

Yes, but at the same time, much of low level interfacing with the Kernal from C involves treating zero page locations as either a global integer variable or global pointer variable. I took it that he was referring to that kind of use of the zero page ... the zero page *used as* memory that is quicker to get to and quicker to use when dereferencing a pointer.

> Using a 256-byte stack for both return addresses and parameters is
> really pushing it.

Yeah, I can see a pointer to a software stack structure passed on the hardware stack before making the 6502 call. With that, you only use TSX one time, at the start of a called function, to set up the local variable pointer in the zp, and possibly at the end if that slot in the hardware stack is used, for ANSI C returns, as the pointer to the return value, or for K&R returns, for the return value itself.

So X would be free during the routine, but Y would be in heavy use. The general point stands ... without the built-in stack frame address modes of the 65816, there's going to be one of the two index registers that is in use to reference local variables.

> And an option for inlining small functions can also be a big winner,
> and a very effective time-space tradeoff.

Yes, I overlooked this entirely, but there are a lot of optimizations available for a zp routine that are not available outside the zp.
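The "set up the local variable pointer once, then index off it" pattern can be sketched as a host-side C model. This is not 6502 code and not how any particular compiler lays out its frames: sw_stack, sw_sp, the frame layout and the functions add8 and call_add8 are all assumptions made up for the example, with the constant offsets playing the role that Y plays in (zp),Y addressing.

```c
#include <assert.h>

/* Simulated software stack; sw_sp plays the role of the zero-page
   stack pointer and grows downward. */
static unsigned char sw_stack[64];
static unsigned char sw_sp = sizeof sw_stack;

/* Hypothetical frame layout for add8, as offsets from the frame base. */
enum { ARG_A = 0, ARG_B = 1, RESULT = 2, FRAME_SIZE = 3 };

/* Callee: capture the frame base once on entry, then reach both
   arguments and the return slot by constant offsets from it. */
static void add8(void)
{
    unsigned char *fp = &sw_stack[sw_sp];   /* done once per call */

    fp[RESULT] = (unsigned char)(fp[ARG_A] + fp[ARG_B]);
}

/* Caller: build the frame, call, read the result, drop the frame. */
static unsigned char call_add8(unsigned char a, unsigned char b)
{
    unsigned char result;

    assert(sw_sp >= FRAME_SIZE);
    sw_sp = (unsigned char)(sw_sp - FRAME_SIZE);
    sw_stack[sw_sp + ARG_A] = a;
    sw_stack[sw_sp + ARG_B] = b;
    add8();
    result = sw_stack[sw_sp + RESULT];
    sw_sp = (unsigned char)(sw_sp + FRAME_SIZE);
    return result;
}
```

In this model the callee never touches the stack pointer after entry, which mirrors the point that one index register stays tied up addressing locals while the other remains free.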