From: Lucretia on
On May 9, 9:48 pm, Robert A Duff <bobd...(a)shell01.TheWorld.com> wrote:
> Presumably by storing information (symbol tables and whatnot) in
> permanent disk files. If you do that, I think you should design it as a
> _pure_ optimization. That is, the system should behave exactly as if
> everything is compiled from source every time, except that it's faster.
> Don't mimic the Ada 83 compilers that had a notion of "compiling things
> into the library", where a source file sitting right there in the source
> directory is ignored, unless "compiled into...".

Yes, I have no intention of requiring things to be added to the library
before anything can compile. I'm looking at a hybrid of what GNAT does,
combined with the ability to use precompiled units in binary form.

> If you do this, you need to design a permanent (on disk) representation
> that is small. I think it's possible, but it's not easy -- the
> "obvious" representation of symbol tables will be 10 times larger than
> the source code. If it's 10 times larger, then it defeats the purpose
> -- reading it in from disk will be slower than re-analyzing the original
> source code.
>
> You also suggested storing the info in the object files. Yes, that is
> possible, given a reasonable object file format that allows arbitrary
> information to be stored, in addition to the actual machine code.

DWARF?

> Building a complete Ada implementation is a daunting task (many
> person-years). You said you're doing a subset of Ada 2005, and I assume
> you're doing this "just for fun". Keep your subset small, or you will
> never finish.

Nope, just a subset.

Luke.

From: Lucretia on
On May 9, 10:15 pm, Gautier <gaut...(a)fakeaddress.nil> wrote:
> Hello,
>
> You seem to be looking for a model where each unit compilation ends in a file
> that contains, roughly, the following information (with overlaps):
> - the specification, "digested"
> - the contents of the .ali file
> - unit dependencies
> - time stamps to check need of recompilations
> - the compiled code (the .o object file itself).
>
> This model exists and works fine (esp. fewer files and much faster
> compilation!); it is that of Turbo Pascal's .TPU files (~1988...) and its
> descendants (TPW, Delphi). You just have to take a close look at it. No need
> to reinvent anything...

I'd forgotten about this. Do you have any pointers?

Thanks,
Luke.
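The .TPU-style unit file Gautier describes could be sketched roughly as follows (a hypothetical layout in Python, not the actual Turbo Pascal format; all names and fields are illustrative):

```python
import hashlib
import time

class UnitFile:
    """One file per compiled unit, bundling everything the compiler
    needs so the source never has to be reparsed (hypothetical layout)."""
    def __init__(self, name, digested_spec, deps, object_code, source_text):
        self.name = name
        self.digested_spec = digested_spec  # "digested" spec: symbol info
        self.deps = deps                    # {unit_name: checksum when built}
        self.object_code = object_code      # the .o bytes themselves
        self.timestamp = time.time()
        self.checksum = hashlib.md5(source_text.encode()).hexdigest()

    def is_stale(self, library):
        """Recompile if any with'ed unit changed since this was built."""
        return any(library[d].checksum != chk for d, chk in self.deps.items())

# Usage: build two units, then change the dependency's source.
lib = {}
lib["strings"] = UnitFile("strings", {"Length": "function"}, {},
                          b"\x90", "package Strings ...")
main = UnitFile("main", {}, {"strings": lib["strings"].checksum},
                b"\x90", "with Strings; ...")
print(main.is_stale(lib))   # False: dependency unchanged
lib["strings"] = UnitFile("strings", {"Length": "function"}, {},
                          b"\x90", "package Strings -- edited")
print(main.is_stale(lib))   # True: checksum differs, recompile main
```

The point of bundling spec, dependency checksums, and object code in one file is that a `with` clause needs exactly one read, with no separate ALI-style companion files.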

From: Dmitry A. Kazakov on
On 9 May 2007 10:21:20 -0700, Lucretia wrote:

> On May 9, 5:51 pm, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
> wrote:
>>> But I'm interested in seeing how I can get the compiler to not have to
>>> reparse other source files (when with'd) in order to compile a unit.
>>
>> Why should it? Certainly, you would have some intermediate representation
>> which includes symbolic tables and other stuff the compiler creates after
>> semantic analysis before code generation. A compiled library unit will have
>> this stored in some appropriate format.
>
> True, but is it possible to include dependency info within the object
> itself?

Sure, it is just another "with". Each "with" loads (maps into memory)
the corresponding precompiled unit (its compilation context). Some minor
relocation work will be needed, plus checksums and time stamps, as Gautier
mentioned. The symbol tables (actually contexts) can be organized as a
tree-stack to reduce loading overhead; "use" plants a tree in the forest. I
did such a thing once. I guess it would make table look-ups slower, so the
net effect on compilation speed is unclear until you actually build
the thing.
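A minimal sketch of such a tree of contexts (my own illustrative model, not Dmitry's actual implementation): each scope chains to its parent, a loaded precompiled unit is just a saved scope, and "use" grafts that scope into the visible set; lookup walks locals, then used scopes, then the parent chain.

```python
class Scope:
    """A compilation context: local symbols plus a parent chain.
    'use' grafts another unit's top scope into the visible set."""
    def __init__(self, parent=None):
        self.symbols = {}
        self.parent = parent
        self.used = []          # scopes planted here by 'use' clauses

    def define(self, name, info):
        self.symbols[name] = info

    def lookup(self, name):
        # Local declarations hide everything else.
        if name in self.symbols:
            return self.symbols[name]
        # Then the scopes made directly visible by 'use'.
        for u in self.used:
            if name in u.symbols:
                return u.symbols[name]
        # Finally the enclosing scope chain.
        return self.parent.lookup(name) if self.parent else None

# A precompiled unit is a saved Scope that later compilations map in.
text_io = Scope()
text_io.define("Put_Line", "procedure")
main = Scope()
inner = Scope(parent=main)
inner.used.append(text_io)          # 'use Ada.Text_IO;'
print(inner.lookup("Put_Line"))     # procedure
```

The extra indirection through `used` scopes is exactly where the slower look-ups Dmitry mentions would come from: each failed local probe fans out over every planted tree.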

>> An interesting issue is cross-platform library units, I was playing with
>> this idea, but came to no conclusion.
>
> LLVM would be capable of this.

Then it is not cross-platform; I mean the only platform here is the VM.
Otherwise, it could be doable. Though many static things which are
platform-dependent will cease to be compile-time static. So you will be
unable to have certain things fully precompiled.

> Why? Are you saying that it would be better to store libraries in IR
> rather than binary form?

IR = infra red?

>>> I don't want to have to have
>>> tons of source files lying around for a static/shared library, I
>>> honestly don't see the need. Surely, it's possible to build a compiler
>>> that can get the information it needs from the library itself (or at
>>> least a companion library rather than a ton of different ALI files)?
>>
>> Or a database, for that matter...
>
> Like the old Ada library?

It wasn't that bad. DEC Ada had a library, if I remember correctly. With an
integrated source control system it might be more attractive than the
GNAT model. If I were designing an Ada tool-chain (of which the compiler is
only a part), I would probably choose this.

> I was actually thinking, if I couldn't embed
> the other information required by the compiler into the final object,
> then a separate "lib" containing only this information would be
> necessary.

You mean symbolic info for the debugger?

>>> Now, I'd really like to hear from people who have implemented an Ada
>>> compiler and people who have used other compilers (I've only used
>>> GNAT). Basically, I'm interested in how other implementations handle
>>> the library. Note that there are 2 ideas of library here:
>>
>>> 1) The standard Ada library.
>>> 2) Link/shared libraries found on operating systems.
>>
>> I remember RSX-11 in which macro and object libraries were handled by the
>> same librarian tool.
>
> RSX-11 is before my time, got any more info on this?

It was too primitive to be used for Ada, but the idea is basically the same.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Duncan Sands on
> > LLVM would be capable of this.
>
> Then is not a cross-platform, I mean the only platform here is the VM.
> Otherwise, it could be doable. Though many static things which are
> platform-dependent will cease to be compile-time static. So you will be
> unable to have certain things fully precompiled.

LLVM, in spite of the name, is not a virtual machine like Java. It is
a compiler, or rather a compiler library, with a platform-independent
IR. The gcc C and C++ front-ends have been ported to LLVM, i.e. the
LLVM optimizers and IR are used instead of the gcc ones, and I'm currently
porting the Ada front-end to it. The LLVM IR produced is not platform
independent, simply because the gcc trees produced by the front-ends are
not platform independent.

Ciao,

Duncan.
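The reason the emitted IR is target-specific can be seen with one query: a front-end folds things like C's sizeof(long) or Ada's Long_Integer'Size to a literal before the optimizers ever run, so the IR bakes in one data model. The same query, illustrated here in Python via ctypes rather than a real front-end:

```python
import ctypes

# A front-end evaluates sizeof(long) at compile time and emits the
# resulting literal into the IR: 4 on an ILP32 target (or Win64),
# 8 on LP64. Once folded, the IR can no longer be retargeted.
print(ctypes.sizeof(ctypes.c_long))
```

A truly cross-platform unit format would have to keep such queries symbolic, which is exactly why "many static things cease to be compile-time static" above.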
From: Stefan Bellon on
Gautier wrote:

> - time stamps to check need of recompilations

I've always wondered about the focus on time stamps. I think the way
to do it would be to calculate a hash (MD5, SHA-1, ...) over the token
stream with comments stripped. That way you wouldn't have to recompile
after layout or comment changes, and merely touching the file wouldn't
inadvertently trigger a recompilation.
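The token-stream hash could be sketched like this (using Python's own tokenizer in place of an Ada lexer; which token kinds to skip is an assumption for illustration):

```python
import hashlib
import io
import tokenize

def source_fingerprint(source):
    """Hash the token stream, skipping comments and pure layout tokens,
    so reformatting or re-commenting never changes the fingerprint.
    Skipping INDENT/DEDENT is fine for an Ada-like language where
    layout is insignificant (it would NOT be safe for Python itself)."""
    h = hashlib.sha1()
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in skip:
            h.update(tok.string.encode())
            h.update(b"\x00")   # delimiter so 'ab','c' != 'a','bc'
    return h.hexdigest()

a = source_fingerprint("x = 1\ny = x + 2\n")
b = source_fingerprint("# tweaked\nx   = 1\ny = x   + 2   # same code\n")
print(a == b)   # True: only comments and spacing changed
```

Two sources that differ only in comments or whitespace hash identically, so the stored fingerprint, not the file's mtime, decides whether dependents need recompiling.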

In fact, this is the idea behind the "compilercache" project for C and C++:
it intercepts calls to gcc/g++, computes a hash over the
concatenation of the command line (minus a few switches that do not
influence code generation) and the preprocessed source, and then stores
the resulting object file in the cache under the name of the hash value.
If a compilation with the same hash value comes along again, the object
file is fetched from the cache and a lot of compilation time is saved.
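The compilercache scheme reduces to a few lines (a sketch, not the real tool; which switches count as non-codegen is an illustrative assumption):

```python
import hashlib

class ObjectCache:
    """compilercache-style sketch: key = hash of the command line
    (minus switches that don't affect codegen) plus the preprocessed
    source; value = the previously produced object code."""
    IGNORED = {"-v", "-pipe"}   # illustrative non-codegen switches

    def __init__(self):
        self.store = {}

    def key(self, argv, preprocessed):
        relevant = " ".join(a for a in argv if a not in self.IGNORED)
        return hashlib.sha1(
            (relevant + "\n" + preprocessed).encode()).hexdigest()

    def compile(self, argv, preprocessed, real_compile):
        k = self.key(argv, preprocessed)
        if k not in self.store:            # miss: really run the compiler
            self.store[k] = real_compile(preprocessed)
        return self.store[k]               # hit: reuse cached object file

calls = []
def fake_compile(src):
    calls.append(src)
    return b"OBJ:" + src.encode()

cache = ObjectCache()
cache.compile(["gcc", "-O2"], "int main(){}", fake_compile)
cache.compile(["gcc", "-O2", "-v"], "int main(){}", fake_compile)
print(len(calls))   # 1: second request served from the cache
```

Hashing the preprocessed source (rather than the raw file) means header changes are captured automatically, while adding an ignored switch like `-v` still hits the cache.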

--
Stefan Bellon