From: George Neuner on
On Thu, 15 Jul 2010 12:29:52 -0700 (PDT), jacko <jackokring(a)gmail.com>
wrote:

>On Jul 15, 7:35�pm, George Neuner <gneun...(a)comcast.net> wrote:
>
>> I don't see how the "FIFOO" described in the link would help cache
>> pollution in GC (any GC). �Since marking and copying are both
>> read-only (no modify) operations, the best solution is to fetch the
>> value(s) into the L1 cache bypassing the other levels, and for copying
>> or compacting GC to write it out again bypassing the other levels. �I
>> don't know of a workable solution for sweeping which does modify
>> structures.
>
>Marking involves a WRITE.

Yes ... but not necessarily a write to the object. A well designed
tracer will use a separate data structure - typically a bit map - to
mark objects. This both avoids dirtying VM pages unnecessarily and
improves the locality of the working set while tracing is happening
(which may be concurrent with mutator execution).


>Mark sweep involves marking the whole active list node set. Three
>total traversals or two if a binary bool marker is used.

No. General mark-sweep does not involve 3 traversals. If the marking
queue (or stack) is not exhausted, the entire process involves 1 walk
of the entire heap + a walk through the live data ... yielding at most
2 traversals.

If the mark queue is exhausted during the trace, there may be some
additional small walks the sum of which is bounded by the ratio of the
mark queue to the size of the heap.

You need to brush up on GC ... here is a link to the bible:
http://www.amazon.com/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484




>> ISTM that using an object table to indirectly reference objects with
>> multiple parents is a better solution ... delete/sweep of an object
>> stops where it chases a pointer into the object table.
>
>Apart from TABLE being associated with heap compaction/fragmentation
>issues, could you elaborate. To me it seems like any node which has
>more than one referrer, referers to it through a handle in the table.
>What of just one referrer and the extra bool? How is in table
>detected? How are circular references handled or avoided? (by a handle
>of course, but...) How is self reference handled or avoided? If I
>understand you it's about kind of a join node inserted transparently
>agregated in the table with all the other join nodes, and including a
>referer count and a node class indication.

The point is either a) to prevent data structures from being shared or
self-referencing, or b) to permit sharing/self-reference but be able
to easily detect it so you don't accidentally delete shared structure.

Using an object "root" table accomplishes either goal. Basically you
divide your data structures into groups of objects that comprise
(potentially) shareable "blocks" and deal with them at the block level
rather than as individual objects. Whenever you encounter the need to
share part of an existing data structure, you lift the shared part
into a separate root level object and create a table entry for it.

You steal 1 bit of each pointer to distinguish whether the value is a
regular pointer or a pointer to a shared root object. Depending on
whether your goal is "preventing" or just "detecting", the marked
pointer value itself either can refer to a root table entry or can
point directly to the shared structure.
(read more about "1-bit" reference counting in Jones & Lins).

You can mix regular and marked pointers in data structures. Traversal
goes as expected: if the marked pointer is direct you just ignore the
mark bit - if marked pointers are really handles then you need to
fetch the table entry to continue. However, if you make a copy of a
regular (unmarked) pointer you must turn the target into a root object
and install marked pointers at both the old and new locations.

When deleting a structure (list, tree, whatever), you can safely
delete/recycle anything that is referred to by a regular pointer, but
when you encounter a marked pointer you have to stop and deal with a
potentially shared object. You can deal with sharing by including a
reference count in the table entry or by periodically tracing from the
table and purging any root objects that aren't referenced by others.


>So that forces some kind of count and class per node rather than per
>structure. So more than 2 machine words per node?

No. You have a table entry which is one or two words per group of
linked objects plus 1 bit in each pointer to distinguish a regular
pointer from a shared pointer.


This is explained pretty thoroughly in Jones & Lins, although the
discussions of reference counting and uses of object tables are not
directly connected. Jones & Lins is a concept survey rather than a
tutorial so you frequently have to put the pieces together.

You might also take a look at Henry Baker's paper: "Linear Logic and
Permutation Stacks -- The Forth Shall Be First" in which he describes
1-bit refcounting and dealing with shared objects without knowing how
many owners they have.


Of course, this is all academic and unnecessary if you have "real" GC.
Avoidance of sharing is an unnecessary burden on a programmer and
reference counting in almost any form is little more than a hack.

George
From: George Neuner on
On Thu, 15 Jul 2010 12:46:20 -0700 (PDT), jacko <jackokring(a)gmail.com>
wrote:

< big snip >

You're repeating yourself.

George
From: jacko on
> Of course, this is all academic and unnecessary if you have "real" GC.
> Avoidance of sharing is an unnecessary burden on a programmer and
> reference counting in almost any form is little more than a hack.

Real time... AND this is important, because all circular references
can be virtualized via a FIFOO into a self reference the cycle
detection process becomes O(n) for self reference on the FIFOO size n
with no WRITE marking, by one traversal of the singular FIFOO as any
indirect chase is mearly a tail test. An these can be totally
decounted or added ordered in the head ring for next time round speed
up. This can give priority to found loops, especially on 'dead'
FIFOOs.

For mutually recursive n-ary references, things are more complex, but
an active referer chain stored instead of the active head count in the
list hanging off the tail of the ring of heads may help,

Cheers Jacko

From: jacko on
Also trating the link pointer as a system 'kernal?' variable with no
user level setting, can remove TLB phases from operations.
From: jacko on
On Jul 16, 6:09 pm, George Neuner <gneun...(a)comcast.net> wrote:
> On Thu, 15 Jul 2010 12:46:20 -0700 (PDT), jacko <jackokr...(a)gmail.com>
> wrote:
>
>  <  big snip  >
>
> You're repeating yourself.
>
> George

C is old, an has the big C.