From: Tim Little on
On 2010-07-12, dhruvbird <dhruvbird(a)gmail.com> wrote:
> On the contrary, pure LFU is based on access count only

Pure LFU is based on access count for *all* objects (not just those in
cache), for the *whole period* of the cache's operation (not just
since an object last entered cache). That usually involves
impractically large space, often much larger than the object cache
itself.

Various practical adaptations of LFU exist that address this by
attempting to *estimate* frequency of use, while limiting space to a
fixed size. These usually require more advanced algorithms and data
structures.
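
To make that concrete, here is a rough Python sketch of the ageing idea
(illustrative only: the class name, the halving schedule and the O(n)
eviction scan are simplifications, not any particular cache's code).
Counts are kept only for cached keys and are periodically halved, so
stale popularity decays and the counters track something closer to a
frequency:

class AgedLFUCache:
    """Fixed-space cache that approximates frequency of use: counts are
    kept only for cached keys and periodically halved (ageing)."""

    def __init__(self, capacity, decay_every=1000):
        self.capacity = capacity
        self.decay_every = decay_every
        self.data = {}       # key -> value
        self.count = {}      # key -> decayed access count
        self.accesses = 0

    def _maybe_decay(self):
        self.accesses += 1
        if self.accesses % self.decay_every == 0:
            # Ageing step: halve every counter so old popularity fades.
            for k in self.count:
                self.count[k] //= 2

    def get(self, key):
        self._maybe_decay()
        if key in self.data:
            self.count[key] += 1
            return self.data[key]
        return None

    def put(self, key, value):
        self._maybe_decay()
        if key not in self.data and len(self.data) >= self.capacity:
            # Evict the key with the smallest decayed count.  An O(n)
            # scan keeps the sketch short; a real cache would do better.
            victim = min(self.count, key=self.count.get)
            del self.data[victim]
            del self.count[victim]
        self.data[key] = value
        self.count[key] = self.count.get(key, 0) + 1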

Your fixed-space algorithm has O(1) behaviour but is *not* LFU. It
does not even credibly attempt to estimate frequency of access.


- Tim
From: dhruvbird on
On Jul 13, 7:10 am, Tim Little <t...(a)little-possums.net> wrote:
> On 2010-07-12, dhruvbird <dhruvb...(a)gmail.com> wrote:
>
> > On Jul 12, 4:04 pm, Tim Little <t...(a)little-possums.net> wrote:
> >> On 2010-07-12, dhruvbird <dhruvb...(a)gmail.com> wrote:
>
> >> > While I agree with the point you make about LFU working in a certain
> >> > way, this is exactly how all currently implemented LFU algorithms
> >> > work.
>
> >> No, LFU caches periodically age entries so that the frequencies
> >> are updated.
>
> > On the contrary, pure LFU is based on access count only and the
> > algorithm you mention is actually a variation which involves ageing.
> >http://en.wikipedia.org/wiki/Least_Frequently_Used
>
> "This computer science article is a stub.  You can help Wikipedia by
> expanding it."  That rather markedly lowers my estimation of the
> reliability of that article.
>
> But even there it refers to use *frequency*, not just use count while
> in cache.  The single reference for that page states "correct
> implementation of LFU requires that replacement decisions are made
> based on frequency access information (popularity), *for all the
> objects accessed since the beginning of a cache's operation*."
> [Emphasis mine]
>
> Your algorithm does not do this, and so is not LFU.
>
> >http://en.wikipedia.org/wiki/Cache_algorithms#Least-Frequently_Used
>
> A vague 1-line summary that points to the above stub.

From: http://docs.jboss.org/jbosscache/1.4.0/TreeCache/en/html/eviction_policies.html
"Node usage starts at 1 when a node is first added. Each time it is
visited, the node usage counter increments by 1. This number is used to
determine which nodes are least frequently used."

Think about "frequently" as in "Frequently Asked Questions" rather
than "frequency" or "rate of occurrence".

Regards,
-Dhruv.
From: dhruvbird on
On Jul 13, 7:31 am, Tim Little <t...(a)little-possums.net> wrote:
> On 2010-07-12, dhruvbird <dhruvb...(a)gmail.com> wrote:
>
> > On the contrary, pure LFU is based on access count only
>
> Pure LFU is based on access count for *all* objects (not just those in
> cache), for the *whole period* of the cache's operation (not just
> since an object last entered cache).  That usually involves
> impractically large space, often much larger than the object cache
> itself.

Exactly, which is why the count is approximated by tracking only those
objects currently in the cache. I hope you see the practical value of
evicting the least "frequently" used items from the cache (along with
their counts).

>
> Various practical adaptations of LFU exist that address this by
> attempting to *estimate* frequency of use, while limiting space to a
> fixed size.  These usually require more advanced algorithms and data
> structures.

Not really. A simple binary heap suffices for most cases, and the
resulting complexity is O(log n). My solution, however, reduces this
to O(1) while keeping the behaviour the same.
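
For reference, one well-known way to get O(1) behaviour for this kind
of count-based eviction is to bucket keys by their current count and
track the minimum count. The Python sketch below is illustrative only,
not necessarily the exact structure I use; it just shows that O(1) is
achievable without a heap.

import collections

class O1LFUCache:
    """O(1) get/put count-based cache: a dict of key -> [value, count]
    plus a dict of count -> keys holding that count, with the minimum
    count tracked separately."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}                                 # key -> [value, count]
        self.buckets = collections.defaultdict(collections.OrderedDict)
        self.min_count = 0

    def _bump(self, key):
        value, count = self.entries[key]
        del self.buckets[count][key]
        if not self.buckets[count]:
            del self.buckets[count]
            if self.min_count == count:
                self.min_count = count + 1
        self.buckets[count + 1][key] = None
        self.entries[key] = [value, count + 1]

    def get(self, key):
        if key not in self.entries:
            return None
        self._bump(key)
        return self.entries[key][0]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.entries:
            self.entries[key][0] = value
            self._bump(key)
            return
        if len(self.entries) >= self.capacity:
            # Evict one key from the lowest-count bucket (FIFO within it).
            victim, _ = self.buckets[self.min_count].popitem(last=False)
            del self.entries[victim]
            if not self.buckets[self.min_count]:
                del self.buckets[self.min_count]
        self.entries[key] = [value, 1]
        self.buckets[1][key] = None
        self.min_count = 1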

>
> Your fixed-space algorithm has O(1) behaviour but is *not* LFU.  It
> does not even credibly attempt to estimate frequency of access.

Think about "frequently" as in "Frequently Asked Questions" rather
than "frequency" or "rate of occurrence".

Regards,
-Dhruv.
From: Tim Little on
On 2010-07-13, dhruvbird <dhruvbird(a)gmail.com> wrote:
> From: http://docs.jboss.org/jbosscache/1.4.0/TreeCache/en/html/eviction_policies.html
> "Node usage starts at 1 when a node is first added. Each time it is
> visited, the node usage counter increments by 1. This number is used to
> determine which nodes are least frequently used."

Also from that page: "wakeUpIntervalSeconds. This is the interval (in
seconds) to process the node events and also to perform sweeping for
the size limit and age-out nodes." So it does ageing, which is what
lets the counts estimate frequency of use. Yours does not estimate
frequency of use: a node that built up a large count long ago but is
rarely accessed now may never be evicted.
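
A toy illustration of the difference (standalone Python with made-up
numbers, not code from either cache): with raw counts an entry that was
hammered once is never the eviction victim, however long it sits idle;
with periodic halving its advantage fades.

counts = {"old_hot": 1000, "new_key": 3}

def decay(counts):
    # Ageing sweep: halve every counter.
    return {k: v // 2 for k, v in counts.items()}

# Raw counts: "old_hot" is never the eviction victim, however long it
# goes unused.
print(min(counts, key=counts.get))        # -> new_key

# With ageing, a few sweeps of steady use of "new_key" and no use of
# "old_hot" collapse the stale count.
for _ in range(8):
    counts = decay(counts)
    counts["new_key"] += 5                # steady recent use
print(counts)                             # old_hot is down to 3
print(min(counts, key=counts.get))        # -> old_hot, now evictable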


- Tim
From: Dann Corbit on
In article <da10bed9-6d80-4f30-a6b4-0fc8bb1fa3b7
@x18g2000pro.googlegroups.com>, dhruvbird(a)gmail.com says...
>
> > On Jul 12, 4:04 pm, Tim Little <t...(a)little-possums.net> wrote:
> > On 2010-07-12, dhruvbird <dhruvb...(a)gmail.com> wrote:
> >
> >
> > > While I agree with the point you make about LFU working in a certain
> > > way, this is exactly how all currently implemented LFU algorithms
> > > work.
> >
> > No, LFU caches periodically age entries so that the frequencies
> > are updated.
>
> On the contrary, pure LFU is based on access count only and the
> algorithm you mention is actually a variation which involves ageing.
> http://en.wikipedia.org/wiki/Least_Frequently_Used
> http://en.wikipedia.org/wiki/Cache_algorithms#Least-Frequently_Used

Did you bother to read those entries?

"Least frequently used
In computer science, the term "Least Frequently Used" (LFU) refers to a
cache algorithm for memory management. The expiration policy removes
entities from the cache that are used the least. If the use frequency of
each entity is the same, then they are expired by the Least Recently
Used (LRU) algorithm."

Here is the entry for the LRU algorithm:

"Least Recently Used (LRU): discards the least recently used items
first. This algorithm requires keeping track of what was used when,
which is expensive if one wants to make sure the algorithm always
discards the least recently used item. General implementations of this
technique require to keep "age bits" for cache-lines and track the
"Least Recently Used" cache-line based on age-bits. In such
implementation, every time a cache-line is used, the age of all other
cache-lines changes. LRU is actually a family of caching algorithms with
members including: 2Q by Theodore Johnson and Dennis Shasha and LRU/K by
Pat O'Neil, Betty O'Neil and Gerhard Weikum."

It is clear that time is involved.
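
For contrast with the count-based schemes discussed above, here is a
minimal Python sketch of LRU (illustrative only), with recency order
standing in for the "age bits" the article mentions:

import collections

class LRUCache:
    """LRU sketch: an OrderedDict keeps entries in recency order, which
    plays the role of the per-line "age bits" described above."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = collections.OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # touching an entry makes it newest
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # discard the least recently used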