From: Norbert Juffa on
"robert bristow-johnson" <rbj(a)audioimagination.com> wrote in message
news:21b0a3de-4c48-4364-90d1-540dbba8bc8c(a)q12g2000yqj.googlegroups.com...
> On Jul 23, 2:42 pm, "Norbert Juffa" <ju...(a)earthlink.net> wrote:
> > "Steve Pope" <spop...(a)speedymail.org> wrote in messagenews:i2cg29$hq6$1(a)blue.rahul.net...
> > > Norbert Juffa <ju...(a)earthlink.net> wrote:
> >
> > >>Below I demonstrate how this could be done for log2() in 8.24 fixed
> > >>point. Our input x is of the form 2^e*m, 1 <= m < 2, m = 1 + f. Using
> > >>a "count leading zeros" operation, we quickly establish e and f. Given
> > >>f, 0 <= f < 1, we compute the index i of the closest, smaller, table
> > >>entry in a table of values of log2(x) on [1,2). The difference between
> > >>the actual argument and the function argument corresponding to i shall
> > >>be d. Using the function value from the table, f0, as well as the next
> > >>two table entries, f1 and f2, we compute the coefficients a, b of a
> > >>parabola that fits through all three function values.
> >
> > > The above is a valuable technique, but times have changed and in many
> > > DSP scenarios these days one can just plug the mantissa into
> > > a (larger) LUT and be done with it. This applies to log(), square(),
> > > and similar functions.
> >
> ...
> >
> > For a table lookup method to work efficiently on most embedded processors
> > requires a high likelihood that the table elements are present in L1 cache.
> > Often there will be tables for division, sqrt(), exp(), log(), all of whom
> > would want to fit into L1 at the same time, leaving room for other data as
> > well. The L1 may be as small as 8KB or even 4KB, so keeping lookup tables
> > small seems very much desirable.
> >
> > To achieve accuracy to full 8.24 precision one needs a table of 258*4 bytes
> > (~1KB) each for both exp() and log() with quadratic interpolation. Table
> > storage will increase significantly if linear interpolation is used, even
> > if an economical scheme such as bipartite tables is used.
>
> i had to look up "bipartite table" to get an idea about what you're
> talking about. would this paper:
>
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.6368&rep=rep1&type=pdf
>
> be a good initial description of the technique? it's a little bit
> intriguing, it's based on the notion that the first derivative of a
> function changes much less in a segment of its domain than the "zeroth
> derivative" of the function.
>
> but i don't see quadratic interpolation in this. how is quadratic
> interpolation done? do you continue to divide the input word into
> three parts, use two tables, and have a quadratic function rather than
> a linear one (referring to Eq. 6 in the paper i cited above) for the
> least significant portion of the sum?


That paper looks like a reasonable starting point if you want to get acquainted
with bipartite tables. I first learned about them through papers by David Matula,
whose work is cited by the paper you dug up.

Bipartite tables are basically a more economical form of linear interpolation
that allows for smaller storage and rapid evaluation (two table lookups and
an addition). For hardware implementations, the two table lookups can happen in
parallel and only an adder is needed (multipliers being relatively costly). To
do linear interpolation without a multiplier, one keeps two tables: a base table
that tabulates function values at fairly wide intervals, and a table of offsets
relative to those base entries, sampled at narrower intervals in between.

Obviously the deltas stored in the second table are small and require fewer bits
of storage. Further compression is achieved by averaging the slopes of multiple
intervals of the first table, and thus using one section of the second table in
combination with multiple sections of the first table.

Bipartite tables are mostly a popular scheme for hardware implementations. For
example, we used them at AMD to implement the reciprocal and reciprocal square
root HW approximations for the 3DNow! technology. With reasonable amounts of ROM
storage, 16-17 bit accuracy could be achieved. One could use bipartite tables
in software, but the storage savings are more limited since one can adjust the
storage per element at best in increments of one byte. For example, one could
use ints for the base table and chars or shorts for the offset table.
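
To make the structure concrete, here is a small self-contained sketch (mine, not
the 3DNow! code) of a bipartite lookup for log2() on [1,2): a 12-bit fraction is
split into three 4-bit fields, the base table is indexed by the two upper fields,
the offset table by the upper and lower fields, and evaluation is just two loads
and an add. For clarity the tables hold doubles here; a space-conscious software
version would store them as ints and shorts as mentioned above. All names and
the 4/4/4 split are my own choices, purely for illustration.

    #include <stdio.h>
    #include <math.h>

    #define HI_BITS  4
    #define MID_BITS 4
    #define LO_BITS  4
    #define HI  (1 << HI_BITS)
    #define MID (1 << MID_BITS)
    #define LO  (1 << LO_BITS)

    static double base_tab[HI][MID];  /* coarse function values            */
    static double offs_tab[HI][LO];   /* shared linear corrections         */

    static void build_tables (void)
    {
        const double kc = (LO - 1) / 2.0;      /* center of the low field  */
        for (int i = 0; i < HI; i++) {
            for (int j = 0; j < MID; j++) {
                /* base entry: log2() sampled with low field at its center */
                double x = 1.0 + i / (double)HI + j / (double)(HI * MID)
                               + kc / (double)(HI * MID * LO);
                base_tab[i][j] = log2 (x);
            }
            /* one slope per coarse segment, reused for all middle fields  */
            double xc = 1.0 + (i + 0.5) / HI;
            double slope = 1.0 / (xc * log (2.0));  /* d/dx log2(x)        */
            for (int k = 0; k < LO; k++) {
                offs_tab[i][k] = slope * (k - kc) / (HI * MID * LO);
            }
        }
    }

    /* evaluation: two table lookups and an addition, no multiplier needed */
    static double bipartite_log2 (unsigned frac12) /* 12-bit fraction of m */
    {
        unsigned i =  frac12 >> (MID_BITS + LO_BITS);
        unsigned j = (frac12 >> LO_BITS) & (MID - 1);
        unsigned k =  frac12 & (LO - 1);
        return base_tab[i][j] + offs_tab[i][k];
    }

    int main (void)
    {
        build_tables ();
        unsigned frac = 0x6a5;                 /* m = 1 + 0x6a5/4096       */
        double m = 1.0 + frac / 4096.0;
        printf ("approx = %.6f  exact = %.6f\n",
                bipartite_log2 (frac), log2 (m));
        return 0;
    }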

For 24-bit accuracy, the ROM size required for bipartite tables becomes somewhat
prohibitive, which is why modern GPUs use quadratic interpolation to generate
several functions accurate to approximately single precision, such as reciprocal,
reciprocal square root, exp2(), log2(), sin(), and cos(). See:

Stuart Oberman and Michael Siu. "A high-performance area-efficient multifunction
interpolator". In Proceedings of the 17th IEEE Symposium on Computer Arithmetic,
pages 272-279, July 2005.
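
For reference, below is a rough sketch of the quadratically interpolated log2()
described at the top of the thread: normalize with a count-leading-zeros
operation, read three consecutive table entries f0, f1, f2, and fit a parabola.
The table and the interpolation are kept in double here for readability; a
production 8.24 version would store the 258 entries as 32-bit fixed-point values
(the ~1KB mentioned earlier) and do the arithmetic in integers. I am assuming a
GCC-style __builtin_clz; names and parameter choices are mine.

    #include <stdint.h>
    #include <stdio.h>
    #include <math.h>

    #define FRAC_BITS 24
    #define N         256             /* intervals on [1,2)                */

    static double tab[N + 2];         /* 258 entries: log2(1 + i/N)        */

    static void build_table (void)
    {
        for (int i = 0; i < N + 2; i++) tab[i] = log2 (1.0 + i / (double)N);
    }

    /* log2() of an unsigned 8.24 fixed-point x > 0, result in s8.24 */
    static int32_t fixed_log2 (uint32_t x)
    {
        /* normalize: x = 2^e * m, 1 <= m < 2, m = 1 + f, via leading zeros */
        int p = 31 - __builtin_clz (x);   /* bit position of the leading 1  */
        int e = p - FRAC_BITS;            /* unbiased exponent              */
        uint32_t m = (p >= FRAC_BITS) ? (x >> (p - FRAC_BITS))
                                      : (x << (FRAC_BITS - p));
        uint32_t f = m & ((1u << FRAC_BITS) - 1);   /* 24-bit fraction      */

        int i = (int)(f >> 16);           /* closest smaller table entry    */
        double d = (f & 0xFFFF) / (double)(1u << FRAC_BITS); /* distance    */
        double h = 1.0 / N;               /* table spacing                  */

        /* coefficients of the parabola through three consecutive entries  */
        double f0 = tab[i], f1 = tab[i + 1], f2 = tab[i + 2];
        double b = (4.0 * f1 - 3.0 * f0 - f2) / (2.0 * h);
        double a = (f2 - 2.0 * f1 + f0) / (2.0 * h * h);
        double r = e + f0 + (b + a * d) * d;

        return (int32_t)lrint (r * (1 << FRAC_BITS));
    }

    int main (void)
    {
        build_table ();
        uint32_t x = 10u << FRAC_BITS;    /* 10.0 in u8.24                  */
        printf ("log2(10.0) ~ %.7f\n",
                fixed_log2 (x) / (double)(1 << FRAC_BITS));
        return 0;
    }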


> > I would be interested in learning in which scenarios requiring relative high
> > accuracy (such as here) quadratic interpolation is dispensable, and linear
> > interpolation or straight lookup are feasible.
>
> i need more context to understand what you are seeking to answer. if [...]

I was mostly hoping to hear about use cases where quadratic interpolation is
no longer necessary since Steve stated that

"in many DSP scenarios these days one can just plug the mantissa into
a (larger) LUT and be done with it."

Obviously that's true if the accuracy requirements are fairly low, but for
scenarios like the one posed by the original poster (8.24 precision), I have
a difficult time imagining how one could arrive at an efficient solution with
straight table lookup, or even linear interpolation. But I have been out of
CPU design work for 10+ years, and out of embedded SW work for 5+ years, so
the crossover point for switching from straight LUT to linear interpolation
and from linear to quadratic interpolation could have shifted considerably.

-- Norbert


From: Steve Pope on
robert bristow-johnson <rbj(a)audioimagination.com> wrote:

>On Jul 24, 1:46 am, spop...(a)speedymail.org (Steve Pope) wrote:

>Steve, even if you have IEEE doubles, but no std math lib, what would
>you do to get exp() and log()? how would you implement the stdlib
>exp() and log() or sqrt() or sin() or tan() or atan(), if you had to?

>isn't that of significance to some degree?

I can see where it would be significant, but I haven't had to do
this. That is undoubtedly a function of the exact mix of projects
I happen to have worked on, but no, I typically am using either a
C/C++ math library, or I am working with target bit widths (or,
in a few cases, target short floating-point formats) that vary
from point to point in a system and aren't amenable to building
a library around.

>or is the source code (say
>for the gnu std math lib) published somewhere? if so, i would like to
>see what order of polynomial they use and the coefficients.
>
>r b-j

Steve
From: Norbert Juffa on
> "robert bristow-johnson" <rbj(a)audioimagination.com> wrote in message
> news:03c06b7f-7ce4-4925-b1b6-fe8b01e91268(a)c10g2000yqi.googlegroups.com...
> On Jul 24, 1:46 am, spop...(a)speedymail.org (Steve Pope) wrote:
> > I do not doubt the validity of the algorithms (for interpolating
> > down to a point where it reduces your table lookup size).
> >
> > What I doubt is the significance, these days, of the vanishing
> > grey area between what you'd want to just do in IEEE double precision
> > floating point, and what you'd want to design to a target bit-width
> > in hardware. The idea of an in-between, general-purpose format
> > is what I'm not sure is still there anymore.
>
> dunno what you mean by an "in-between, general-purpose format" but,
> Steve, even if you have IEEE doubles, but no std math lib, what would
> you do to get exp() and log()? how would you implement the stdlib
> exp() and log() or sqrt() or sin() or tan() or atan(), if you had to?
>
> isn't that of significance to some degree? or is the source code (say
> for the gnu std math lib) published somewhere? if so, i would like to
> see what order of polynomial they use and the coefficients.
>
> r b-j

As best I know, the glibc source distribution includes math library sources:

http://ftp.gnu.org/gnu/glibc/

Source code for two well-regarded math libraries is available via netlib:

http://netlib.org/fdlibm/
http://netlib.org/cephes/

There are probably many more math libraries available for perusal on the
internet, possibly including the open source versions of various vendors'
math libraries.

If you want to "roll your own", there are a number of books that provide
a reasonable starting point. A fairly recent one is

Jean-Michel Muller, "Elementary Functions: Algorithms and Implementation,
2nd ed." (Birkh�user, 2005)


-- Norbert