From: Jerry Avins on
On 7/21/2010 9:00 PM, Steve Pope wrote:
> dsp.study<learner.study(a)n_o_s_p_a_m.gmail.com> wrote:
>
>> This is on a mips processor which doesn't have native floating pt support.
>> I have converted from floating pt to fixed pt (8.24) i.e. 8 bits for int
>> part and 24 for fraction.
>>
>> I'd looking at log10 - any pointers would be highly appreciated.
>
> I'd recommend implementing log2 in fixed point, and recoding
> anything to use it instead of log10.

Even without recoding, they differ only by a simple multiplicative
constant. Does .30103 seem familiar?

Jerry
--
Engineering is the art of making what you want from things you can get.
�����������������������������������������������������������������������
From: Steve Pope on
Jerry Avins <jya(a)ieee.org> wrote:

>On 7/21/2010 9:00 PM, Steve Pope wrote:

>> I'd recommend implementing log2 in fixed point, and recoding
>> anything to use it instead of log10.

>Even without recoding, they differ only by a simple multiplicative
>constant. Does .30103 seem familiar?

Sure, one can put in this factor every time one had used a log10(),
but you will have less round-off error accumulating if you don't do this.

Whether it's worth redoing your code depends on how sensitive
it is to such things.

Steve
From: Norbert Juffa on
"Jerry Avins" <jya(a)ieee.org> wrote in message news:fZO1o.21428$lS1.13758(a)newsfe12.iad...
> On 7/21/2010 9:00 PM, Steve Pope wrote:
>> dsp.study<learner.study(a)n_o_s_p_a_m.gmail.com> wrote:
>>
>>> This is on a mips processor which doesn't have native floating pt support.
>>> I have converted from floating pt to fixed pt (8.24) i.e. 8 bits for int
>>> part and 24 for fraction.
>>>
>>> I'd looking at log10 - any pointers would be highly appreciated.
>>
>> I'd recommend implementing log2 in fixed point, and recoding
>> anything to use it instead of log10.
>
> Even without recoding, they differ only by a simple multiplicative constant. Does .30103 seem familiar?
>
> Jerry

If you can accomodate a reasonably hefty table and your CPU has a fast
integer multiply (with the ability to get either a double-wide product
or with a MULHI instruction available), quadratic interpolation in a
table of function values is hard to beat in my experience.

Below I demonstrate how this could be done for log2() in 8.24 fixed
point. Our input x is of the form 2^e*m, 1 <= m < 2, m = 1 + f. Using
a "count leading zeros" operation, we quickly establish e and f. Given
f, 0 <= f < 1, we compute the index i of the closest, smaller, table
entry in a table of values of log2(x) on [1,2). The difference between
the actual argument and the function argument corresponding to i shall
be d. Using the function value from the table, f0, as well as the next
two table entries, f1 and f2, we compute the coefficients a, b of a
parabola that fits through all three function values. The value of
log2(m) is then computed as f0 + ((a * d) + b) * d, which is finally
added to the integer part of the result, i.e. log2(2^e) = e, computed
earlier.

The same technique can be used for exp2(), and pow(x,y) can then be
constructed from those two in the straightforward manner. I am aware
that this leads to loss of accuracy if x is near 1 and y is large, but
in many situations this is good enough.

Note that the log implementation below is not fully accurate to the
8.24 precision, but it gets close.

-- Norbert


#include <stdlib.h>
#include <stdio.h>
#include <math.h>

/* This is a single machine instruction on many platforms */
__inline int clz (unsigned int x)
{
/* Use Robert Harley's table-based algorithm */
static const unsigned char clz_tab[64] = {
32, 31, 31, 16, 31, 30, 3, 31, 15, 31, 31, 31, 29, 10, 2, 31,
31, 31, 12, 14, 21, 31, 19, 31, 31, 28, 31, 25, 31, 9, 1, 31,
17, 31, 4, 31, 31, 31, 11, 31, 13, 22, 20, 31, 26, 31, 31, 18,
5, 31, 31, 23, 31, 27, 31, 6, 31, 24, 7, 31, 8, 31, 0, 31
};
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x *= 7U*255U*255U*255U;
return (int)clz_tab[x >> 26];
}

/* One or two machine instructions on most platforms */
__inline long long int mul_wide (int x, int y)
{
return ((long long int)x) * y;
}

/* compute log2(x) in 8.24 fixed-point format, x != 0 */
unsigned log2_8_24 (unsigned x)
{
/* log2(x) on interval [1, 2+2/128] in 1.31 format */
static const unsigned int log_tab[130] = {
0x00000000, 0x016fe50b, 0x02dcf2d1, 0x04473475,
0x05aeb4dd, 0x07137eae, 0x08759c50, 0x09d517ef,
0x0b31fb7d, 0x0c8c50b7, 0x0de42120, 0x0f397609,
0x108c588d, 0x11dcd197, 0x132ae9e2, 0x1476a9fa,
0x15c01a3a, 0x170742d5, 0x184c2bd0, 0x198edd07,
0x1acf5e2e, 0x1c0db6ce, 0x1d49ee4c, 0x1e840be7,
0x1fbc16b9, 0x20f215b7, 0x22260fb6, 0x23580b65,
0x24880f56, 0x25b621f9, 0x26e2499d, 0x280c8c76,
0x2934f098, 0x2a5b7bf9, 0x2b803474, 0x2ca31fc9,
0x2dc4439b, 0x2ee3a575, 0x30014ac6, 0x311d38e6,
0x32377512, 0x33500472, 0x3466ec15, 0x357c30f3,
0x368fd7ee, 0x37a1e5d4, 0x38b25f5a, 0x39c14924,
0x3acea7c0, 0x3bda7fa9, 0x3ce4d544, 0x3dedace6,
0x3ef50ad2, 0x3ffaf335, 0x40ff6a2e, 0x420273ca,
0x43041403, 0x44044ec5, 0x450327eb, 0x4600a33e,
0x46fcc47a, 0x47f78f4c, 0x48f10751, 0x49e93016,
0x4ae00d1d, 0x4bd5a1d8, 0x4cc9f1ab, 0x4dbcffee,
0x4eaecfeb, 0x4f9f64de, 0x508ec1fa, 0x517cea63,
0x5269e12f, 0x5355a96d, 0x5440461c, 0x5529ba33,
0x5612089a, 0x56f93433, 0x57df3fd0, 0x58c42e3d,
0x59a80239, 0x5a8abe79, 0x5b6c65aa, 0x5c4cfa6c,
0x5d2c7f59, 0x5e0af6ff, 0x5ee863e5, 0x5fc4c886,
0x60a02757, 0x617a82c3, 0x6253dd2c, 0x632c38ed,
0x64039858, 0x64d9fdb7, 0x65af6b4b, 0x6683e34f,
0x675767f5, 0x6829fb69, 0x68fb9fce, 0x69cc5741,
0x6a9c23d6, 0x6b6b079c, 0x6c39049b, 0x6d061cd3,
0x6dd2523d, 0x6e9da6ce, 0x6f681c73, 0x7031b512,
0x70fa728c, 0x71c256ba, 0x72896373, 0x734f9a83,
0x7414fdb5, 0x74d98eca, 0x759d4f81, 0x76604191,
0x772266ad, 0x77e3c082, 0x78a450b8, 0x796418f2,
0x7a231ace, 0x7ae157e3, 0x7b9ed1c7, 0x7c5b8a07,
0x7d17822f, 0x7dd2bbc4, 0x7e8d3846, 0x7f46f932,
0x80000000, 0x80b84e23
};

int a, b, c, d, e, f, i, t, r, f0, f1, f2;

c = clz(x);
e = 7 - c;
f = x << (c + 1);
i = (unsigned int)f >> 25;
d = f - (i << 25);
f0 = log_tab[i];
f1 = log_tab[i+1];
f2 = log_tab[i+2];
t = (f1 - f0) << 1;
a = (f2 - f0) - t;
b = t - a;
t = (int)(mul_wide(a, d) >> 25) + b;
t = (int)(mul_wide(t, d) >> 26);
r = f0 + t;
return (e << 24) + (((unsigned int)r + 64) >> 7);
}

int main (void)
{
unsigned int arg, ref, res, err;
unsigned int cnt[4] = {0, 0, 0, 0};

for (arg = 1; arg != 0; arg++) {
ref = (int)(2.4204406323122971e+007 * log (arg / 16777216.0) + 0.5);
res = log2_8_24 (arg);
err = abs (res - ref);
if (err > 3) {
printf ("arg=%08x res=%08x ref=%08x\n", arg, res, ref);
} else {
cnt[err]++;
}
if (!(arg & 0x07ffffff)) {
printf ("arg=%08x\n", arg);
}
}
printf ("errcnt 0: %u 1: %u 2: %u 3: %u\n",
cnt[0], cnt[1], cnt[2], cnt[3]);
return EXIT_SUCCESS;
}




From: Rick Lyons on
On Tue, 20 Jul 2010 06:15:44 -0500, "dsp.study"
<learner.study(a)n_o_s_p_a_m.gmail.com> wrote:

>Hi,
>
>I have successfully converted add, sub, mul and div operations to fixed pt
>and see ~10 times improvement.
>
>I'm not sure how to go about doing pow and log functions...can some one
>please suggest an algo or pt me to an open source code for reference.
>
>thanks,
>Marty

Hi Marty,
Clay S. Turner has written a neat article, titled
"A Fast Binary Logarithm Algorithm". That article, most
appropriate for fixed-point processors, will appear in
the Sept. 2010 "DSP Tips & Tricks" column of the IEEE
Signal Processing magazine.

Good Luck,
[-Rick-]
From: Steve Pope on
Norbert Juffa <juffa(a)earthlink.net> wrote:

>Below I demonstrate how this could be done for log2() in 8.24 fixed
>point. Our input x is of the form 2^e*m, 1 <= m < 2, m = 1 + f. Using
>a "count leading zeros" operation, we quickly establish e and f. Given
>f, 0 <= f < 1, we compute the index i of the closest, smaller, table
>entry in a table of values of log2(x) on [1,2). The difference between
>the actual argument and the function argument corresponding to i shall
>be d. Using the function value from the table, f0, as well as the next
>two table entries, f1 and f2, we compute the coefficients a, b of a
>parabola that fits through all three function values.

The above is a valuable technique, but times have changed and in many
DSP scenarios these days one can just plug the mantissa into
a (larger) LUT and be done with it. This applies to log(), square(),
and similar functions.

Steve