From: Norbert Juffa on 23 Jul 2010 14:42

"Steve Pope" <spope33(a)speedymail.org> wrote in message
news:i2cg29$hq6$1(a)blue.rahul.net...
> Norbert Juffa <juffa(a)earthlink.net> wrote:
>
>> Below I demonstrate how this could be done for log2() in 8.24 fixed
>> point. Our input x is of the form 2^e*m, 1 <= m < 2, m = 1 + f. Using
>> a "count leading zeros" operation, we quickly establish e and f. Given
>> f, 0 <= f < 1, we compute the index i of the closest smaller table
>> entry in a table of values of log2(x) on [1,2). The difference between
>> the actual argument and the function argument corresponding to i shall
>> be d. Using the function value from the table, f0, as well as the next
>> two table entries, f1 and f2, we compute the coefficients a, b of a
>> parabola that fits through all three function values.
>
> The above is a valuable technique, but times have changed and in many
> DSP scenarios these days one can just plug the mantissa into
> a (larger) LUT and be done with it. This applies to log(), square(),
> and similar functions.
>
> Steve

For a table-lookup method to work efficiently on most embedded processors, there must be a high likelihood that the table elements are present in the L1 cache. Often there will be tables for division, sqrt(), exp(), and log(), all of which would want to fit into L1 at the same time while leaving room for other data as well. The L1 may be as small as 8KB or even 4KB, so keeping lookup tables small is highly desirable.

To achieve full 8.24 accuracy one needs a table of 258*4 bytes (~1KB) for each of exp() and log() with quadratic interpolation. Table storage increases significantly if linear interpolation is used, even with an economical scheme such as bipartite tables. Even in hardware implementations, quadratic interpolation may well be preferred to keep ROM sizes reasonable.
I would be interested in learning in which scenarios that require relatively high accuracy (such as this one) quadratic interpolation is dispensable, and linear interpolation or a straight lookup is feasible.

-- Norbert
From: Clay on 23 Jul 2010 15:19

On Jul 23, 2:42 pm, "Norbert Juffa" <ju...(a)earthlink.net> wrote:
> [snip]
> I would be interested in learning in which scenarios requiring relative high
> accuracy (such as here) quadratic interpolation is dispensable, and linear
> interpolation or straight lookup are feasible.

Years ago I made a light meter that showed relative intensity in terms of f-stops and density. Both of these are logarithmic quantities. I used a Motorola 68HC05 micro and went for accuracy, since it just didn't matter whether the computation took a microsecond or 10 ms! The idea was to force all of the error into the sensor and not have the processor's math add to the final error. Simplicity and tiny code were what was desirable. My four arithmetic floating-point routines took only 512 bytes! When I added the input, output, and log functions, I was still under 1K bytes!

Clay
From: robert bristow-johnson on 24 Jul 2010 00:41

On Jul 23, 2:42 pm, "Norbert Juffa" <ju...(a)earthlink.net> wrote:
> [snip]
> To achieve accuracy to full 8.24 precision one needs a table of 258*4 bytes
> (~1KB) each for both exp() and log() with quadratic interpolation. Table
> storage will increase significantly if linear interpolation is used, even
> if an economical scheme such as bipartite tables is used.

i had to look up "bipartite table" to get an idea about what you're talking about. would this paper:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.6368&rep=rep1&type=pdf

be a good initial description of the technique? it's a little bit intriguing; it's based on the notion that the first derivative of a function changes much less over a segment of its domain than the "zeroth derivative" of the function does. but i don't see quadratic interpolation in this. how is quadratic interpolation done? do you continue to divide the input word into three parts, use two tables, and have a quadratic function rather than a linear one (referring to Eq. 6 in the paper i cited above) for the least significant portion of the sum?

> Even for hardware implementations the use of quadratic
> interpolation may well be preferred to keep ROM sizes reasonable.

if it's just the two tables, why stop at quadratic?

> I would be interested in learning in which scenarios requiring relative high
> accuracy (such as here) quadratic interpolation is dispensable, and linear
> interpolation or straight lookup are feasible.

i need more context to understand what you are seeking to answer. if it's still just two parallel table lookups (with sign extension and addition following), there is no reason that the "a1" coefficient lookup need be a linear function of "x2" as depicted in Eq. 6 of that paper. in fact, it need not be quadratic either. furthermore, you can toss all assumptions to the wind and write some kinda MATLAB program to optimally determine *both* the a1 and a0 tables in such a way as to minimize error.

BTW, for all of the fiddling around you need to do to implement this bipartite table lookup, possible sign extension, and addition in software, you might be able to perform a decent polynomial approximation in as few instruction cycles. the only reason i see a LUT as superior to, say, a 4th-order or 6th-order polynomial approximation is sheer speed, when that is important and there is plenty of memory to burn.

r b-j
From: Steve Pope on 24 Jul 2010 01:46

I do not doubt the validity of the algorithms (for interpolating down to a point where it reduces your table-lookup size).

What I doubt is the significance, these days, of the vanishing grey area between what you'd want to just do in IEEE double-precision floating point and what you'd want to design to a target bit width in hardware. It's the idea of an in-between, general-purpose format that I'm not sure is still there anymore.

Steve
From: robert bristow-johnson on 24 Jul 2010 02:55

On Jul 24, 1:46 am, spop...(a)speedymail.org (Steve Pope) wrote:
> I do not doubt the validity of the algorithms (for interpolating
> down to a point where it reduces your table lookup size).
>
> What I doubt is the significance, these days, of the vanishing
> grey area between what you'd want to just do in IEEE double precision
> floating point, and what you'd want to design to a target bit-width
> in hardware. The idea of an in-between, general-purpose format
> is what I'm not sure is still there anymore.

dunno what you mean by an "in-between, general-purpose format", but, Steve, even if you have IEEE doubles but no standard math lib, what would you do to get exp() and log()? how would you implement the stdlib exp() or log() or sqrt() or sin() or tan() or atan(), if you had to? isn't that of significance to some degree? or is the source code (say, for the GNU standard math lib) published somewhere? if so, i would like to see what order of polynomial they use and the coefficients.

r b-j