From: Andy Glew on 10 Jul 2010 02:13

On 7/9/2010 8:09 PM, Robert Myers wrote:
> nmm1(a)cam.ac.uk wrote:
>>
>> Yes. There is a lot of good mathematical work on this, in the area
>> of real (i.e. Kolmogorov) complexity theory. Most of it can be
>> summarised for practical use as "if you have a genuinely random
>> finite state machine, you are up the creek without a paddle, so
>> consider what empirical, semi-mathematical heuristics you can use."
>>
> The theory and practice of identifying patterns in semi-random time
> series must be very well developed, if only because of those
> front-running supercomputers humming away on Wall Street. Is there a
> related subfield (Andy apparently has developed one for his own use)
> where the available resources are extremely constrained? How much of
> this is genuinely new ground and how much might already exist in some
> other context?

Haven't developed. Want to develop.

Trevor Mudge and his students have:

"Analysis of Branch Prediction via Data Compression" (1996)
by I-Cheng K. Chen, John T. Coffey, Trevor N. Mudge
in Proceedings of the 7th International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS VII)
http://cardit.et.tudelft.nl/~heco/lit/chen96.ps.gz

"Limits to Branch Prediction" (1996)
by Trevor N. Mudge, I-Cheng K. Chen, John T. Coffey
in Proceedings of the 7th International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS VII)

However, although the papers' titles look promising, I do not think
they have truly hit the limits.

---

Kolmogorov theory also looks promising, but I have not found as much
as I would like in the area of analysis of semi-random, as opposed to
truly random, sequences.
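[The compression view of branch prediction that the Chen/Coffey/Mudge papers take can be illustrated with a toy model (entirely hypothetical code, not from the papers): a two-level adaptive predictor — 2-bit saturating counters indexed by the last few branch outcomes — is essentially a finite-context Markov model of the outcome stream, so its accuracy tracks how compressible that stream is.]

```python
from collections import defaultdict

def predict_stream(outcomes, history_len=3):
    """Toy two-level adaptive branch predictor: a table of 2-bit
    saturating counters indexed by the last `history_len` outcomes.
    Returns the fraction of branches predicted correctly."""
    counters = defaultdict(lambda: 2)   # 0-1 predict not-taken, 2-3 taken
    history = (0,) * history_len
    correct = 0
    for taken in outcomes:
        prediction = 1 if counters[history] >= 2 else 0
        correct += (prediction == taken)
        # Nudge the saturating counter toward the actual outcome.
        if taken:
            counters[history] = min(3, counters[history] + 1)
        else:
            counters[history] = max(0, counters[history] - 1)
        history = history[1:] + (taken,)
    return correct / len(outcomes)

# A periodic (highly compressible) stream is almost perfectly
# predictable once the table warms up; a fair-coin stream stays
# near 50% no matter how large the table grows.
print(predict_stream([1, 1, 0] * 300))
```

[On the semi-random streams the thread is about, the interesting question is exactly where between those two extremes a given workload falls — which is the sense in which prediction accuracy and compressibility are the same limit.]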
From: nmm1 on 10 Jul 2010 04:39

In article <dwRZn.6659$KT3.5929(a)newsfe13.iad>,
Robert Myers <rbmyersusa(a)gmail.com> wrote:
>>
>> Yes. There is a lot of good mathematical work on this, in the area
>> of real (i.e. Kolmogorov) complexity theory. Most of it can be
>> summarised for practical use as "if you have a genuinely random
>> finite state machine, you are up the creek without a paddle, so
>> consider what empirical, semi-mathematical heuristics you can use."
>>
> The theory and practice of identifying patterns in semi-random time
> series must be very well developed, if only because of those
> front-running supercomputers humming away on Wall Street. Is there a
> related subfield (Andy apparently has developed one for his own use)
> where the available resources are extremely constrained? How much of
> this is genuinely new ground and how much might already exist in some
> other context?
>
> The empirical, semi-mathematical heuristics must have themselves come
> from a learning predictor somewhere, even if it is the brain of a
> computer architect.

No, no, not at all. Firstly, the whole point of such things IS that
there is no theory - if anything works, fine, and trial-and-error is
the only way to find out. And, worse, the very approaches to try are
selected by guesswork.

Secondly, all that those people who gamble with our money need is a
small margin (say, 5%) - and that would be of no use in this context.

Regards,
Nick Maclaren.
From: Terje Mathisen on 11 Jul 2010 17:14

George Neuner wrote:
> On Fri, 09 Jul 2010 22:03:31 +0200, Terje Mathisen
> <"terje.mathisen at tmsw.no"> wrote:
>
>> MitchAlsup wrote:
>>> But beyond this, the predictors are really swimming upstream against
>>> the current of random placement of semi-random sized objects allocated
>>> on a heap. Especially when subject to garbage collection.
>>
>> I think this is a key issue:
>>
>> Garbage collectors seem to me to be _the_ proper spot to implement
>> data location optimizations:
>>
>> Since most garbage-collectable languages allow the collector to have
>> full knowledge of all pointer data items, it should also be capable of
>> detecting left/right pointers in a tree structure, right?
>>
>> If so, one possible approach would be to try to minimize the total
>> distance covered by said pointers, measured as the offset between the
>> address of the pointer itself and the object it points to.
>
> Coalescing tree data structures during GC has been studied quite a
> bit. There seems to be consensus that, where possible, the best thing
> to do is to maintain the address order of object allocations, because
> that tends to be a good predictor of object lifetimes. But
> maintaining object address ordering is relatively complex due to the
> need to sort addresses, so where it is infeasible, regrouping
> according to depth-first traversal has been shown to yield slightly
> better locality than breadth-first (at least for the test cases that
> have been studied).

[big snip]

> Not to mention that GC itself, regardless of algorithm(s), is a fairly
> bad fit to LRU paging because the collection process potentially may
> touch and drag in large amounts of non-resident data - displacing data
> actively in use. There have been proposals for various extensions to
> VMM APIs to give processes more direct control over their own paging,
> but, of course, these may be at odds with the global policies the VMM
> system may be trying to implement.

Thanks for an enlightening post!

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
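[Terje's pointer-distance metric and George's depth-first result are easy to make concrete with a small sketch (hypothetical illustration code, not taken from any real collector): copy a complete binary tree into a flat array in breadth-first versus depth-first order, then sum |slot(parent) - slot(child)| over every live pointer.]

```python
from collections import deque

class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def complete_tree(depth):
    """Build a complete binary tree of the given depth."""
    if depth == 0:
        return Node()
    return Node(complete_tree(depth - 1), complete_tree(depth - 1))

def layout_bfs(root):
    """Assign array slots in breadth-first (Cheney-style) copy order."""
    slots, queue = {}, deque([root])
    while queue:
        node = queue.popleft()
        slots[id(node)] = len(slots)
        for child in (node.left, node.right):
            if child:
                queue.append(child)
    return slots

def layout_dfs(root):
    """Assign array slots in depth-first (preorder) copy order."""
    slots, stack = {}, [root]
    while stack:
        node = stack.pop()
        slots[id(node)] = len(slots)
        for child in (node.right, node.left):  # left child lands adjacent
            if child:
                stack.append(child)
    return slots

def total_pointer_distance(node, slots):
    """Sum of |slot(parent) - slot(child)| over every pointer."""
    dist = 0
    for child in (node.left, node.right):
        if child:
            dist += abs(slots[id(node)] - slots[id(child)])
            dist += total_pointer_distance(child, slots)
    return dist

tree = complete_tree(10)
print(total_pointer_distance(tree, layout_dfs(tree)))
print(total_pointer_distance(tree, layout_bfs(tree)))
```

[For a complete tree the depth-first layout keeps every left child adjacent to its parent, so its total pointer distance grows roughly as n log n, while breadth-first distances grow with the width of each level - which is one way to see why depth-first regrouping tends to give the slightly better locality George mentions.]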
From: Robert Myers on 12 Jul 2010 12:58

Terje Mathisen wrote:
>
> Since 60-70% of the decoding time was spent in the IMDCT (Inverse
> Modified Discrete Cosine Transform), the initial suggestion from Jeff
> Roberts was to simply write a SIMD version of the existing IMDCT
> (which was believed to be optimal, but written in plain C).
>

I've been out of the ridiculously fast Fourier transform business for a
while, but there were a lot of specialized tricks you could play that
could reduce the operation count if you were willing to search beyond
the recursive neatness of divide and conquer. I don't know how
well-known they ever were, or how useful they ever were outside some
very specialized (and, at the time, classified) contexts.

Robert.
From: Terje Mathisen on 12 Jul 2010 13:52

Robert Myers wrote:
> Terje Mathisen wrote:
>
>> Since 60-70% of the decoding time was spent in the IMDCT (Inverse
>> Modified Discrete Cosine Transform), the initial suggestion from Jeff
>> Roberts was to simply write a SIMD version of the existing IMDCT
>> (which was believed to be optimal, but written in plain C).
>
> I've been out of the ridiculously fast Fourier transform business for
> a while, but there were a lot of specialized tricks you could play
> that could reduce the operation count if you were willing to search
> beyond the recursive neatness of divide and conquer. I don't know how
> well-known they ever were, or how useful they ever were outside some
> very specialized (and, at the time, classified) contexts.

The IMDCT is less regular than the FFT, and a bit harder to play tricks
with, but for short lengths, like the fixed 18-sample transform used by
mp3, you basically unroll it completely, then combine as many
operations as possible.

Ogg Vorbis can work with more or less any pair of power-of-two lengths;
the default has been to use 256 and 2048, but a decoder has to be
prepared to handle any length.

The rewrite combined the two outermost recursion levels, merging some
operations and getting rid of others.

It is possible to implement the IMDCT with a regular FFT, but the
required pre- and post-processing made that less interesting.

OTOH, afaik it should definitely be possible to plug length=256 and
vector=4 into the FFTW synthesizer and get a very big, completely
unrolled, minimum-operation-count piece of code out of it. The three
problems with this would be (a) code size, (b) the need to handle
length=2048 (as well as any other power of two), and (c) licensing.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
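[For reference, the transform under discussion can be written straight from its definition - a naive O(N^2) sketch in the textbook MDCT convention, nothing like the unrolled production code described above. The key property (TDAC: overlap-adding the inverse transforms of two half-overlapped blocks cancels the time-domain aliasing exactly) is what makes the lapped transform usable for audio coding.]

```python
import math

def mdct(x):
    """Naive MDCT: 2N real inputs -> N coefficients (textbook convention)."""
    N = len(x) // 2
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(X):
    """Naive IMDCT: N coefficients -> 2N time-aliased output samples."""
    N = len(X)
    return [sum(X[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N)) / N
            for n in range(2 * N)]

# TDAC demo: two blocks of length 2N, hopped by N, overlap-added.
N = 8
x = [math.sin(0.7 * i) for i in range(3 * N)]
y1 = imdct(mdct(x[:2 * N]))       # block 1: samples 0 .. 2N-1
y2 = imdct(mdct(x[N:]))           # block 2: samples N .. 3N-1
mid = [y1[N + i] + y2[i] for i in range(N)]
# mid now matches x[N:2N] to within rounding error.
```

[Real codecs additionally window each block before the MDCT and after the IMDCT (the window must satisfy the Princen-Bradley condition); the rectangular case is used here only for brevity, and the whole point of the optimized decoders above is to avoid this O(N^2) form entirely.]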