|
From: Martin Knoblauch Revuelta on 5 Jun 2008 10:09 On 3 jun, 22:33, grpad...(a)gmail.com wrote: > Regarding your (MKR) previous post on June 3: > > I agree that using prime values for gaps will make a difference, but > I still think someone at least as clever as me can find some patterns > that will defeat your latest implementation. I did not expect to > find > a pattern as small as 324 elements long to defeat the previously > posted > algorithm, and there may be shorter patterns still. There are :) Before the primes thing, I was using the next gap table: 1 2 3 4 5 6 7 8 10 12 15 19 24 30 38 48 61 78 100 128 164 210 269 Note that 61 is the only odd gap in a long run of even numbers. There is a killer sequence of 225 elements (using the pattern you suggested): {0, 113, 1, 114, 2, 115, ...... 110, 223, 111, 224, 112} The sequence remains untouched until gap 61. Then it is broken into 3 pieces that follow a similar pattern: {0, 31, 1, 32, ...... 28, 59, 29, 60, 30, 61, 113, 62, 114, 63, ...... 162, 111, 163, 112, 164, 195, 165, ...... 222, 192, 223, 193, 224, 194} The next gaps, being even, don't change anything. After the last pass, two elements in the center remain unsorted. Note that all even gaps contributed to the disaster, including those used after the gap 61. That's why I prefer primes. I believe that primes solve the problem of _this_kind_ of killer sequences. Though, there can be other kinds of killer sequences, of course :( > I prefer the elegance of providing an algorithm that generates the > table. I think starting with a large enough gap and then making sure > each two neighboring gaps are relatively prime and close would > suffice. I don't think so. In the example I exposed above, non neighboring gaps 'worked' together to spoil the whole thing. > Also, the table will not address the issue of arbitrarily large > arrays, > so it will not settle the worst case issue for all n. Well, the table I provided would work for every n<2^32 (if proved to be correct). That's not bad for a 32 bits machine ;) A table of four times that size (twice that number of elements, double size each one) would do the job for a 64 bits machine. > If your latest > implementation runs fast enough on current real world examples though, > the arbitrary large array issue will be little more than a quibble. The advantage is not speed, but predictability in terms of memory access pattern (if proved to be correct). > I came across Thomas Baudel's sort visualizer on the web, which had a > version of combsort as well as other sorts which you could race > against > one another. What struck me was the (reasonable) assumption on > timing: > compares are assumed to take 5 times longer than swaps. This depends on what you are sorting and how complex is the sorting criterion: - Integers or floating point numbers (usually comparable as integers) - Big structures (sort by just one field / by several fields) - An integer index to a fixed array of big structures (") The cache can also make a difference. Heapsort for instance is anti- cache, while quicksort uses it very efficiently (on average). > If compares > were cheaper than swaps, I think combsort would shine. Quicksort and heapsort can be tuned for saving comparisons or assignments. I guess combsort will never beat them in speed. Batcher's odd-even mergesort is, as combsort, a sorting network. Therefore, it is also deterministic (fixed memory access pattern). It makes less comparisons and swaps than combsort up to about 1000 elements. > Maybe one can > "poke holes" in the bubblesort pass to make the algorithm faster but > still effective. I don't know what you mean here. Do you mean saving unnecesary comparisons? We still can't assure the correctness of the algorithm as it is now... I'd leave that by now. Well, maybe thinkink about it would help in proving the correctness... > The first gaps Knuth verified for sorting 32 elements ( 29,23, etc.): > was that using bubblesort passes or full sort? Sorry, the line wrap provoked a little confusion. Knuth formulated the zero-one principle. I applied it myself, for verifying the algorithm up to 32 elements. Thanks to the zero-one principle, we can verify a sorting network in O(2^n) time instead of O(n!) time, which is what a naive implementation would take. I made an implementation that 'packs' the 32 zero-one elements (bits) in a single integer, and then sorts them using bit shifts and masks (following the algorithm of course). Trying all 2^32 combinations only requires counting from 0 to ~0UL, and sorting the bits of each number. This takes about two hours and a half in my computer. O(2^n) is still a very high complexity. If I had time, I would try to make a faster verifier. Sorting networks have a very interesting property: if you verify a sorting network of N lines, you can trivially build smaller correct networks of any size lesser than N (just removing some lines, together with the comparators linked to them). Therefore, if we prove that the table is correct for size N, we can safely use the same table for sorting shorter sequences. > At this stage, I think testing gap sequences on examples similar to > my 324 element example would be profitable. I think that the usage of prime gaps wipes that problem away. We should try to find something different. As I said, I would try to write an automatic verifier. > Also, one should take > advantage of a certain kind of recursion that occurs: if an array > is partitioned ( i.e. there is an index i so that for all k<=i and > all j > i, a[k] < a[j] ) then combsort acts on both parts in parallel, > with a few extra comparisons between the parts thrown in which do not > swap any elements. It suggested to me a hybrid of quick sort and > combsort. However, it seems to be hard to find a nice partition > quickly. Yes, and this is specially interesting if you plan to implement the sorting network in hardware. > Another avenue is to start generating words based on the given gap > sequence. [..] That's a very interesting approach, but the more I think of it, the more I get convinced that prime gaps avoid the failure. > I see this > argument applying to your current implementation, so I do not see a > subquadratic worst case bound on this implementation either. I would rather watch big/small elements in the middle, moved in the wrong direction several times in a row, because other bigger/smaller elements were in the opposite side of their final place. > I recommend the generating word approach as it seems a direct way to > generate the killer sequences that you need but hope are not there. If I had time... Anyway, I would also try the verifier approach. Once verified for an acceptable big size, we could try to remove comparisons, or change their order for partitioning and parallelization. Thanks again, Martin
From: Ajinkya on 6 Jun 2008 05:17 Can anyone suggest a good link for information on how Comb sort works ? The wikipedia page is of no help as it doesnot explain how the shrink factor is used in sorting.... Am i missing something obvious ? -Ajinkya
From: Martin Knoblauch Revuelta on 6 Jun 2008 05:49 On 6 jun, 10:17, Ajinkya <kaleajin...(a)gmail.com> wrote: > Can anyone suggest a good link for information on how Comb sort > works ? > The wikipedia page is of no help as it doesnot explain how the shrink > factor is used in sorting.... > > Am i missing something obvious ? The clearest explanation is, perhaps, a bit of pseudocode: comb := size / shrinkfactor; while (comb >= 1) { i := 0; while (i+comb < size) { if (A[i] > A[i+comb]) swap them; i := i + 1; } comb := comb / shrinkfactor; } Additionally, the original algorithm repeats the last pass with comb 1 until the array is fully sorted. This final step seems to be unnecessary when the shrink factor is small enough (9/7=1.285714... in my experiments). You can see the algorithm in action here: http://www.cs.ubc.ca/spider/harrison/Java/sorting-demo.html I hope it helps. Regards, Martin
From: Martin Knoblauch Revuelta on 18 Jun 2008 11:19 Well, this discussion seems to be finished (at least by now). Thanks Gerhard for your suggestions. I'd like to provide a final version of the algorithm that works in every case, though it's time complexity is still not proven to be O(n log n): const unsigned long combtable_7_9_primes_and_insertion[] = { /* Note that small combs have been removed */ 11UL, 13UL, 16UL /* Not prime!! */, 19UL, 23UL, 29UL, 37UL, 47UL, 59UL, 73UL, 89UL, 113UL, 139UL, 173UL, 211UL, 271UL, 347UL, 443UL, 569UL, 727UL, 929UL, 1193UL, 1531UL, 1951UL, 2503UL, 3217UL, 4133UL, 5309UL, 6823UL, 8761UL, 11261UL, 14461UL, 18587UL, 23893UL, 30713UL, 39461UL, 50723UL, 65213UL, 83843UL, 107791UL, 138587UL, 178183UL, 229081UL, 294529UL, 378671UL, 486839UL, 625927UL, 804761UL, 1034659UL, 1330253UL, 1710311UL, 2198909UL, 2827159UL, 3634913UL, 4673447UL, 6008707UL, 7725469UL, 9932729UL, 12770651UL, 16419407UL, 21110647UL, 27142259UL, 34897153UL, 44867737UL, 57687089UL, 74169097UL, 95360267UL, 122606053UL, 157636313UL, 202675247UL, 260582447UL, 335034569UL, 430758701UL, 553832599UL, 712070479UL, 915519169UL, 1177096073UL, 1513409231UL, 1945811849UL, 2501758079UL, 3216546101UL, 4135559269UL, ~0UL }; void combsort_with_table_7_9_primes_and_insertion (unsigned A[], unsigned len) { unsigned comb, n, * p, * q, * r, * s; unsigned tmp; n = 0; while (len > combtable_7_9_primes_and_insertion[n]) n ++; while (n--) { comb = combtable_7_9_primes_and_insertion[n]; for (p=A, q=p+comb, r=p+len; q!=r; p++, q++) if (*p>*q) { tmp = *p; *p = *q; *q = tmp; } } // End with insertion sort for (s=A+1, r=A+len; s!=r; s++) { p = s - 1; q = s; if (*p>*q) { tmp = *q; do { *q = *p; q = p--; } while (q!=A && *p>tmp); *q = tmp; } } } Best regards, Martin
From: James.Barbetti on 25 Jun 2008 09:36
> I'd like to provide a final version of the algorithm that works in > every case, though it's time complexity is still not proven to be O(n > log n): Using primes is a step in the right direction. Finishing with insertion sort is even more important. However, combsort_with_table_7_9_primes_and_insertion still has an O(n*n) worst case (but it least it doesn't have an O(n*n) average, as many combsort implementations do -- See the comments I posted to http://en.wikipedia.org/?title=Talk:Comb_sort about four years ago). Demonstrating that combsort_with_table_7_9_primes_and_insertion has a O(n*n) worst case is messy (though formally proving it is probably even more difficult). My "bad input" example generator is given N and a gap sequence. It uses two auxiliary arrays (but could be reworked to use just the input array itself). It works by marking a contiguous block of elements in the middle of the array (these are to have zeroes in them after the comb passes have completed) and "running the comb passes backwards", using, say, the following rules for each pass: for each gap, g for (int i=0;i<n;++i) if (current[i]) if (i<g) previous[i]=1; //needn't have moved else if (i<n-g) previous[i+g]=1; //could have moved down else for (int j=i;j>=0;j-=g) previous[j]=1; swap(current, previous) (sneakier sets of rules are possible but those ones will work well enough) and finally, for (i=0;i<n;i++) array[i] = current[i] ? 0 : 1; I calculate a block size that yields roughly n/4 zeroes in the input. The cheapest way is to take the input from running the above routine with a block size of 1 and then going: int hackback = n / 4 / zeroes; int lastZero = n+hackback; int ones = 0; for (int i=n-1;i>=0;i--) if (A[i]==0) lastZero=i; else if (i+hackback>lastZero) A[i]=0; ....to "widen the holes" in the array. All that gives you an algorithm for generating a "bad input" that runs in O(n.logn) time. For martin's sequence, with n>8000 or so, I get a (# of moves required in the insertion phase) that is consistently above 0.0005*n*n (at least, I do out to n=50 million or so). The same thing will happen for most other sequence definitions that use log(n) comb passes (though the n*n coefficient will be lower as the gap[i+1]/gap[i] ratio approaches 1.0). I still think alternating between left-to-right and right-to-left sweeps is the way to go (though *not* *not* *not* reusing the previous gap value for the right-to-left sweeps; use a new value for every sweep).. With that approach (and prime gaps) gap[i+1]/gap[i] ratios less than sqrt(2)~=1.4 seem to work reasonably well (and I haven't been able to generate "killer inputs", hard as I've tried). And finish with insertion-sort, yes. Regards, James Barbetti |