From: Lasse Reichstein Nielsen on
Thomas 'PointedEars' Lahn <PointedEars(a)web.de> writes:

> Dr J R Stockton wrote:
>
>> The actual sort method should not be a bubble sort, taking N^2 time, but
>> one of the faster ones (Wikipedia &c). It should be possible to use the
>> built-in array sort.
>
> ACK, which I presume to be an implementation of Quicksort or better.

Likely quicksort, often with a fallback to insertion sort for shorter
segments.

>> If the required comparisons are not of string or numeric value of
>> elements, but some more complex function of elements, then one should in
>> a preliminary pass taking time N to compute a sort key for each row,
>> such that keys can be compared directly.
>
> What would be the advantage of computing static keys and comparing them as
> compared to comparing the live values?

If the comparison of two elements requires a significant transformation
of the elements before comparions, performing the transformation once,
at first, or caching the result the first time it's needed, might take
less time than doing it log(n) times (on average) during the sorting.

On the other hand, it complicates the algorithm if you need to keep
two arrays in sync (but that can be avoided by creating an array of
pairs of transformed elements and original indices, and sorting just
that on the transformed elements), and it requires more space, which
would probably be overkill for most smaller arrays.

In the case of the default compare function, it converts both arguments
to strings before comparing them lexicographically.
If that is an expensive operation (which it potentially is, if you
are comparing objects with an expensive toString function), it might
make sense to cache the results. In most cases, the values being compared
with the default compare function will be strings or numbers (and the
numbers really should use a custom compare function too), so any extra
work will be wasted.


>> That can be the .key of each rows[J]?
>>
>> One assumes that the number of columns will be no more than, say, 20;
>> and the number of rows no more than 1000 - OR that the user will
>> understand the need to wait for more than a second.

Hmm, I don't see why.

>> The swapping of rows can be done by exchanging TR elements in a manner
>> which is essentially just a pointer exchange, without any object
>> creation or pseudo-destruction. One should be able to sort the rows
>> array, with a comparison function that just compares rows[J].key.

However, the row container doesn't have a swap operation, so you will
have to go through the normal DOM methods. That won't be nearly as
fast as doing things directly in JavaScript.

> The object that the `rows' property of a table object refers to is not an
> Array instance, but a NodeList implementation; that is what would make this
> approach not considerably more efficient than the other one of reading table
> data into an array, sorting the array and recreating the table from the sort
> result. And are you actually suggesting to augment host objects here?

My guess (it was a long time since I did the test) is that working directly
on the DOM is significantly slower than creating an array in Javascript and
sorting that, and then put the DOM in the correct order (e.g., by removing all
rows and reinserting them in the correct order).


On problem with using the built-in sort is that it's not guaranteed to be
stable. When sorting a table, it's desireable that sorting first on column
and then another is stable wrt. the result of the first sort.
/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'

From: Lasse Reichstein Nielsen on
Dr J R Stockton <reply0950(a)merlyn.demon.co.uk> writes:

> In comp.lang.javascript message <ws0szeg3.fsf(a)gmail.com>, Sat, 12 Dec
> 2009 11:14:52, Lasse Reichstein Nielsen <lrn.unread(a)gmail.com> posted:
>>Thomas 'PointedEars' Lahn <PointedEars(a)web.de> writes:
>
> There is no need to have two arrays. Starting with an array of data,
> one transforms it into an array of objects each of which has a property
> data and a property key (initially absent). Before sorting on a column,
> one sets the keys accordingly; then the comparison function for sorting
> the array compares the keys.

True, this is viable in JavaScript beacause of the ease of creating
objects and the single threaded model.

> One may finally convert back to array of data; but that may not be
> necessary.
>
> Consider a singly linked list; it is in some order. The common doubly
> linked list is in two orders, forwards and backwards. And a multiply-
> linked list representing table rows can be in multiple orders; it can in
> effect be sorted both forwards and backwards on every column each in
> various manners. After the overhead of setting it up, any order becomes
> available without further sorting. And if a new row arrives, one can do
> an insertion on whichever sort order is being used, as that order is
> being read out.

Interesting idea, but it's really a data structure in its own right,
not just an array.

>>In the case of the default compare function, it converts both arguments
>>to strings before comparing them lexicographically.
>>If that is an expensive operation (which it potentially is, if you
>>are comparing objects with an expensive toString function), it might
>>make sense to cache the results. In most cases, the values being compared
>>with the default compare function will be strings or numbers (and the
>>numbers really should use a custom compare function too), so any extra
>>work will be wasted.
>
> Not necessarily. If the strings consist of English sentences, using
> only words in a known dictionary, then one can replace the words by
> their position numbers (in fixed-length base 36). Comparison will now
> be faster, and it is possible that the effort could be worth while.

So you convert some (known) strings to shorter, more easily comparable,
strings. That should work.
Unless the set of words to be sorted is extremely large, or known to
have many shared prefixes, I still doubt the overhead is worth it.
Randomly distributed strings have only 1/26 chance of agreeing on the
first letter, and up to 1/6 chance for 't' in English text (due to the
many "the"'s, no doubt).

....
> However the row sorting is done, to be efficient it must be in a manner
> that swaps by manipulating what are in effect pointers / indexes -to-
> rows, and not by copying the actual contents of a row - which will be
> obvious to the experienced programmer but not necessarily to all.

Indeed. Luckily innerHTML doesn't work well with tables in IE :)

/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'