From: [Jongware] on
arnuld wrote:
>> On Tue, 06 May 2008 15:32:58 +0200, [Jongware] wrote:
>
>> I wrote exactly that same program, using only a binary tree. Printing
>> the words was done with a recursive routine -- which is also a pretty
>> simple way :-)
>> Despite heavy use of generalized C functions such as strdup, strcmp,
>> etc., its performance was astounding; a matter of seconds, and most of
>> this time was input/output. My test file was an ASCII version of the
>> King James Bible, so now I know how often people used to say "thee" in
>> those times.
>
>
> I have written the a program that is *nearly* like it before I even posted
> my problem but that program stores and prints the words alphabetically and
> it uses a binary tree for that.
>
> But now I have to sort and print the input by the number of occurrences in
> the input which is not known in advance ( unlike alphabetical sorting
> where you already know p < s )
>
> any ideas that you can give me from your code ?

Ah, no, because that requirement wasn't there.
I guess using the binary tree for storing and counting is still the best
way to go (never mind the 9-line perl solutions ...).
That's because your *primary* requirement is reading and counting words
/fast/. You need the huge speed advantage of an alphabetic sorting
binary tree for this -- you tack on the count number for, well, I think
for /free/! (That's because for every word /w/ you already have to walk
down the tree to see if it's there. If it is, increase w->count; if it
isn't, insert it in the tree and initialize w->count to 1.)

So that's the first half. The second part, sorting on the count, has
nothing in common with this binary tree ... so ditch it! If you keep a
running count of how many words you have in your tree, you can create an
empty array and move your data into this, using a recursive function.
Then re-sort, using the count as primary sort key.

[Jw]
From: Ben Bacarisse on
"[Jongware]" <sorry(a)no.spam.net> writes:

> arnuld wrote:
>> I have written the a program that is *nearly* like it before I even posted
>> my problem but that program stores and prints the words alphabetically and
>> it uses a binary tree for that.
>>
>> But now I have to sort and print the input by the number of
>> occurrences in the input which is not known in advance ( unlike
>> alphabetical sorting
>> where you already know p < s )
>>
>> any ideas that you can give me from your code ?
>
> Ah, no, because that requirement wasn't there.
<snip>
> So that's the first half. The second part, sorting on the count, has
> nothing in common with this binary tree ... so ditch it! If you keep a
> running count of how many words you have in your tree, you can create
> an empty array and move your data into this, using a recursive
> function. Then re-sort, using the count as primary sort key.

Another way is to re-use the binary tree code. After reading and
counting the words, make a new tree (sharing the strings of course)
ordered by word count rather than by word. It is interesting to see
how much code you can share between the two parts of the program. I
have just done this as an exercise and I am happy to post the code if
this is not someones coursework.

--
Ben.
From: [Jongware] on
Ben Bacarisse wrote:
> "[Jongware]" <sorry(a)no.spam.net> writes:
>
>> arnuld wrote:
>>> I have written the a program that is *nearly* like it before I even posted
>>> my problem but that program stores and prints the words alphabetically and
>>> it uses a binary tree for that.
>>>
>>> But now I have to sort and print the input by the number of
>>> occurrences in the input which is not known in advance ( unlike
>>> alphabetical sorting
>>> where you already know p < s )
>>>
>>> any ideas that you can give me from your code ?
>> Ah, no, because that requirement wasn't there.
> <snip>
>> So that's the first half. The second part, sorting on the count, has
>> nothing in common with this binary tree ... so ditch it! If you keep a
>> running count of how many words you have in your tree, you can create
>> an empty array and move your data into this, using a recursive
>> function. Then re-sort, using the count as primary sort key.
>
> Another way is to re-use the binary tree code. After reading and
> counting the words, make a new tree (sharing the strings of course)
> ordered by word count rather than by word. It is interesting to see
> how much code you can share between the two parts of the program.

Ahh yes -- you might even be able to create /pointers/ to the relevant
functions, and use these twice in your main loop. Maybe not optimal
(then again, maybe it is), but still a nice exercise.

> I
> have just done this as an exercise and I am happy to post the code if
> this is not someones coursework.

<g> It's the mental state -- doing stuff by way of flexing your brain,
rather than because it's homework -- that confuses the homework doers.
You might receive a number of mails "can i c ur codes plz".

[Jw]
From: Ben Pfaff on
arnuld <sunrise(a)see.sigs.invalid> writes:

> I really don't understand the difference between a Hash-Table and a Binary
> Tree because I never came across them. I want to understand and implement
> them but when I tried to read it using "Introduction to Algorithms -
> Corment et al. 2/e", I see he is using some weired words like "satellite
> data"

About the one thing in common between hash tables and binary
trees is that they both can be used for data storage and
retrieval. I'd suggest typing those terms into a friendly search
engine and trying to understand them again.
--
Ben Pfaff
http://benpfaff.org