From: Rene Veerman on
Hi,

I'm building a news scraper -> portal.
Fetching, parsing and storing many links to news items per hour was
not much of a problem.
Translation between languages can be done via Google, so that won't be
much of a problem either, I suspect.

I don't want to reveal too much of my business idea, but I do need to
do text analysis, to group related items and make "suggestions"
lists.
I've dabbled in creating my own ontology structure (kind of like
a dictionary + thesaurus data model) by scraping existing ontology
websites, but needless to say, natural-language text analysis is a huge
field, and one that I'm a total noob in.

So in the first place, I'm looking for any useful existing free or paid
data-mining / text-analysis code that can be run easily from PHP.
TBH I don't even know my feature requirements yet; I'm interested to
know what's available.

In the second place, I'm looking for free or paid
data-mining / text-analysis papers/books that explain how to produce
useful results.

Thanks for your input.