From: Nathan Rixham on
Rene Veerman wrote:
> I've browsed wikipedia, sf.net and google for code & papers on what is
> commonly known as NLP.
> ...
> It's an interesting problem though, and probably a profitable one, so
> i'm going to spend some time trying to come up with something better
> from scratch.
>
> On Sun, Mar 14, 2010 at 12:04 AM, Rene Veerman <rene7705(a)gmail.com> wrote:
>>
>> I'm building a newsscraper -> portal.
>> So in the first place, I'm looking for any free/paid useful existing
>> data-mining / text-analysis code that can be run easily from php.
>> TBH i dont even know my feature-requirements really, i'm interested to
>> know what's available.
>>
>> In the second place, i'm looking for free and published-for-a-cost
>> data-mining / text-analysis papers/books that explain how to produce
>> useful results.
>>

I wouldn't dive straight into full-on NLP for this ;) Term/semantic
extraction is pretty easy to do nowadays.

Have you seen OpenCalais, Alchemy, Zemanta, Yahoo Term Extraction or the
like?

Honestly, I've been doing this for years and would recommend hooking up
to the OpenCalais and Zemanta APIs. Should you muddle your way towards
linked data in any way from there, give me a shout and I'll give you some
pointers. There are already clients for PHP, as well as for the usual CMS
things like Drupal, WordPress etc :)
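
For a rough feel of what "term extraction" means at its very simplest,
here is a minimal sketch in Python. It is only a naive word-frequency
count and not what OpenCalais, Zemanta or any of the services above do
internally. The function name extract_terms, the tiny stopword list and
the sample text are all made up for illustration.

```python
import re
from collections import Counter

# Tiny hand-picked stopword list: an assumption for illustration only.
# A real system would use a much larger list or a proper service.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for", "is",
    "are", "was", "were", "it", "that", "this", "with", "as", "by", "at",
    "from", "be", "has", "have", "its",
}


def extract_terms(text, top_n=5):
    """Return up to top_n of the most frequent non-stopword words in text."""
    # Lowercase, then keep runs of letters (plus inner apostrophes/hyphens)
    # that are at least two characters long.
    words = re.findall(r"[a-z][a-z'-]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = (
        "The bank raised rates. The bank said rates may rise again, "
        "and markets fell."
    )
    print(extract_terms(sample, top_n=2))
```

A plain frequency count like this knows nothing about entities, synonyms
or meaning, which is exactly the gap the hosted APIs fill.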

regards!

PS: if you really want to get into this kind of thing, then
http://gate.ac.uk/ is a good starting (and ending) point.