From: 9el on
I just got a PHP project that involves scraping the body content from
static (plain HTML) sites.
Could you experts please suggest some quick resources?

I also have to build a WP plugin with the data.

Regards

Lenin

www.twitter.com/nine_L
From: 9el on
On Thu, Apr 30, 2009 at 3:33 AM, 9el <lenin(a)phpxperts.net> wrote:
> I just got a project to do on PHP of scraping the body items from
> static sites or just html sites.
> Could you experts please suggest me some quick resources?
>
> I have to make an WP plugin with the data as well.

Any experts there yet? I am looking for urgent advice on accomplishing the task.

Thanks

Lenin

www.twitter.com/nine_L
From: Shawn McKenzie on
9el wrote:
> On Thu, Apr 30, 2009 at 3:33 AM, 9el <lenin(a)phpxperts.net> wrote:
>> I just got a project to do on PHP of scraping the body items from
>> static sites or just html sites.
>> Could you experts please suggest me some quick resources?
>>
>> I have to make an WP plugin with the data as well.
>
> Any expert there yet? Was looking for urgent advices on accomplishing the task.
>
> Thanks
>
> Lenin
>
> www.twitter.com/nine_L

If you're just capturing and using the body, then load the page with
file_get_contents() and use preg_match() to select the body or
individual tags, etc. For more control, maybe try this:

$doc = new DOMDocument();
$doc->loadHTMLFile('http://example.com/page.html');

Then use the DOM extension: http://php.net/manual/book.dom.php
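
To make the two suggestions above concrete, here is a minimal sketch that runs both on the same markup. The sample HTML string and the variable names are assumptions for illustration only; in practice the string would come from file_get_contents() on the real URL, or the document would be loaded with loadHTMLFile() as shown above.

```php
<?php
// Sample markup standing in for a fetched page (hypothetical content).
$html = '<html><head><title>Demo</title></head>'
      . '<body><h1>Hello</h1><p>First paragraph.</p></body></html>';

// Quick approach: capture everything between <body> and </body>.
// The "s" modifier lets "." match newlines; "i" ignores tag case.
$bodyViaRegex = null;
if (preg_match('~<body[^>]*>(.*?)</body>~is', $html, $m)) {
    $bodyViaRegex = $m[1];
}

// More control: parse the markup and walk the tree.
libxml_use_internal_errors(true); // keep parser warnings for sloppy HTML quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Collect the text of every <p> element inside <body>.
$body = $doc->getElementsByTagName('body')->item(0);
$paragraphs = [];
foreach ($body->getElementsByTagName('p') as $p) {
    $paragraphs[] = trim($p->textContent);
}
```

The regex route is fine for a quick grab of the whole body, but it tends to break on nested or malformed markup, which is where the DOM route is more dependable.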

--
Thanks!
-Shawn
http://www.spidean.com
From: Lenin on
I had hoped more experts would share some insight into the methods of
scraping.

I want to grab the body content of pages, say WordPress pages, but not through
RSS. I would assume the pages are static only, and I want to scrape the body
content while avoiding the sidebar, footer, header, etc.

I tried the DOM and it's fun. But I would still like to hear some expert
experience specific to my problem.
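
For reference, a rough sketch of the kind of DOM approach in question: drop the unwanted blocks with DOMXPath and keep what is left of the body. The ids "header", "sidebar", "footer" and "content" are assumptions; real themes name these containers differently, so the XPath expression would need adjusting per site.

```php
<?php
// Sample markup standing in for a fetched page (hypothetical ids and content).
$html = <<<HTML
<html><body>
<div id="header">Site title</div>
<div id="content"><p>Post text.</p></div>
<div id="sidebar">Links</div>
<div id="footer">Copyright</div>
</body></html>
HTML;

libxml_use_internal_errors(true); // keep parser warnings for sloppy HTML quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the blocks we do not want (assumed ids, adjust per theme).
$unwanted = $xpath->query('//*[@id="header" or @id="sidebar" or @id="footer"]');

// Copy the result into an array first, then detach each node from its parent.
foreach (iterator_to_array($unwanted) as $node) {
    $node->parentNode->removeChild($node);
}

// Whatever remains in <body> is the main content.
$body = $doc->getElementsByTagName('body')->item(0);
$text = trim($body->textContent);
```

If the theme has a reliable content container, selecting it directly (for example with a query such as //*[@id="content"]) is usually simpler than removing everything around it.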

Thanks in advance.
From: Lenin on
On Sat, May 2, 2009 at 10:01 PM,
<vincent.perie-oatpri(a)myoffice.mobistar.be>wrote:

> I am currently out of the office too !!!!
>
> TEST !!!!!

I don't get why I receive this automated mail every time I send a message to
this thread. :-/