From: 9el on 29 Apr 2009 17:33
I just got a PHP project that involves scraping the body content from static (plain HTML) sites. Could you experts please suggest some quick resources?

I also have to make a WP plugin that uses the data.

Regards
Lenin
www.twitter.com/nine_L
From: 9el on 30 Apr 2009 09:03
On Thu, Apr 30, 2009 at 3:33 AM, 9el <lenin(a)phpxperts.net> wrote:
> I just got a project to do on PHP of scraping the body items from
> static sites or just html sites.
> Could you experts please suggest me some quick resources?
>
> I have to make an WP plugin with the data as well.

Any experts there yet? I am looking for urgent advice on accomplishing this task.

Thanks
Lenin
www.twitter.com/nine_L
From: Shawn McKenzie on 30 Apr 2009 11:00
9el wrote:
> On Thu, Apr 30, 2009 at 3:33 AM, 9el <lenin(a)phpxperts.net> wrote:
>> I just got a project to do on PHP of scraping the body items from
>> static sites or just html sites.
>> Could you experts please suggest me some quick resources?
>>
>> I have to make an WP plugin with the data as well.
>
> Any expert there yet? Was looking for urgent advices on accomplishing the task.
>
> Thanks
>
> Lenin
>
> www.twitter.com/nine_L

If you're just capturing and using the body, then load the page with file_get_contents() and use preg_match() to select the body or individual tags, etc.

For more control, maybe try this:

    $doc = new DOMDocument();
    $doc->loadHTMLFile('http://example.com/page.html');

Then use the DOM functions documented at: http://php.net/manual/book.dom.php

--
Thanks!
-Shawn
http://www.spidean.com
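To make the two suggestions above concrete, here is a minimal sketch of both, run against an inline sample string rather than a live page. The function names body_via_regex() and body_via_dom() and the sample markup are made up for illustration; they are not from the thread. It assumes PHP 7.1 or later with the DOM extension enabled.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<html><head><title>Demo</title></head>'
      . '<body class="home"><h1>Hello</h1><p>First paragraph.</p></body></html>';

// Approach 1: capture everything between <body ...> and </body> with a regex.
// Quick, but fragile on malformed or unusual markup.
function body_via_regex(string $html): ?string
{
    if (preg_match('~<body\b[^>]*>(.*?)</body>~is', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}

// Approach 2: parse the markup with DOMDocument and serialize the
// children of <body>. Slower, but tolerant of sloppy real-world HTML.
function body_via_dom(string $html): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return null;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return trim($out);
}

echo body_via_regex($html), "\n";
echo body_via_dom($html), "\n";
```

For a real page you would first fetch the markup, for example with $html = file_get_contents('http://example.com/page.html'); and check that the result is not false before passing it to either function.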
From: Lenin on 2 May 2009 11:48
I thought more experts would share their insight into scraping methods.

I want to grab the body content of pages, say WordPress pages, but not through RSS. I assume the pages are static only. I want to scrape the body content while avoiding the sidebar, footer, header, etc.

I tried the DOM approach and it's fun, but I would like to hear some expert experience specific to my problem.

Thanks in advance.
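One possible way to skip the header, sidebar, and footer is to query only the container that holds the post, using DOMXPath. The sketch below is an assumption-laden illustration, not a tested recipe for any particular site: the function name extract_main_content() is made up, and the "entry-content" class and "content" id are only guesses at what a theme might use, so inspect the target pages and adjust the queries.

```php
<?php
// Return the text of the first node matched by any of the XPath queries,
// or null when none of them match.
function extract_main_content(string $html, array $queries): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($queries as $query) {
        $nodes = $xpath->query($query);
        if ($nodes !== false && $nodes->length > 0) {
            // Collapse runs of whitespace so the text is easier to store.
            return trim(preg_replace('/\s+/', ' ', $nodes->item(0)->textContent));
        }
    }
    return null;
}

// Assumed selectors, tried in order; header, sidebar, and footer are
// never matched, so their text is left out.
$queries = [
    '//div[contains(concat(" ", normalize-space(@class), " "), " entry-content ")]',
    '//div[@id="content"]',
];

// Hypothetical page layout used only for this example.
$html = '<html><body>'
      . '<div id="header">Site title</div>'
      . '<div id="content"><div class="post entry-content"><p>Post text.</p></div></div>'
      . '<div id="sidebar">Links</div>'
      . '<div id="footer">Copyright</div>'
      . '</body></html>';

echo extract_main_content($html, $queries), "\n"; // prints "Post text."
```

Selecting the wanted container tends to be more robust than trying to strip every unwanted block, since themes add and rename sidebars and footers freely.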
From: Lenin on 2 May 2009 12:40
On Sat, May 2, 2009 at 10:01 PM, <vincent.perie-oatpri(a)myoffice.mobistar.be> wrote:
> I am currently out of the office too !!!!
>
> TEST !!!!!

I don't get why I receive this automated mail every time I send a message to this thread. :-/