Downloading and parsing web-stuff

04-22-2005
David Rasmussen

Very basic:

What is the easiest way in PHP to download the source code (HTML etc.)
of a given URL (say, http://www.google.com) and parse this code for
certain patterns?

I guess my question can be split in two:

1) How do I download a webpage (into a string or whatever)?

2) How can I do string manipulation, regexp matching, information
extraction etc. on the downloaded information?

/David

04-22-2005
BKDotCom

David Rasmussen wrote:
> I guess my question can be split in two:
>
> 1) How do I download a webpage (into a string or whatever)?


$string = file_get_contents('http://some.url/blah');
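
A slightly more defensive sketch of the same call. Fetching a URL this way only works when allow_url_fopen is enabled in php.ini, and file_get_contents returns false on failure, so it is worth checking the result before parsing. The helper name fetch_page is made up for illustration and is not part of PHP:

```php
<?php
// Hypothetical helper (not from the thread): fetch a URL or file into a
// string, returning null instead of false when the read fails.
function fetch_page(string $url): ?string
{
    // @ suppresses the warning; failure is reported via the return value.
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}
```

It could be called as $string = fetch_page('http://some.url/blah'); followed by a check for null before any matching is attempted.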

> 2) How can I do string manipulation, regexp matching, information
> extraction etc. on the downloaded information?


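Now look at the docs for preg_match or ereg.
I prefer preg_match:

// i = ignore case, s = let . match newlines, so multi-line titles match too
if ( preg_match('|<title>(.*?)</title>|is', $string, $matches) )
{
    // $matches[0] is the whole match, $matches[1] the text inside the tags
    print_r($matches);
}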
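
The same idea can be wrapped in a small function, as a rough sketch. The name extract_title and the sample string are assumptions for illustration only. preg_match is used rather than ereg because the ereg functions were removed in PHP 7.0. A regex like this is fine for quick extraction but is not a full HTML parser:

```php
<?php
// Hypothetical helper: return the text of the first <title> element, or null.
function extract_title(string $html): ?string
{
    // i = case-insensitive, s = let . match newlines; [^>]* tolerates attributes.
    if (preg_match('|<title[^>]*>(.*?)</title>|is', $html, $matches)) {
        // Decode entities such as &amp; and trim surrounding whitespace.
        return trim(html_entity_decode($matches[1]));
    }
    return null;
}

// Made-up sample input with an upper-case tag and a title split over two lines.
$sample = "<html><head><TITLE>\n Tom &amp; Jerry </TITLE></head></html>";
echo extract_title($sample), "\n";
```

For the sample above this prints Tom & Jerry; a page without a title element gives null.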

04-22-2005

Treat a full URL as a file.

// file() keeps each line's trailing newline, so join with an empty string
$contents = implode('', file("http://www.google.com/"));

Then go to www.php.net/preg_match/ to read up on PCRE (Perl-compatible
regular expressions). See also the ereg_* functions.
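
As one illustration of PCRE-based extraction, here is a minimal sketch that pulls every link target out of a page with preg_match_all. The function name extract_links and the sample markup are invented for this example, and the pattern only handles quoted href values:

```php
<?php
// Hypothetical helper: collect the href value of every <a> tag in $html.
function extract_links(string $html): array
{
    // Group 1 is the opening quote character, group 2 the URL itself;
    // \1 requires the closing quote to be the same kind as the opening one.
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $html, $matches);
    return $matches[2];
}

// Made-up sample: mixed case, mixed quote styles, an extra attribute.
$sample = '<a href="http://www.php.net/">PHP</a> <A class="x" HREF=\'/docs\'>Docs</A>';
print_r(extract_links($sample));
```

With the sample string the returned array holds http://www.php.net/ and /docs, in that order. When nothing matches, the result is an empty array.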

HTH.

-Mike
