This is a discussion on Headers problem within the alt.comp.lang.php forums, part of the PHP Programming Forums category.
Hello.

I have to prepare an analyzer that analyzes the content of a given URL. My PHP scripts open the given URL as a file and read it into a variable. The problem is that this doesn't work with all pages. Some pages need to know what kind of browser (user agent) is opening them, and if that information is missing they don't display anything, so my variable is empty. I have tried to use curl with something like this:

    $file="my_url";
    $agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $plik);
    $ret = curl_exec($ch);

Nothing happens: the site is not read. If I add something like this to the curl section

    curl_setopt($ch, CURLOPT_URL, $plik);

it shows me the whole page, and I don't want it to be shown; I only want it read into a string variable. If you know some way to do this, please help.

Dominik
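
For the file-style reading described above, PHP's HTTP stream wrapper can also send a user agent through a stream context. The following is only a minimal sketch: the helper name `make_agent_context` is made up for illustration, and it assumes `allow_url_fopen` is enabled and that `$url` holds the page to analyze.

```php
<?php
// Hypothetical helper: build a stream context that sends the given
// User-Agent header when a URL is opened via the http:// stream wrapper.
function make_agent_context(string $agent)
{
    return stream_context_create([
        'http' => [
            'user_agent' => $agent,
        ],
    ]);
}

$agent   = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)';
$context = make_agent_context($agent);

// Assumption: $url is defined elsewhere and allow_url_fopen is on.
// $content = file_get_contents($url, false, $context);
```

The page is then returned as a string rather than printed, so nothing is shown unless the script echoes `$content` itself.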
Kimmo Laine <antaatulla.sikanautaa@gmail.com.NOSPAM.invalid>
"Dominik" <no.spam@w.pl> wrote in message dma9j2$gdi$1@atlantis.news.tpi.pl...
> Hello.
>
> I have to prepare some analyzer, which would analyze content of given url.
> So my PHP files are opening given url as a file, and read it into
> variable.
> The problem is that it doesn't work with all given pages. Some pages
> require
> to know what kind of browser (user agent) is opening page, and if there is
> no such information they don't display anything, and my variable is empty.
> I
> have tried to use curl with something like this:
>
> $file="my_url";
>
> $agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";
> $ch = curl_init();
> curl_setopt($ch, CURLOPT_USERAGENT, $plik);
> $ret = curl_exec($ch);
>
> Nothing happens - the site is not read. If I add in curl section
> something
> like this
>
> curl_setopt($ch, CURLOPT_URL, $plik);
>
> It shows me all the page, and I don't want to be shown, I want only to be
> read into string variable.
> If you know some way to do this, please help.

Couldn't you grab it with output buffering? If you had something like this:

    ob_start();
    .... your code here ...
    $str = ob_get_contents();
    ob_end_clean();

then I'd imagine that instead of showing the page, the source would be stored in $str...

--
SETI @ Home - Donate your cpu's idle time to science.
Further reading at <http://setiweb.ssl.berkeley.edu/>
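
The output buffering idea can be sketched roughly as below. This is an illustration only: `print_page` is a hypothetical stand-in for whatever code prints the page (for example `curl_exec()` called without `CURLOPT_RETURNTRANSFER`), and `capture_output` is a made-up helper name.

```php
<?php
// Hypothetical stand-in for code that prints a page instead of returning it.
function print_page(): void
{
    echo "<html><body>example</body></html>";
}

// Run $producer with output buffering on and return whatever it printed.
function capture_output(callable $producer): string
{
    ob_start();
    try {
        $producer();
        // Copy the buffer contents before the buffer is discarded below.
        return (string) ob_get_contents();
    } finally {
        // Always discard the buffer so nothing reaches the browser.
        ob_end_clean();
    }
}

$str = capture_output('print_page');
```

The `finally` block is there so the buffer is closed even if the wrapped code throws; otherwise the buffered output could end up being flushed at the end of the script.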
Dominik wrote:
> If you know some way to do this, please help.
>

    $agent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)';
    $url = 'http://www.google.nl/';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $ret = curl_exec($ch);

JW
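
The same calls could be wrapped in a small function that also checks for failure, since `curl_exec()` returns `false` when the transfer fails. This is just a sketch: the function name `fetch_page`, the timeout, and the decision to follow redirects are assumptions added here, not part of the reply above.

```php
<?php
// Hypothetical helper: fetch $url with the given user agent and return the
// response body as a string, or throw if the transfer fails.
function fetch_page(string $url, string $agent, int $timeoutSeconds = 10): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $agent,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // assumption: redirects should be followed
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() gives a human-readable reason for the failure.
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $body;
}
```

Calling `fetch_page($url, $agent)` with the values from the reply would then give the page source as a string, and an unreachable or invalid URL surfaces as an exception instead of an empty variable.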