PHP: extract links AND description from html (alt.comp.lang.php, PHP Programming Forums)
Extracting just the links from a webpage is no problem for me, using the regex /<a ([^>]*)>/i, but now I want to extract the link and the description that stands between the <a href=> and the </a> tag. As a result from the script that I'm searching for, I want to get the full

<a href=http://www.blabla.com/test/d.html>DESCRIPTION</a>

Can anybody give me some hint how to do this?
Nils Jansen wrote:
> as a result from the script that i'm searching for, i want to get the
> full
>
> <a href=http://www.blabla.com/test/d.html>DESCRIPTION</a>
>
> can anybody give me some hint, how to do this?

Try this (note: array_combine is a PHP 5 specific function; see the manual entry for this function on php.net for a PHP 4 example):

<?php
// Fetch the content
$file = file_get_contents("http://www.php.net/");

// Construct the regular expression
// (does not accept image links; [^>]* keeps each match inside one tag,
// so several links on the same line are matched separately)
$reg = "#<a\s[^>]*href\s*=\s*(\"|')?([^\"'>]+)[^>]*>([^<>]+)</a>#i";

// Parse $file: $matches[2] holds the URLs, $matches[3] the descriptions
if (preg_match_all($reg, $file, $matches)) {
    print "<pre>";
    print_r(array_combine($matches[2], $matches[3]));
    print "</pre>";
}
?>

HTH;
JW
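
One thing to watch with array_combine is that the URLs become array keys, so a URL that appears twice on the page ends up as a single entry. A minimal sketch of a variant that keeps every link as its own pair could look like the following. The helper name extract_links() and the sample HTML are made up for illustration and are not part of the original reply. Like the reply's pattern, it only picks up plain-text links and skips links that wrap an image or other tags.

```php
<?php
// Hypothetical helper (not from the thread): returns a list of
// array(url, description) pairs for plain-text links found in $html.
function extract_links($html)
{
    // [^>]* keeps each match inside a single tag, and \1 requires the
    // closing quote to match the opening one (or both to be absent).
    $reg = '#<a\s[^>]*?href\s*=\s*(["\']?)([^"\'>\s]+)\1[^>]*>([^<>]+)</a>#i';

    $links = array();
    // PREG_SET_ORDER groups the results per link instead of per sub-pattern
    if (preg_match_all($reg, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $links[] = array($m[2], trim($m[3]));
        }
    }
    return $links;
}

// Sample input (made up): one quoted and one unquoted href on the same line
$html = '<p><a href="http://www.blabla.com/test/d.html">DESCRIPTION</a> and '
      . "<a class='x' href=http://www.php.net/>PHP</a></p>";

print_r(extract_links($html));
```

To rebuild the full <a href=...>DESCRIPTION</a> string that the question asks for, $m[0] in the loop already holds the complete matched tag. For messy real-world pages, PHP's DOM extension is usually a sturdier choice than a regular expression.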