Eregi pattern matching - bit of a challenge I thinks (alt.comp.lang.php)
Hi,

I'm trying to detect any links that are contained within an HTML page using eregi pattern matching. I was wondering if there are any pattern matching geniuses out there who could write a pattern that merges all the different ways in which a link could be written. Current patterns I can think of include:

No quotation marks around the URL:
<a href=x.com>     no spaces between href, equals and URL
<a href =x.com>    space between href and equals, no space between equals and URL
<a href= x.com>    no space between href and equals, space between equals and URL
<a href = x.com>   space between href and equals, space between equals and URL

Single quotation marks around the URL:
<a href='x.com'>     no spaces between href, equals and URL
<a href ='x.com'>    space between href and equals, no space between equals and URL
<a href= 'x.com'>    no space between href and equals, space between equals and URL
<a href = 'x.com'>   space between href and equals, space between equals and URL

Double quotation marks around the URL:
<a href="x.com">     no spaces between href, equals and URL
<a href ="x.com">    space between href and equals, no space between equals and URL
<a href= "x.com">    no space between href and equals, space between equals and URL
<a href = "x.com">   space between href and equals, space between equals and URL

Mismatched quotation marks around the URL, single to open, double to close:
<a href='x.com">     no spaces between href, equals and URL
<a href ='x.com">    space between href and equals, no space between equals and URL
<a href= 'x.com">    no space between href and equals, space between equals and URL
<a href = 'x.com">   space between href and equals, space between equals and URL

Mismatched quotation marks around the URL, double to open, single to close:
<a href="x.com'>     no spaces between href, equals and URL
<a href ="x.com'>    space between href and equals, no space between equals and URL
<a href= "x.com'>    no space between href and equals, space between equals and URL
<a href = "x.com'>   space between href and equals, space between equals and URL

I guess what's needed is something more advanced than

eregi("href=\"/(.*)\">", $string, $arryaholding_results)

I'd appreciate any help you could give.

Thanks,
NimP
"NimP" <stu@sturobbie.co.uk> wrote:
> Hi, I'm trying to detect any links that are contained within an html
> page using eregi pattern matching. I was wondering if there are any
> pattern matching geniuses out there who could write a pattern that
> merges all the different manners in which a link could be written,

I'm sure there is an easier solution out there somewhere, but going through your examples I came up with this (it won't validate a URL, though):

preg_match("/<a(\s)+href(\s)*=(\s)*(['\"])*([a-z0-9_\-\.])+(['\"])*>/i", $string, $matches);
echo htmlentities($matches[0]);

JOn
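A follow-up sketch for anyone landing on this thread: eregi() was removed in PHP 7.0, so the preg_* functions are the ones to use today. The preg_match() pattern above only finds the first link, expects href to be the only attribute in the tag, and its character class rejects URLs containing "/" or ":". The version below is one possible loosening, not a definitive answer. It uses preg_match_all() to collect every href, tolerates other attributes before href, and accepts no quotes, matching quotes, or mismatched quotes. The function name extract_links and the sample HTML are made up for illustration and are not from the original posts.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
// Returns the href value of every <a ...> tag found in $html.
function extract_links(string $html): array
{
    // <a\s+                 "<a" followed by whitespace (so <abbr> is skipped)
    // (?:[^>]*?\s+)?        optionally, other attributes before href
    // href\s*=\s*           href, "=", with optional spaces on either side
    // ["\']?                optional opening quote, single or double
    // ([^"\'\s>]+)          the URL: anything up to a quote, space or ">"
    // ["\']?                optional closing quote (may differ from the opener)
    $pattern = '/<a\s+(?:[^>]*?\s+)?href\s*=\s*["\']?([^"\'\s>]+)["\']?/i';

    preg_match_all($pattern, $html, $matches);

    // $matches[1] holds the first capture group of every match.
    return $matches[1];
}

// Sample input covering three of the variants listed in the question.
$html = '<p><a href=x.com>one</a> '
      . '<A HREF = \'y.org"> two</a> '
      . '<a class="nav" href="http://z.net/page?id=1">three</a></p>';

echo implode("\n", extract_links($html)), "\n";
// Prints:
// x.com
// y.org
// http://z.net/page?id=1
```

Known limits of this sketch: a quoted URL containing a space is cut off at the first space, an empty href is skipped, and links inside HTML comments are still reported. If the pages are not under your control, an HTML parser such as PHP's DOMDocument is likely to be more robust than any single regular expression.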