Help me with a regular expression for PHP

  #1 (permalink)  
Old 10-29-2006
cendrizzi
 
Posts: n/a
Default Help me with a regular expression for PHP

I have no idea where to get help with regular expressions (RE). Since
this is for a PHP app, I thought I would ask here to see if there are
any RE pros. Basically, I'm doing some template work and want to use
preg_replace_callback to call another function whenever the RE
matches, but I have no idea how to write the pattern.

So I start with this:
/<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

but I need to modify it so that it matches only when the name contains
'{' characters, and does not match otherwise.

So this would not match:
<input name="test">

But this would match:
<input name="test{0}">

Thanks much in advance.

  #2 (permalink)  
Old 10-30-2006
Pedro Graca
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/


You'd better not use regular expressions to validate HTML.
The following line is perfectly valid HTML (I think in any version)

<input type="text" name="x><y" id="xy">

> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.
>
> So this would not match:
> <input name="test">
>
> But this would match:
> <input name="test{0}">


Get the name. Verify it has '{' and '}' (in that order and once only?)

<?php
$name = get_name('<input name="test{0}">'); // 'test{0}'
if (name_is_valid($name)) {
    // whatever
}

function get_name($html) {
    return 'test{0}'; // sorry! (placeholder, no real extraction here)
}

// Valid means: exactly one '{', exactly one '}', and '{' comes first.
function name_is_valid($name) {
    if (($p1 = strpos($name, '{')) === false) return false;  // no '{'
    if (strpos($name, '{', $p1+1) !== false) return false;   // a second '{'
    if (($p2 = strpos($name, '}')) === false) return false;  // no '}'
    if (strpos($name, '}', $p2+1) !== false) return false;   // a second '}'
    return $p1 < $p2;
}
?>
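
The get_name() above is only a placeholder. One way it might be filled in is sketched below. This is not Pedro's code: the pattern and the null return value are assumptions, and it only handles a name attribute written with double quotes.

```php
<?php
// Hypothetical replacement for the placeholder get_name() in the post above.
// Assumes a double-quoted name attribute; returns null when none is found.
function get_name($html) {
    // \b keeps "classname=" from matching; [^"]* allows '<' and '>' in the value.
    if (preg_match('/\bname\s*=\s*"([^"]*)"/i', $html, $m)) {
        return $m[1];
    }
    return null;
}

var_dump(get_name('<input name="test{0}">'));
var_dump(get_name('<input type="text" name="x><y" id="xy">'));
var_dump(get_name('<input type="text">'));
```

Because the value is captured with [^"]*, a name such as "x><y" comes back intact, which the [^>]* approach cannot guarantee.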

--
I (almost) never check the dodgeit address.
If you *really* need to mail me, use the address in the Reply-To
header with a message in *plain* *text* *without* *attachments*.
  #3 (permalink)  
Old 10-30-2006
cendrizzi
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

It's not for validation. It's for some custom template stuff that
tells my code where to store the value of the form element in the
session. That may not make sense, but it's what I need for my
application. So I use ob_start and the related output-buffering
functions, and run regular expressions against the buffer to
manipulate the HTML or change the behavior of certain elements. I
could just get the name of each element and check it with strpos or
strstr for the '{' character, but I hoped I could use the RE to check
for it from the start so it wouldn't require the extra string searches.

Hope that makes sense. It's always a bit of a challenge to explain
things clearly, especially when the program is quite a big one.

  #4 (permalink)  
Old 10-30-2006
Pedro Graca
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

cendrizzi top-posted and totally messed it up:
> I hoped I could use RE to check from the start if it had that so it
> wouldn't require the extra string searches.



<?php
$data = array(
    '<input type="text" name="no!" id="test0">',
    '<input type="text" name="no{!}" id="test0">',
    '<input type="text" name="test0" id="test0">',
    '<input type="text" name="test 0" id="test0">',
    '<input type="text" name="test{0}" id="test0">',
    '<input type="text" name="test {0}" id="test0">',
    '<input type="text" name="test{0}test" id="test0">',
    '<input type="text" name="test {0} test" id="test0">',
);
$rx = '/<(input|select|textarea)[^>]*' .
    # 'name\s*\=\s*\"[_a-zA-Z0-9\s]*\"' . // your original version
    'name\s*\=\s*\"[_a-zA-Z0-9\s]*\{[_a-zA-Z0-9\s]*\}[_a-zA-Z0-9\s]*\"' .
    # added: a literal { ... } pair, escaped so it cannot be read as a quantifier
    '[^>]*>/';
### I think there's a few \ too many in there,
### I didn't look at it very attentively

foreach ($data as $val) {
    echo $val, ' :: ';
    if (preg_match($rx, $val)) {
        echo 'M';
    } else {
        echo 'No m';
    }
    echo "atch.\n";
}
?>

  #5 (permalink)  
Old 10-30-2006
BKDotCom
 
Posts: n/a
Default Re: Help me with a regular expression for PHP


Pedro Graca wrote:
> The following line is perfectly valid HTML (I think in any version)
>
> <input type="text" name="x><y" id="xy">


I would have to disagree.
<input type="text" name="x> is invalid: there is no closing quote
around the name value.
<y" id="xy"> is invalid: y" isn't a valid cname (only
alphanumeric?).

If you want 'x><y' as a value, you'd need to use name="x&gt;&lt;y".

  #6 (permalink)  
Old 10-30-2006
BKDotCom
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

I had a similar RE problem and never figured it out or found an
answer. I basically ended up either using two callbacks or doing the
second check (does it contain "x") inside the first callback.

In other words: capture every name value and send it to the callback,
whether or not it contains the "{", and then check inside the callback
whether the name value contains "{".
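
The check-inside-the-callback idea could look roughly like the sketch below. The function name rewrite_field() and the data-store attribute are made up for illustration, and the pattern assumes a double-quoted name with no '>' elsewhere in the tag.

```php
<?php
// Sketch of the "check inside the callback" approach. rewrite_field() and
// the data-store attribute are hypothetical, used only to make the change visible.
function rewrite_field(array $m) {
    // $m[0] is the whole tag, $m[1] the element name, $m[2] the name value.
    if (strpos($m[2], '{') === false) {
        return $m[0];   // no '{': hand the tag back unchanged
    }
    // Placeholder transformation: add a marker attribute to the element.
    return str_replace('<' . $m[1], '<' . $m[1] . ' data-store="session"', $m[0]);
}

$rx   = '/<(input|select|textarea)\b[^>]*\bname\s*=\s*"([^"]*)"[^>]*>/';
$html = '<input name="test"> <input name="test{0}">';

echo preg_replace_callback($rx, 'rewrite_field', $html), "\n";
// <input name="test"> <input data-store="session" name="test{0}">
```

This costs one strpos() per matched element, but it keeps the pattern simple and leaves the decision logic in plain PHP.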

  #7 (permalink)  
Old 10-30-2006
Chung Leong
 
Posts: n/a
Default Re: Help me with a regular expression for PHP


cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/


Well, just change the [_a-zA-Z0-9\s]* part to [\w\s]*{[\w\s]*}. Of
course, you'll need to do proper capturing in order to form the
replacement string.

\w is equivalent to [_a-zA-Z0-9] by the way.
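
As a rough illustration of what "proper capturing" might mean here: parentheses in the pattern hand the pieces of the match to the callback, which can then build the replacement. The group layout and the annotate_field() helper below are assumptions, not something from the original application.

```php
<?php
// Sketch: match only names containing {...} and capture the parts needed
// to build a replacement. annotate_field() is a hypothetical callback.
function annotate_field(array $m) {
    // $m[1] = element, $m[2] = text before '{', $m[3] = text inside the braces
    return sprintf('<!-- %s: store "%s" under key "%s" -->%s',
                   $m[1], trim($m[2]), trim($m[3]), $m[0]);
}

$rx = '/<(input|select|textarea)\b[^>]*\bname\s*=\s*"([\w\s]*)\{([\w\s]*)\}[\w\s]*"[^>]*>/';

echo preg_replace_callback($rx, 'annotate_field', '<input name="test">'), "\n";
// <input name="test">
echo preg_replace_callback($rx, 'annotate_field', '<input name="test{0}">'), "\n";
// <!-- input: store "test" under key "0" --><input name="test{0}">
```

Tags whose name has no braces never reach the callback, so no extra string search is needed for them.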

  #8 (permalink)  
Old 10-30-2006
cendrizzi
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

No, I didn't know that \w was the same. What do you mean by proper
capturing? I really am a two-year-old when it comes to RE stuff.

Thanks!

  #9 (permalink)  
Old 10-30-2006
John Dunlop
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

BKDotCom:

> Pedro Graca wrote:
>
> > The following line is perfectly valid HTML (I think in any version)
> >
> > <input type="text" name="x><y" id="xy">


Yes, yes it is. In any version.

> I would have to disagree


Run it through a validator. You'll find it's valid.

The 'name' attribute is defined as CDATA, so pretty much anything goes
if the attribute value is quoted, including literal less-than and
greater-than signs.
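
To see why this matters for the patterns in this thread, here is a small sketch comparing a naive [^>]* tag match with a quote-aware one on Pedro's example. Both patterns are illustrative assumptions and neither is a full HTML parser.

```php
<?php
$tag = '<input type="text" name="x><y" id="xy">';

// Naive: "[^>]*" stops at the first '>', which sits inside the name value.
preg_match('/<input[^>]*>/', $tag, $naive);

// Quote-aware: each quoted attribute value is consumed as a unit, so '<'
// and '>' inside quotes do not end the tag.
preg_match('/<input(?:[^>"\']|"[^"]*"|\'[^\']*\')*>/', $tag, $aware);

echo $naive[0], "\n";   // <input type="text" name="x>
echo $aware[0], "\n";   // <input type="text" name="x><y" id="xy">
```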

> <input type="text" name="x> is invalid: no closing quote around
> name value


Yes, as a start-tag _in itself_. That wasn't Pedro's example though;
his example was the whole

| <input type="text" name="x><y" id="xy">

> <y" id="xy"> is invalid. y" isn't a valid cname


As a tag in itself, it is invalid HTML, yes. It isn't invalid as part
of the example above.

> (only alphanumeric?)


Generic identifiers (aka, element type names) must begin with upper- or
lowercase letters.

> if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"


No. You only need to replace '<' and '>' with references where they
would be understood as something other than character data.

--
Jock

  #10 (permalink)  
Old 10-30-2006
Pedro Graca
 
Posts: n/a
Default Re: Help me with a regular expression for PHP

Chung Leong wrote:
> \w is equivalent to [_a-zA-Z0-9] by the way.


It is /almost/ equivalent:

~$ php -r 'echo (preg_match("/^\w+$/", "Graça"))?("yes"):("no"), "\n";'
yes
~$ php -r 'echo (preg_match("/^[_a-zA-Z0-9]+$/", "Graça"))?("yes"):("no"), "\n";'
no
