Help me with a regular expression for PHP (PHP Language forum, part of the PHP Programming Forums category)
I have no idea where to get help on RE stuff. Since it's for a PHP app I thought I would ask here to see if there were some RE pros.

Basically I'm doing some template stuff and I want to use preg_replace_callback to call another function whenever the regular expression matches, but I have no idea how to accomplish it.

So I start with this:

/<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

but I need to modify it so that it only matches if the name contains '{' characters, and does not match otherwise.

So this would not match:
<input name="test">

But this would match:
<input name="test{0}">

Thanks much in advance.

cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

You'd better not use regular expressions to validate HTML. The following line is perfectly valid HTML (I think in any version):

<input type="text" name="x><y" id="xy">

> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.
>
> So this would not match:
> <input name="test">
>
> But this would match:
> <input name="test{0}">

Get the name. Verify it has '{' and '}' (in that order and once only?)

<?php
$name = get_name('<input name="test{0}">'); // 'test{0}'
if (name_is_valid($name)) {
    // whatever
}

function get_name($html) {
    return 'test{0}'; // sorry!
}

// Valid only if the name has exactly one '{' and exactly one '}', in that order.
function name_is_valid($name) {
    if (($p1 = strpos($name, '{')) === false) return false;  // no '{' at all
    if (strpos($name, '{', $p1+1) !== false) return false;   // a second '{'
    if (($p2 = strpos($name, '}')) === false) return false;  // no '}' at all
    if (strpos($name, '}', $p2+1) !== false) return false;   // a second '}'
    return $p1 < $p2;                                         // '{' must come first
}
?>

-- 
I (almost) never check the dodgeit address.
If you *really* need to mail me, use the address in the Reply-To
header with a message in *plain* *text* *without* *attachments*.
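The get_name() stub in the post above is left unimplemented. One possible way to fill it in without a regular expression is PHP's DOM extension. The sketch below is only an illustration: get_name_dom() is a made-up name, not something from the thread, and it assumes the dom/libxml extension is loaded.

```php
<?php
// Hypothetical helper (not from the thread): returns the name attribute of an
// input, select or textarea element in an HTML fragment, or null if none has one.
// Elements are checked tag type by tag type, so this is not strict document order.
function get_name_dom(string $html): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach (['input', 'select', 'textarea'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $element) {
            if ($element->hasAttribute('name')) {
                return $element->getAttribute('name');
            }
        }
    }
    return null;
}

var_dump(get_name_dom('<input name="test{0}">'));
```

The returned string could then be handed to name_is_valid() from the post above. Whether parsing the whole output buffer this way is fast enough for a template layer is a separate question that this sketch does not address.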

It's not for validation. It's for some custom template stuff that tells my code where to store the value of the form element in the session. That may not make sense but it's what I need for my application.

So I use the ob_start, etc. functions and run regular expressions against the buffer to manipulate the HTML or change the behavior of certain elements.

I could just get the name of each element and check it with strpos or strstr for the '{' character, but I hoped I could use an RE to check from the start whether it had one, so it wouldn't require the extra string searches.

Hope that makes sense. It's always a bit of a challenge to explain things clearly, especially if the program is quite a big one.

On Oct 29, 4:17 pm, Pedro Graca <hex...@dodgeit.com> wrote:
> Get the name. Verify it has '{' and '}' (in that order and once only?)

cendrizzi top-posted and totally messed it up:
> I hoped I could use RE to check from the start if it had that so it
> wouldn't require the extra string searches.

<?php
$data = array(
  '<input type="text" name="no!" id="test0">          ',
  '<input type="text" name="no{!}" id="test0">        ',
  '<input type="text" name="test0" id="test0">        ',
  '<input type="text" name="test 0" id="test0">       ',
  '<input type="text" name="test{0}" id="test0">      ',
  '<input type="text" name="test {0}" id="test0">     ',
  '<input type="text" name="test{0}test" id="test0">  ',
  '<input type="text" name="test {0} test" id="test0">',
);

$rx = '/<(input|select|textarea)[^>]*' .
#     'name\s*\=\s*\"[_a-zA-Z0-9\s]*\"' . // your original version
      'name\s*\=\s*\"[_a-zA-Z0-9\s]*{[_a-zA-Z0-9\s]*}[_a-zA-Z0-9\s]*\"' .
#                                ---^---         ---^---
      '[^>]*>/';
### I think there's a few \ too many in there,
### I didn't look at it very attentively

foreach ($data as $val) {
  echo $val, ' :: ';
  if (preg_match($rx, $val)) {
    echo 'M';
  } else {
    echo 'No m';
  }
  echo "atch.\n";
}
?>

Pedro Graca wrote:
> The following line is perfectly valid HTML (I think in any version)
>
> <input type="text" name="x><y" id="xy">

I would have to disagree.

<input type="text" name="x> is invalid: no closing quote around the name value.
<y" id="xy"> is invalid: y" isn't a valid cname (only alphanumeric?).

If you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

I had a similar RE problem and never figured it out or found an answer. I basically ended up using two callbacks, or doing the second check (does it contain "x") in the first callback.

Capture and send all name values to the first callback (whether or not they contain the '{'), then check inside that callback whether the name value contains "{".

cendrizzi wrote:
> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.
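The "check inside the callback" idea from the post above could be sketched roughly like this. The callback name mark_templated(), the capture group layout and the HTML comment it emits are all made up for illustration; the thread does not say what the real replacement should be.

```php
<?php
// Hypothetical callback: $m[0] is the whole tag, $m[2] the captured name value.
function mark_templated(array $m): string
{
    if (strpos($m[2], '{') === false) {
        return $m[0]; // no brace in the name: hand the tag back untouched
    }
    // Placeholder action for names that do contain a brace.
    return '<!-- templated: ' . $m[2] . ' -->' . $m[0];
}

// Capture every name value, with or without braces, and let the callback decide.
$rx = '/<(input|select|textarea)[^>]*name\s*=\s*"([^"]*)"[^>]*>/';

$html = '<input name="test"> <input name="test{0}">';
echo preg_replace_callback($rx, 'mark_templated', $html), "\n";
```

This costs one strpos() per matched tag, which is the extra string search the original poster hoped to avoid, but it keeps the pattern itself simple.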

cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/
>
> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.

Well, just change the [_a-zA-Z0-9\s]* part to [\w\s]*{[\w\s]*}. Of course, you'll need to do proper capturing in order to form the replacement string.

\w is equivalent to [_a-zA-Z0-9] by the way.
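As a rough illustration of what capturing could look like here: parentheses around parts of the pattern make those parts available to the callback as $m[1], $m[2], and so on, so the replacement can be rebuilt from them. The grouping below and the callback strip_braces() are assumptions made for the example (it turns test{0} into test_0), not the replacement the original poster actually needs.

```php
<?php
// The suggested [\w\s]*{[\w\s]*} with capture groups added (braces escaped for clarity):
//   $m[1] tag name, $m[2] everything up to the opening quote of the name,
//   $m[3] text before "{", $m[4] text inside the braces, $m[5] the rest of the tag.
$rx = '/<(input|select|textarea)([^>]*name\s*=\s*")([\w\s]*)\{([\w\s]*)\}("[^>]*>)/';

// Hypothetical callback: rebuilds the tag with "{0}" rewritten to "_0".
function strip_braces(array $m): string
{
    return '<' . $m[1] . $m[2] . $m[3] . '_' . $m[4] . $m[5];
}

echo preg_replace_callback($rx, 'strip_braces', '<input type="text" name="test{0}" id="t">'), "\n";
echo preg_replace_callback($rx, 'strip_braces', '<input name="test">'), "\n";
```

Tags whose name has no braces do not match the pattern at all, so preg_replace_callback leaves them as they are.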

No, I didn't know that \w was the same. What do you mean by proper capturing? I really am a 2 year old when it comes to RE stuff.

Thanks!

On Oct 29, 10:04 pm, "Chung Leong" <chernyshev...@hotmail.com> wrote:
> Well, just change the [_a-zA-Z0-9\s]* part to [\w\s]*{[\w\s]*}. Of
> course, you'll need to do proper capturing in order to form the
> replacement string.
>
> \w is equivalent to [_a-zA-Z0-9] by the way.

BKDotCom:

> Pedro Graca wrote:
>
> > The following line is perfectly valid HTML (I think in any version)
> >
> > <input type="text" name="x><y" id="xy">

Yes, yes it is. In any version.

> I would have to disagree

Run it through a validator. You'll find it's valid. The 'name' attribute is defined as CDATA, so pretty much anything goes if the attribute value is quoted, including literal less-than and greater-than signs.

> <input type="text" name="x> is invalid: no closing quote around
> name value

Yes, as a start-tag _in itself_. That wasn't Pedro's example though; his example was the whole

| <input type="text" name="x><y" id="xy">

> <y" id="xy"> is invalid. y" isn't a valid cname

As a tag in itself, it is invalid HTML, yes. It isn't invalid as part of the example above.

> (only alphanumeric?)

Generic identifiers (aka, element type names) must begin with upper- or lowercase letters.

> if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

No. You only need to replace '<' and '>' with references where they would be understood as something other than character data.

-- 
Jock
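To see why a quoted '>' matters for the regular expressions discussed in this thread, here is a small sketch. The first pattern is a deliberately naive one written for this example; the second is the pattern from the original question.

```php
<?php
$html = '<input type="text" name="x><y" id="xy">';

// A naive "everything up to the first >" pattern stops inside the quoted value.
preg_match('/<input[^>]*>/', $html, $m);
echo $m[0], "\n"; // <input type="text" name="x>

// The pattern from the original question does not match this tag at all,
// because '>' and '<' are not in its name character class.
$rx = '/<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/';
var_dump(preg_match($rx, $html)); // int(0)
```

For markup the application generates itself this may never come up, but it is the kind of input that separates a regex from a real HTML parser.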

Chung Leong wrote:
> \w is equivalent to [_a-zA-Z0-9] by the way.

It is /almost/ equivalent:

~$ php -r 'echo (preg_match("/^\w+$/", "Graça"))?("yes"):("no"), "\n";'
yes
~$ php -r 'echo (preg_match("/^[_a-zA-Z0-9]+$/", "Graça"))?("yes"):("no"), "\n";'
no
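Whether the first command prints "yes" probably depends on the locale and character encoding in use, since without extra modifiers \w works on single bytes. A sketch of the same comparison with the /u (UTF-8) modifier, which as far as I know also makes \w Unicode-aware on current PHP builds, might look like this; the variable name is only for illustration.

```php
<?php
// "Graça", written with an escape so the encoding of this file does not matter.
$name = "Gra\u{00E7}a";

// With /u the subject is treated as UTF-8 and \w also covers non-ASCII letters.
var_dump(preg_match('/^\w+$/u', $name));

// The explicit ASCII class still rejects the "ç".
var_dump(preg_match('/^[_a-zA-Z0-9]+$/u', $name));
```

So if form element names may contain accented letters, the explicit class and \w are not interchangeable, and the choice should be made deliberately.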