This is a discussion on setlocale and regular expressions within the PHP Language forums, part of the PHP Programming Forums category.
Hi everybody.
I've got a problem with setting the locale and regular expressions. I can set the locale without any trouble - I'm using the LC_ALL category. In Perl, setting the locale via LC_CTYPE also works with regexes, but I was trying to apply the simple pattern [a-z] to Polish characters, just like this:

    setlocale(LC_ALL, 'pl_PL');

    // testing purposes
    //$locale_info = localeconv();
    //print_r($locale_info);
    //echo strftime("%A %e %B %Y", mktime(0, 0, 0, 6, 16, 2005));

    if (preg_match('/^[a-z]{2,10}$/', 'aezztest')) {
        echo 'works ;-)';
    }

Maybe I'm doing something wrong? Has anyone succeeded in using locale and regex? 'locale -a' also lists pl_PL.iso88592, but when I use it I get the same result as with pl_PL.

Thanks in advance for any help.

Best regards,
R
R <ruthless@poczta.onet.pl> wrote:
> if (preg_match('/^[a-z]{2,10}$/', 'aezztest'))
>                                    ^^^^ wow google did some magic here ;-)
> polish characters meant to be here (in place of 'aezztest')

Just in case you didn't know: Google's G2 thingy is extremely borken.

> whatever ;-) the problem is that this regex isn't working...

That's because [a-z] is equivalent to [\x61-\x7a]. If you want to match lowercase extended characters in ISO 8859-2 encoding, you need to generate a character class like [\xb1\xb3\xb9-\xbc] (and lots more, I guess).
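For what it's worth, here is a minimal sketch of the explicit character class idea above, narrowed to the nine Polish lowercase letters. It assumes the subject string is ISO 8859-2 encoded (not UTF-8); the variable names and the sample word are only illustrative and are not from the original posts.

```php
<?php
// Polish lowercase letters as ISO 8859-2 byte values (assumption: the
// subject is ISO 8859-2, not UTF-8):
//   a-ogonek=B1  c-acute=E6  e-ogonek=EA  l-stroke=B3  n-acute=F1
//   o-acute=F3   s-acute=B6  z-acute=BC   z-dot=BF
// Single quotes keep the backslashes intact so PCRE sees the \xHH escapes.
$polishLower = '\xb1\xe6\xea\xb3\xf1\xf3\xb6\xbc\xbf';

// The original pattern, extended with the Polish bytes.
$pattern = '/^[a-z' . $polishLower . ']{2,10}$/';

// Sample word "zolw" with diacritics, written as ISO 8859-2 bytes:
// BF (z-dot), F3 (o-acute), B3 (l-stroke), then plain "w".
$subject = "\xbf\xf3\xb3w";

$plainResult    = preg_match('/^[a-z]{2,10}$/', $subject); // ASCII range only
$extendedResult = preg_match($pattern, $subject);          // ASCII + Polish

echo "plain [a-z]:    $plainResult\n";
echo "extended class: $extendedResult\n";
```

The plain [a-z] range rejects the word because its first byte is outside \x61-\x7a, while the extended class accepts it. Uppercase letters would need their own byte values added in the same way.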
So the conclusion is: Perl's locale != PHP's locale.

If the Polish locale is set in Perl, [a-z] matches both (all) characters: Latin and Polish. Maybe, since PHP uses PCRE, it should also use the locale when matching patterns? It would be extremely useful.

Cheers,
R
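One thing that might be worth trying, as an untested sketch: a POSIX class such as [[:lower:]] is resolved through PCRE's character tables instead of a fixed byte range, and the PHP manual notes that the definition of letters can vary when locale-specific matching is in effect. Whether the Polish sample below actually matches depends on the locale being installed and on the PHP/PCRE build, so no result is claimed for it. The helper name tryCtypeLocale and the locale name variants are assumptions for illustration.

```php
<?php
// Hypothetical helper: tries several locale names for LC_CTYPE and returns
// the accepted one, or false if none is available on this system.
function tryCtypeLocale(array $candidates)
{
    // setlocale() accepts an array and tries each entry until one succeeds.
    return setlocale(LC_CTYPE, $candidates);
}

// The locale name spellings are assumptions; check 'locale -a' for the real ones.
$active = tryCtypeLocale(array('pl_PL.iso88592', 'pl_PL.ISO-8859-2', 'pl_PL'));

// POSIX class instead of the byte range [a-z].
$pattern = '/^[[:lower:]]{2,10}$/';

$ascii  = 'test';
$polish = "\xbf\xf3\xb3w"; // sample Polish word as ISO 8859-2 bytes

echo 'locale: ', ($active === false ? 'not available' : $active), "\n";
echo 'ascii:  ', preg_match($pattern, $ascii), "\n";
echo 'polish: ', preg_match($pattern, $polish), "\n";
```

If the 'polish' line prints 0 even with the locale active, the explicit byte list remains the dependable fallback.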