setlocale and regular expressions

#1  06-16-2005  R

Hi everybody.

I've got a problem with locales and regular expressions.

I can set the locale without any trouble; I'm using the LC_ALL category.

In Perl, setting the LC_CTYPE locale also works with regexes. In PHP I
tried to apply the simple pattern [a-z] to Polish characters, like this:

setlocale(LC_ALL, 'pl_PL');

// testing purposes
//$locale_info = localeconv();
//print_r($locale_info);
//echo strftime("%A %e %B %Y", mktime(0, 0, 0, 6, 16, 2005)); // mktime(hour, minute, second, month, day, year)

if (preg_match('/^[a-z]{2,10}$/', 'aezztest'))
{
    echo 'works ;-)';
}

Maybe I'm doing something wrong?
Has anyone succeeded in using a locale with regexes?
'locale -a' also lists pl_PL.iso88592, but when I use it I get the
same result as with pl_PL.

Thanks in advance for any help.
best regards
R

#2  06-16-2005  R

if (preg_match('/^[a-z]{2,10}$/', 'aezztest'))
                                   ^^^^ wow, Google did some magic here ;-)
Polish characters were meant to be here (in place of 'aezztest').

Whatever ;-) The problem is that this regex isn't working...

#3  06-17-2005  Daniel Tryba

R <ruthless@poczta.onet.pl> wrote:
> if (preg_match('/^[a-z]{2,10}$/', 'aezztest'))
> ^^^^ wow google did some
> magic here ;-)
> polish characters
> meant to be here (in place of 'aezztest'


Just in case you didn't know: Google's G2 thingy is extremely borken.

> what ever ;-) the problem is that this regex isn't working...


That's because [a-z] is equivalent to [\x61-\x7a].

If you want to match lowercase extended characters in ISO-8859-2
encoding, you need to build a character class like:
[\xb1\xb3\xb9-\xbc]
(and lots more, I guess).
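
To illustrate the point about byte ranges, here is a minimal sketch. It assumes the subject string is ISO-8859-2 encoded and that the pattern is used without the /u modifier, so PCRE compares single bytes. The byte values for the nine Polish lowercase letters come from the ISO-8859-2 code table; the sample word and the variable names are made up for the example.

```php
<?php
// "żółć" as ISO-8859-2 bytes (hypothetical test input):
// ż = \xbf, ó = \xf3, ł = \xb3, ć = \xe6
$word = "\xbf\xf3\xb3\xe6";

// [a-z] only covers bytes 0x61-0x7a, so none of these bytes match.
$asciiOnly = '/^[a-z]{2,10}$/';

// Explicit class: a-z plus the Polish lowercase letters in ISO-8859-2:
// ą=\xb1 ć=\xe6 ę=\xea ł=\xb3 ń=\xf1 ó=\xf3 ś=\xb6 ź=\xbc ż=\xbf
// Single quotes: PHP passes the \x.. escapes through and PCRE decodes them.
$polishLower = '/^[a-z\xb1\xe6\xea\xb3\xf1\xf3\xb6\xbc\xbf]{2,10}$/';

var_dump(preg_match($asciiOnly, $word));   // int(0)
var_dump(preg_match($polishLower, $word)); // int(1)
```

The explicit class has to be maintained by hand and is only valid for this one encoding; the same letters have different byte values in other encodings such as windows-1250 or UTF-8.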

#4  06-17-2005  R

So the conclusion is:

Perl's locale != PHP's locale

In Perl, if a Polish locale is set, [a-z] matches both (all) kinds of
characters: Latin and Polish.

Maybe, since PHP uses PCRE, it should also use the locale when matching
patterns?
It would be extremely useful.

cheers
R
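
One way to match lowercase letters without listing bytes by hand is PCRE's Unicode mode. This is a different technique from the locale approach discussed in this thread, shown here only as a sketch. It assumes the subject is UTF-8 encoded and that PCRE was built with Unicode property support (needed for \p{Ll}); the sample words and variable names are made up for the example.

```php
<?php
// "żółć" as UTF-8 bytes (hypothetical test input):
// ż = C5 BC, ó = C3 B3, ł = C5 82, ć = C4 87
$lower = "\xc5\xbc\xc3\xb3\xc5\x82\xc4\x87";

// Same word with an uppercase first letter: Ż = C5 BB
$capitalized = "\xc5\xbb\xc3\xb3\xc5\x82\xc4\x87";

// /u treats pattern and subject as UTF-8;
// \p{Ll} matches any Unicode lowercase letter.
$pattern = '/^\p{Ll}{2,10}$/u';

var_dump(preg_match($pattern, $lower));       // int(1)
var_dump(preg_match($pattern, $capitalized)); // int(0)

// Even in UTF-8 mode, [a-z] still only covers the ASCII letters.
var_dump(preg_match('/^[a-z]{2,10}$/u', $lower)); // int(0)
```

Input in ISO-8859-2 would have to be converted to UTF-8 first (for example with iconv) before a /u pattern can be applied to it.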
