This is a discussion on allowed characters in a string (stripping it) within the PHP Language forums, part of the PHP Programming Forums category.
"Tim Roberts" <timr@probo.com> wrote in message news:dpf1m3hicifgnclafaplh3jtibdheacfvk@4ax.com...
> "Steve" <no.one@example.com> wrote:
>>
>>>$name=preg_replace("/([a-zA-Z]|-|[$al])|./",'$1',$name);
>>
>>it's not expensive at all. and a dot is any single character...not a greedy
>>wild card. the only reason he wouldn't want a dot is because it could be an
>>'illegal' character that he's trying to get rid of anyway. as it is, he just
>>didn't escape the dot so that it is the character (period) and not the
>>directive (any single character).
>
> That statement as written will replace each character with itself, one by
> one, repeatedly, for each character in $name.
>
> It is an expensive no-op.

that is true, however each character is analyzed *as a single character*. there is no marker being set and a pattern being sought beyond that marker to see if there is another pattern match. markers are set, the replacement is made to those characters marked, the process is done. one of the least expensive operations one could ask of preg.

may be a good idea to write a pattern you think would be less expensive that does similar things...see if you can time-test compare the two. you can also measure memory consumption too. i don't think you'll find any significant consumption of resources running the above, esp. comparatively.
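For anyone who wants to run the time test suggested above, here is a minimal sketch. The value of `$al`, the `$negated` pattern, the `time_replace()` helper and the sample string are all assumptions made for illustration; none of them come from the thread. It also shows what the quoted statement does to a string: characters captured by group 1 are put back, while a character that only matches the bare dot is replaced with an empty `$1` and so disappears.

```php
<?php
// Hypothetical extra allowed characters; the thread never shows what $al holds.
$al = '0-9';

// Pattern quoted in the thread: group 1 captures an allowed character,
// the unescaped dot matches any other single character (except newline).
$original = "/([a-zA-Z]|-|[$al])|./";

// Assumed alternative for comparison: delete runs of disallowed characters.
$negated = "/[^a-zA-Z{$al}-]+/";

// Time $runs calls of preg_replace() and return the elapsed seconds.
function time_replace(string $pattern, string $replacement, string $subject, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        preg_replace($pattern, $replacement, $subject);
    }
    return microtime(true) - $start;
}

$sample = "Tim O'Reilly-Smith 42!";
$viaOriginal = preg_replace($original, '$1', $sample);
$viaNegated  = preg_replace($negated, '', $sample);
echo $viaOriginal, "\n"; // TimOReilly-Smith42
echo $viaNegated, "\n";  // TimOReilly-Smith42

// Rough timing and memory figures; the numbers depend on machine and PHP build.
$subject = str_repeat($sample . ' ', 1000);
printf("original: %.4f s\n", time_replace($original, '$1', $subject, 200));
printf("negated:  %.4f s\n", time_replace($negated, '', $subject, 200));
printf("peak memory: %d bytes\n", memory_get_peak_usage());
```

The two patterns are not exact equivalents: without the `s` modifier the dot does not match a newline, so the quoted pattern leaves newlines in place while the negated class removes them.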
"Tim Roberts" <timr@probo.com> wrote in message news:i2g1m3983khduu7qghbp1vuitc8offjrj9@4ax.com...
> Jerry Stuckle <jstucklex@attglobal.net> wrote:
>>Tim Roberts wrote:
>>> "Lo'oris" <looris@gmail.com> wrote:
>>>
>>>> I'd like to have a set of "allowed characters", and strip a string
>>>> from everything besides those.
>>>> I've tried and tried but so far every time I enter strings containing
>>>> unicode, it goes mad and output makes no sense.
>>>
>>> How are you entering "strings containing unicode"? Browsers don't send
>>> Unicode.
>>
>>Excuse me? They sure can, depending on the language being used.
>
> Yes, I know better. That was not the sentiment I intended to convey.
>
>>So the rest of your post is immaterial. Steve's suggestion is a lot
>>closer.
>
> Damn you, Stuckle. How can you see anything at all from up there on your
> high horse?

with jerry, it's a matter of people in glass houses. except when you start throwing rocks at his, he will claim you have no rock and that, in fact, you've not broken any windows. :)

> Despite my faux pas, my suggestion was also correct, your invective
> notwithstanding.

good word, invective...he likes doing that apparently. at least we encounter it often in his posts.

cheers.
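On the original question quoted above (an allowed set plus input containing Unicode), one common approach is the `u` pattern modifier, which makes PCRE treat both pattern and subject as UTF-8 so that a multibyte character is matched as one character instead of byte by byte. A minimal sketch, assuming the input really is UTF-8; the allowed set, the sample string and the variable names are made up for illustration:

```php
<?php
// Hypothetical allowed set: ASCII letters, hyphen and a few accented vowels.
// "\u{...}" escapes (PHP 7+) keep the example independent of file encoding.
$allowedAccents = "\u{e0}\u{e8}\u{ec}\u{f2}\u{f9}"; // a, e, i, o, u with grave accent

// The trailing "u" modifier switches the pattern and subject to UTF-8 mode.
$pattern = "/[^a-zA-Z{$allowedAccents}-]+/u";

// Sample input with an allowed accented letter and a disallowed snowman.
$input = "Lo'oris \u{e8} qui \u{2603}!";

$clean = preg_replace($pattern, '', $input);

// With the "u" modifier, preg_replace() returns null for malformed UTF-8,
// so check for that before using the result.
if ($clean === null) {
    echo "input is not valid UTF-8\n";
} else {
    echo $clean, "\n"; // Looris, then e-grave, then qui, with nothing in between
}
```

Without the `u` modifier the accented characters in the class would be read as separate bytes, which is one plausible source of the garbled output described in the question.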
"Steve" <no.one@example.com> wrote in message news:HIb8j.33$ts4.16@newsfe07.lga...
>
> "Tim Roberts" <timr@probo.com> wrote in message
> news:dpf1m3hicifgnclafaplh3jtibdheacfvk@4ax.com...
>> "Steve" <no.one@example.com> wrote:
>>>
>>>>$name=preg_replace("/([a-zA-Z]|-|[$al])|./",'$1',$name);
>>>
>>>it's not expensive at all. and a dot is any single character...not a greedy
>>>wild card. the only reason he wouldn't want a dot is because it could be an
>>>'illegal' character that he's trying to get rid of anyway. as it is, he just
>>>didn't escape the dot so that it is the character (period) and not the
>>>directive (any single character).
>>
>> That statement as written will replace each character with itself, one by
>> one, repeatedly, for each character in $name.
>>
>> It is an expensive no-op.
>
> that is true, however each character is analyzed *as a single character*.
> there is no marker being set and a pattern being sought beyond that marker
> to see if there is another pattern match. markers are set, the replacement
> is made to those characters marked, the process is done. one of the least
> expensive operations one could ask of preg.
>
> may be a good idea to write a pattern you think would be less expensive that
> does similar things...see if you can time-test compare the two. you can
> also measure memory consumption too. i don't think you'll find any
> significant consumption of resources running the above, esp.
> comparatively.

sorry tim, i needed to make it clear - as i'd mentioned in one of my first responses to you - that i think the dot in his preg is just mistakenly not escaped. i don't think he means "any single character", rather, "a period". anyway, my comments above are made under that assumption. otherwise you are more right than before, though more along the lines of "that's dumb to put or'ed patterns when one of those will basically make the other conditions/patterns moot".

still, in this case, the expense is nominal since all conditions/patterns work over a single character. just thought i'd clarify.

cheers
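To make the two readings of the dot concrete, here is a small sketch that runs the quoted pattern once as written and once with the dot escaped. The value of `$al` and the sample input are assumptions for illustration only, since the thread does not show them.

```php
<?php
// Hypothetical extra allowed characters; not shown anywhere in the thread.
$al = '0-9';

// As posted: the bare dot matches any single character (except newline).
$anyChar = "/([a-zA-Z]|-|[$al])|./";

// With the dot escaped: the second alternative matches a literal period only.
$period = "/([a-zA-Z]|-|[$al])|\\./";

$input = "a.b c!";

// Everything that group 1 does not capture is replaced by an empty $1.
$stripAll = preg_replace($anyChar, '$1', $input);

// Only periods are removed; the space and "!" match neither alternative
// and are left untouched.
$stripDots = preg_replace($period, '$1', $input);

echo $stripAll, "\n";  // abc
echo $stripDots, "\n"; // ab c!
```

So under the "period" reading the statement only deletes dots, while under the "any single character" reading it strips everything outside the allowed set.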