Regexp question

This is a discussion on Regexp question within the PHP Language forums, part of the PHP Programming Forums category.
#1
08-16-2004
Sandman
Regexp question

Textblock (indented for clarity):

    Hello, my nickname is Sandman, here is
    a list of things I like:

    <ol>
    <li> Perl
    <li> PHP
    <li> MySQL
    </ol>

I want to replace newlines with "<br />\n" in this textblock, for displaying it
to a browser - but I don't want to replace it inside some specified html-tags -
in this case "ol", but also "ul", "pre" and "xmp".

So, the resulting text block should be:

    Hello, my nickname is Sandman, here is<br />
    a list of things I like:<br />
    <br />
    <ol>
    <li> Perl
    <li> PHP
    <li> MySQL
    </ol>

Is there a fancy regexp for this? I know how to exclude certain characters with
the [^...] 'switch', but what about whole blocks of text? In pseudo-code:
[^<(ol|ul|pre|xmp)>].
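
A note on the pseudo-code: a bracket expression always matches exactly one character, so it cannot stand for "not inside these tags". The following is only a minimal sketch of what such a class actually matches, with a negative lookahead shown as one common way to exclude whole words; the variable names are made up for illustration.

```php
<?php
// Inside [...] the characters ( | ) are literal, so this class means
// "one character that is none of < ( o l | u p r e x m ) >".
$class = '/[^<(ol|ul|pre|xmp)>]/';

$matchesO       = preg_match($class, 'o');   // 0: 'o' is in the excluded set
$matchesA       = preg_match($class, 'a');   // 1: 'a' is not in the set
$matchesNewline = preg_match($class, "\n");  // 1: a newline is not in the set either

// A negative lookahead can express "a '<' not followed by one of these names".
$lookahead = '/<(?!(?:ol|ul|pre|xmp)\b)/';

$matchesOl = preg_match($lookahead, '<ol>'); // 0: the lookahead rejects "<ol"
$matchesB  = preg_match($lookahead, '<b>');  // 1: "<b" is not in the list

var_dump($matchesO, $matchesA, $matchesNewline, $matchesOl, $matchesB);
```

The lookahead only tests what follows a single position, so on its own it does not answer "am I inside an <ol> block"; it is shown here just to contrast with the character class.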

--
Sandman[.net]

#2
08-17-2004
Sandman
Solved! (was: Re: Regexp question)

In article <mr-E6670E.12155716082004@individual.net>, Sandman <mr@sandman.net>
wrote:

> Textblock (indented for clarity):
>
>     Hello, my nickname is Sandman, here is
>     a list of things I like:
>
>     <ol>
>     <li> Perl
>     <li> PHP
>     <li> MySQL
>     </ol>
>
> I want to replace newlines with "<br />\n" in this textblock, for displaying
> it to a browser - but I don't want to replace it inside some specified
> html-tags - in this case "ol", but also "ul", "pre" and "xmp".
>
> So, the resulting text block should be:
>
>     Hello, my nickname is Sandman, here is<br />
>     a list of things I like:<br />
>     <br />
>     <ol>
>     <li> Perl
>     <li> PHP
>     <li> MySQL
>     </ol>
>
> Is there a fancy regexp for this? I know how to exclude certain characters
> with the [^...] 'switch', but what about whole blocks of text? In pseudo-code:
> [^<(ol|ul|pre|xmp)>].


I have now solved this, and this is the solution (posted for people searching
for this topic on Google):

-----------------------------------------
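#!/usr/bin/php
<?php
$text="Hello, my nickname is Sandman, here is
a list of things I like:

<ol>
<li> Perl
<li> PHP
<li> MySQL
</ol>

Do you like it?

I do";

// First alternative: a whole <ol>/<ul>/<pre>/<xmp> block, returned unchanged.
// Second alternative: a newline outside such a block, which gets a <br />.
// $m[1] is only set when the first alternative matched.
print preg_replace_callback(
"!(<(ol|ul|pre|xmp)[^>]*>.*?</\\2>\n?)|\n!is",
function ($m) { return !empty($m[1]) ? $m[1] : "<br />\n"; },
$text
);
?>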
-----------------------------------------
Output:
-----------------------------------------
Hello, my nickname is Sandman, here is<br />
a list of things I like:<br />
<br />
<ol>
<li> Perl
<li> PHP
<li> MySQL
</ol>
<br />
Do you like it?<br />
<br />
I do
-----------------------------------------
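
Why this works: at each position the regex engine tries the alternatives from left to right, so a whole protected block is consumed as a single match and handed back untouched, and only newlines outside such blocks reach the second alternative. Below is a minimal sketch of the same idea wrapped in a reusable function, assuming PHP 7 or later. The function name `nl2br_outside` and its `$tags` parameter are hypothetical and not part of the post above; the added `\b` is an extra guard so that "<pre" does not also match a tag such as "<prefix>".

```php
<?php
/**
 * Hypothetical helper: put "<br />" before every newline, except inside
 * the listed container tags, whose content is left as it is.
 */
function nl2br_outside(string $text, array $tags = ['ol', 'ul', 'pre', 'xmp']): string
{
    // Escape each tag name so it is safe to embed in the pattern.
    $names = implode('|', array_map(
        function ($tag) { return preg_quote($tag, '!'); },
        $tags
    ));

    // Group 1: a complete protected block plus an optional trailing newline.
    // Otherwise: a single newline outside any protected block.
    $pattern = '!(<(' . $names . ')\b[^>]*>.*?</\2>\n?)|\n!is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] is only present when the first alternative matched.
            return isset($m[1]) && $m[1] !== '' ? $m[1] : "<br />\n";
        },
        $text
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result === null ? $text : $result;
}

echo nl2br_outside("one\n<pre>a\nb</pre>\ntwo\nthree");
```

For that sample input, the newlines after "one" and "two" get a `<br />`, while the newline inside `<pre>` and the one directly after `</pre>` are left alone.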

--
Sandman[.net]

#3
08-17-2004
Sandman
Re: Solved! (was: Re: Regexp question)

In article <pan.2004.08.17.12.17.43.610000@bubbleboy.digiserv.net>,
"Ian.H" <ian@WINDOZEdigiserv.net> wrote:

> On Tue, 17 Aug 2004 13:01:05 +0200, Sandman wrote:
>
> [ snip ]
>
> > <ol>
> > <li> Perl   ^^^
> > <li> PHP    ^^^
> > <li> MySQL  ^^^
> > </ol>
> > <br />
> > Do you like it?<br />
> > <br />
> > I do
>
> Would be better to produce valid code would it not? Some missing tags ;)


That's the input text; I don't touch it in my script. It could be anything
coming from anyone. I just make sure I don't add any <br /> inside the <ol>
container.
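
One consequence of passing the input through untouched: the pattern only protects a block when it finds the matching closing tag. Here is a small sketch of what happens with an unclosed `<ol>`, using the same pattern as the posted solution. The sample inputs and variable names are invented for illustration.

```php
<?php
$pattern = '!(<(ol|ul|pre|xmp)[^>]*>.*?</\2>\n?)|\n!is';

// Return a protected block unchanged, otherwise replace the newline.
$keepBlock = function (array $m): string {
    return isset($m[1]) && $m[1] !== '' ? $m[1] : "<br />\n";
};

// Closed list: the newlines inside <ol>...</ol> are left alone.
$closed = preg_replace_callback($pattern, $keepBlock, "<ol>\n<li> Perl\n</ol>\nend");

// Unclosed list: the first alternative can never match, so every newline
// falls through to the second alternative and gets a <br />.
$unclosed = preg_replace_callback($pattern, $keepBlock, "<ol>\n<li> Perl\nend");

echo $closed, "\n---\n", $unclosed, "\n";
```

Whether that fallback is acceptable depends on how much the input can be trusted to be well-formed.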

--
Sandman[.net]

#4
08-17-2004
Andy Hassall
Re: Solved! (was: Re: Regexp question)

On Tue, 17 Aug 2004 12:16:09 GMT, "Ian.H" <ian@WINDOZEdigiserv.net> wrote:

>On Tue, 17 Aug 2004 13:01:05 +0200, Sandman wrote:
>
>[ snip ]
>
>> <ol>
>> <li> Perl   ^^^
>> <li> PHP    ^^^
>> <li> MySQL  ^^^
>> </ol>
>> <br />
>> Do you like it?<br />
>> <br />
>> I do
>
>Would be better to produce valid code would it not? Some missing tags ;)


Well, that's perfectly valid HTML. In fact it's almost identical to the Lists
example in sec. 10.1 of the standard.

But adding <br /> does imply the OP is going for XHTML in which case yes, it's
wrong.
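
As a side note on the `<br />` versus `<br>` point: PHP's built-in `nl2br()` takes a second argument (available since PHP 5.3) that selects between the two forms. The sketch below only illustrates that switch. `nl2br()` converts every newline, including those inside lists, so it is not a replacement for the regex approach in this thread.

```php
<?php
$text = "line one\nline two";

// Default: XHTML-style break, the same form the script in this thread emits.
$xhtml = nl2br($text);

// Second argument false: HTML-style break without the trailing slash.
$html = nl2br($text, false);

echo $xhtml, "\n", $html, "\n";
```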

--
Andy Hassall / <andy@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool

#5
08-19-2004
Sandman
Re: Solved! (was: Re: Regexp question)

In article <ans4i0tk7g4uopk6oe96e5f67evlqdk5e3@4ax.com>,
Andy Hassall <andy@andyh.co.uk> wrote:

> On Tue, 17 Aug 2004 12:16:09 GMT, "Ian.H" <ian@WINDOZEdigiserv.net> wrote:
>
> >On Tue, 17 Aug 2004 13:01:05 +0200, Sandman wrote:
> >
> >[ snip ]
> >
> >> <ol>
> >> <li> Perl   ^^^
> >> <li> PHP    ^^^
> >> <li> MySQL  ^^^
> >> </ol>
> >> <br />
> >> Do you like it?<br />
> >> <br />
> >> I do
> >
> >Would be better to produce valid code would it not? Some missing tags ;)
>
> Well, that's perfectly valid HTML. In fact it's almost identical to the
> Lists example in sec. 10.1 of the standard.
>
> But adding <br /> does imply the OP is going for XHTML in which case yes,
> it's wrong.


The OP, being me, strives for correct transitional XHTML. But you have to
understand, as I noted earlier, that I am not the one controlling the input to
the script. I am merely preserving code within it, not altering it.

--
Sandman[.net]