
preg_replace() avoiding html tags in subject string. Clearly I’m a simpleton.

Hi folks, got a [b]very[/b] simple task I want to do here, but I must be a simpleton because for the life of me I can’t make it happen without making things massively overcomplicated.

All I want is to do a bog-standard preg_replace on a subject string without it affecting any of the actual HTML tags that might be in there. It should be able to affect the contents of tags (like the text inside a paragraph element) and text outside of an element, but not the tags themselves (or their attributes).

Any ideas folks? I’ve been trying for two days solid now and have run out of ideas. Oh and using the DOM API isn’t an option. Whilst it would be the ideal solution to such a simple task, PHP5 doesn’t have one yet (so much for OOP eh).

PHP

7 Comments

@ShrineDesigns Jun 05.2005 — Try:
[code=php]
<?php
$str = "<p><b>some text</b><br> and more text</p>";

echo preg_replace("/<[^<>]+>/", '', $str);
?>
[/code]
@Stephen_Philbin (author) Jun 05.2005 — No, what I'm after is avoiding alteration of the tags. I just want to do a preg_replace on the subject that avoids the tags and leaves them untouched. I'm starting to get the very strong impression that PHP5 can't do it without using far more CPU time than is sensible.
@NogDog Jun 05.2005 — Possibly (but my mind swims sometimes when I try to wrangle with some of this, so no guarantees) using a "look-behind" assertion looking for an opening "<" with any number of characters that are not ">" preceding the search text:
[code=php]
// Untested; $subject stands for the HTML string being searched.
preg_replace('/(?<!<[^>]*)'.$string.'/', $replacement, $subject);
[/code]
@Stephen_Philbin (author) Jun 05.2005 — I'm not sure about that pattern. Is a ! a "not" operator in regex? And what does that ? mean at the start of a parenthesised section?

I'm not sure what that pattern matches, but I think it still misses what I'm after. If it matches anything that isn't a tag, then it's going to replace the content of the tags entirely.

I'm pretty sure what I'm after would need two expressions: first, an expression to find the parts of the subject that are not tags (like the one above?), and then the actual expression I want to use for the find and replace, applied only to the sections the first expression identified as non-tag.


So say I had:

<rootel>
<p id="abc">
abc def <span class="ghi">ghi</span> jkl mno
<img src="def.png" alt="xyz def 7" />
abc pqr
</p>
</rootel>


and I wanted to run a preg_replace() to swap abc for 123, def for 456 and ghi for 789. I'd need the regex to identify areas that are not tags, then run the preg_replace() function to do the desired replacements on the areas identified by the first regexp to not be actual tags.
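A rough sketch of that two-pass idea, assuming no attribute value contains a literal ">": split the subject into tag and non-tag pieces with preg_split() and the PREG_SPLIT_DELIM_CAPTURE flag, then run preg_replace() on the non-tag pieces only. The helper name replace_outside_tags() is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper: applies preg_replace() only to text outside of tags.
// Assumes no attribute value contains a literal ">".
function replace_outside_tags($patterns, $replacements, $html)
{
    // With PREG_SPLIT_DELIM_CAPTURE the captured tags land at odd indexes
    // and the text between them at even indexes.
    $parts = preg_split('/(<[^<>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = preg_replace($patterns, $replacements, $part);
        }
    }
    return implode('', $parts);
}

$html = '<p id="abc">abc def <span class="ghi">ghi</span> jkl</p>';
echo replace_outside_tags(
    array('/abc/', '/def/', '/ghi/'),
    array('123', '456', '789'),
    $html
);
// prints: <p id="abc">123 456 <span class="ghi">789</span> jkl</p>
```

The cost is one extra pass over the string to split it, after which each replacement only ever sees plain text.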
@NogDog Jun 05.2005 — [b](?<=[/b] is a positive look-behind assertion while [b](?<![/b] is a negative look-behind assertion. (An assertion is used to help determine what matches, but is not included within the actual match.) However, testing indicates that look-behind assertions must be of fixed length, so we need to use a look-ahead assertion instead ([b](?=[/b] for positive and [b](?![/b] for negative). I just ran this test, and it worked fine:
[code=php]
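<?php
$test = "<p class=test>This is a test. It is <span id='test'>only a test</span>.</p>";
$search = "test";
$replace = "experiment";
// The negative look-ahead (?![^<]*>) rejects a match when a ">" follows
// with no "<" in between, i.e. when the match sits inside a tag.
// preg_quote() treats $search as literal text; drop it if $search is itself a regex.
$result = preg_replace('/'.preg_quote($search, '/').'(?![^<]*>)/', $replace, $test);
// Escape the markup so the result is shown as text in the browser.
echo htmlentities($result);
?>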
[/code]
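Applied to the abc/def/ghi example from earlier in the thread, the same look-ahead might be used as sketched below. The sample string is a condensed version of that example, and the $map array is an illustrative name. The pattern assumes every "<" in the input opens a tag and that no attribute value contains ">".

```php
<?php
// Sample input condensed from the example earlier in the thread.
$html = '<p id="abc">abc def <span class="ghi">ghi</span> jkl mno'
      . '<img src="def.png" alt="xyz def 7" /> abc pqr</p>';

// Word => replacement pairs (illustrative).
$map = array('abc' => '123', 'def' => '456', 'ghi' => '789');

$patterns = array();
foreach (array_keys($map) as $word) {
    // Skip a match when a ">" follows with no "<" in between,
    // because that means the match is inside a tag.
    $patterns[] = '/' . preg_quote($word, '/') . '(?![^<]*>)/';
}

$result = preg_replace($patterns, array_values($map), $html);
echo $result;
// prints: <p id="abc">123 456 <span class="ghi">789</span> jkl mno<img src="def.png" alt="xyz def 7" /> 123 pqr</p>
```

A stray ">" in ordinary text would cause matches before it to be skipped, so this is best suited to input where "<" and ">" in text are already written as entities.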
@Stephen_Philbin (author) Jun 06.2005 — Nog..... You are without doubt the greatest dog ever to bless this planet with your presence. Thank you very much.
@NogDog Jun 06.2005 — [QUOTE]Nog..... You are without doubt the greatest dog ever to bless this planet with your presence. Thankyou very much.[/QUOTE]
You're welcome. I actually sort of enjoyed that problem, once I got past the aggravation of not understanding the documentation the first 2 or 3 times I read it. :rolleyes: It felt good when my test script actually worked.