
Splitting a string by character while skipping HTML tags

I need some help and advice with something.

I am trying to do the following:

A user enters some text into a textarea. The text is a few paragraphs long and may contain HTML tags.

When they submit it, I want PHP to take all the non-HTML portions of the text, split them into individual characters, and insert something between each character, without doing this to any of the HTML tags.

[QUOTE]

so a sentance “some random sentance” would be split and added such as “s-o-m-e- -r-a-n-d-o-m- -s-e-n-t-a-n-c-e-“

[/QUOTE]

But if it contained HTML tags, it should be like this:

[QUOTE]

“some <br> random <br> <a href="somewhere.html">sentance</a>” would be split and added such as “s-o-m-e- <br> -r-a-n-d-o-m- <br> <a href="somewhere.html">-s-e-n-t-a-n-c-e-</a>”

[/QUOTE]

However, the script I'm currently using mangles the HTML tags like this:

[QUOTE]

“some <br> random <br> <a href="somewhere.html">sentance</a>” would be split and added such as “s-o-m-e- -<-b-r->- -r-a-n-d-o-m- -<-b-r->- -<-a- -h-r-e-f-=-"-s-o-m-e-w-h-e-r-e-.-h-t-m-l-"->-s-e-n-t-a-n-c-e-<-/-a->-”

[/QUOTE]

I am currently using the following PHP 5 code, which works fine on plain text but has no way of skipping HTML tags. It generates a textarea containing a modified version of the input text with a - between every character, but I need it to skip characters that are part of an HTML tag and keep the tags fully intact and working as they should.

[code=php]if ($_POST['enc'] == 1) {
    $text = $_POST['text'];
    $encode1 = utf8_encode($text); // first encode of the supplied string
    $l = strlen($encode1);
    $a = '';
    for ($i = 0; $i < $l; $i++) {
        $s = substr($encode1, $i, 1); // get character

        $plusone = $s;
        $a .= $plusone . '-';
    }

    $encode2 = utf8_encode($a); // encode the built string a second time
    echo '<center><textarea cols="80" rows="10">' . $encode2 . '</textarea></center>';
}[/code]

Any ideas or suggestions on where to start or what to look into or sample code etc would be most appreciated

Thanks in advance

@Znupi Dec 10.2007 — Here's a way to do it:
[code=php]

$result = preg_replace("%([^>])(?=[^>]*<)%", "$1-", $text);
echo $result;

[/code]

Hope that helps.

@BWWebDesigns (author) Dec 10.2007 — Looks good.

One more question.

Say I wanted to do that, but instead of separating with a - I wanted to use a randomly generated character or multiple characters.

I have a script that can do the random part; I just wasn't sure how to get your code to insert the random character instead of the -, and to make each one random, so that it is not the same random character between each character in the text but a different one generated each time.

@Znupi Dec 10.2007 — Actually, here's a better way:
[code=php]
$result = preg_replace("%(?<=[^<])([^<>])(?=[^>])%", "$1-", $text);
echo $result;
[/code]

The first one worked only if $text was enclosed between tags (it started with an opening-tag and ended with a closing-tag). This one works anyway.

Good luck.

[b]Edit[/b]: Sorry, I didn't see your post when I posted this one. To change the dash, just change the second parameter passed to preg_replace with "$1[b]someCharacter(s)[/b]", so all characters will be followed by [b]someCharacter(s)[/b].
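
For reference, a minimal sketch of an alternative that avoids lookarounds: split the input on tags with preg_split() and touch only the text pieces in between. The function name separate_chars() and the simple <[^>]*> tag pattern are assumptions made for illustration, not something posted in this thread; the pattern does not cope with a literal > inside an attribute value.
[code=php]
<?php
// Hypothetical helper: append $sep to every character that is outside an HTML tag.
function separate_chars($html, $sep = '-')
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes are text, odd indexes are the captured tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            $out .= $part; // a tag: copy unchanged
            continue;
        }
        // Text: split into UTF-8 characters, fall back to bytes if the input is not valid UTF-8.
        $chars = preg_split('//u', $part, -1, PREG_SPLIT_NO_EMPTY);
        if ($chars === false) {
            $chars = str_split($part);
        }
        foreach ($chars as $ch) {
            $out .= $ch . $sep;
        }
    }
    return $out;
}

echo separate_chars('some <br> random <a href="somewhere.html">sentance</a>');
// s-o-m-e- -<br> -r-a-n-d-o-m- -<a href="somewhere.html">s-e-n-t-a-n-c-e-</a>
[/code]

Because the tags are never passed through the character loop, their attributes stay intact regardless of how long the tag is.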

@andre4s_y Dec 10.2007 — Maybe this code is bigger, but maybe it can help you:
[code=php]
<?php
// Wrap every word character in dashes, then collapse doubled dashes.
function add_minus($t)
{
    $t = preg_replace("/(\w)/", "-$1-", $t);
    $t = preg_replace("/--/", "-", $t);
    return $t;
}
$text = "some <br> random <br> <a href=\"somewhere.html\" onclick=\"alert('test')\">sentance</a>";
$temp = explode(" ", $text);
for ($i = 0; $i < count($temp); $i++)
{
    if (preg_match("/<?(.*)>/", $temp[$i]))
    {
        // Chunk contains the end of a tag: only touch the text between > and <.
        if (preg_match("/>(.*)</", $temp[$i], $between))
        {
            $between = add_minus($between[1]);
            $temp[$i] = preg_replace("/>(.*)</", ">$between<", $temp[$i]);
        }
    }
    elseif (preg_match("/<(.*)>?/", $temp[$i]))
    {} // start of a tag: leave as is
    elseif (preg_match("/.+=\".*\"/", $temp[$i]))
    {} // attribute="value": leave as is
    else
    {
        $temp[$i] = add_minus($temp[$i]);
    }
}
$result = implode(" ", $temp);
echo htmlentities($result);
?>
[/code]

input :
[QUOTE]some <br> random <br> <a href="somewhere.html" onclick="alert('test')">sentance</a>[/QUOTE]
output :
[QUOTE]-s-o-m-e- <br> -r-a-n-d-o-m- <br> <a href="somewhere.html" onclick="alert('test')">-s-e-n-t-a-n-c-e-</a>[/QUOTE]

@BWWebDesigns (author) Dec 10.2007 — andre4s_y,

With your code I get the following error:

[QUOTE]Warning: preg_replace() [function.preg-replace]: Compilation failed: lookbehind assertion is not fixed length at offset 8[/QUOTE]

for this line of code:

[CODE]$result = preg_replace("%(?<=[^<])([^<>])(?=[^>])%", "$1-", $text);[/CODE]

@andre4s_y Dec 10.2007 — Sorry, I have edited it.

I was trying Znupi's code before.

@BWWebDesigns (author) Dec 10.2007 — OK, yours is working now, thanks. I've figured out that if the bit added between each character is some unique reference such as <<Random>>, I should somehow be able to go through and replace it with randomly generated characters. So suppose I take the final result from your script and it contains that unique reference between each character that isn't part of an HTML tag.

How would I go through and replace each unique tag with a new randomly generated character?

I would need to somehow get the result into an array that I could loop through, replacing each unique tag with a newly generated random character until all the tags had been replaced.

But I'm unsure how I would add that after your code.

@Znupi Dec 10.2007 — Here you go:
[code=php]
$x = preg_replace("%(?<=[^<])([^<>])(?=[^>])%", "$1" . chr(0), $text);

$a = "=-*@^$#!()";
while (strpos($x, chr(0))!==false) {
$p = strpos($x, chr(0));
$x[$p] = $a[mt_rand(0, strlen($a)-1)];
}

echo $x;
[/code]

Where $a is a string containing all the characters you want randomly placed between characters in $text.

Have fun.
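
The placeholder step can also be skipped by generating the separator inside a callback. What follows is only a sketch under assumptions: separate_random() and $pick are hypothetical names, the input is assumed to be valid UTF-8, and text containing a bare > outside a tag is not handled.
[code=php]
<?php
// Hypothetical helper: after each character outside a tag, append whatever
// the callable $pick returns (it is called once per character).
function separate_random($html, $pick)
{
    // [^<>] matches one character that is not an angle bracket. The negative
    // lookahead rejects it when a '>' comes before the next '<', which is
    // the case for characters sitting inside a tag.
    return preg_replace_callback(
        '/[^<>](?![^<>]*>)/u',
        function ($m) use ($pick) {
            return $m[0] . $pick();
        },
        $html
    );
}

$pool = '=-*@^$#!()';
$pick = function () use ($pool) {
    return $pool[mt_rand(0, strlen($pool) - 1)];
};
echo separate_random('<p>He<b>ll</b>o!</p>', $pick);
[/code]

Passing the picker as a callable keeps the regex part testable: a picker that always returns '-' turns the example input into <p>H-e-<b>l-l-</b>o-!-</p>.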

@BWWebDesigns (author) Dec 10.2007 — Znupi,

I tried your code and I get the error mentioned above.

@andre4s_y Dec 10.2007 — BWWebDesigns,

actually I do not fully understand what your purpose is. :p

But this code may not be far from it:
[code=php]
<?php
function add_tag($t)
{ //add unique tag
    $t = preg_replace("/(.)/", "[r]$1[r]", $t);
    $t = preg_replace("/(\[r\]){2}/", "[r]", $t);
    return $t;
}
$text = "some <br> random <br> <a href=\"somewhere.html\" onclick=\"alert('test')\">sentance</a>";
$temp = explode(" ", $text);
for ($i = 0; $i < count($temp); $i++)
{
    if (preg_match("/<?\w(.*)>/", $temp[$i]))
    {
        if (preg_match("/>(.*)</", $temp[$i], $between))
        {
            $between = add_tag($between[1]);
            $temp[$i] = preg_replace("/>(.*)</", ">$between<", $temp[$i]);
        }
        elseif (preg_match("/[<|>]{2}/", $temp[$i], $between))
        {
            $temp[$i] = add_tag($temp[$i]);
        }
    }
    elseif (preg_match("/<(.*)>?/", $temp[$i]))
    {} // start of a tag: leave as is
    elseif (preg_match("/.+=\".*\"/", $temp[$i]))
    {} // attribute="value": leave as is
    else
    {
        $temp[$i] = add_tag($temp[$i]);
    }
}
$result = implode(" ", $temp);
echo htmlentities($result) . "<br />";
// Replace each tag with a random char. The /e modifier was removed in PHP 7,
// so a callback does the same job here.
$result = preg_replace_callback("/\[r\]/", function ($m) { return chr(rand(33, 64)); }, $result);
echo htmlentities($result);
?>
[/code]

The output on my computer is:
[QUOTE]

[r]s[r]o[r]m[r]e[r] <br> [r]r[r]a[r]n[r]d[r]o[r]m[r] <br> <a href="somewhere.html" onclick="alert('test')">[r]s[r]e[r]n[r]t[r]a[r]n[r]c[r]e[r]</a>

0s<o"m'e@ <br> .r@a%n"d0o?m, <br> <a href="somewhere.html" onclick="alert('test')">6s/e9n4t%a*n#c@e=</a>
[/QUOTE]

Hope that helps.

@Znupi Dec 11.2007 — That is very strange, since it works perfectly for me. Try copy-pasting again.

Here's what it outputs for "<p>Heeee<b>eeell</b>oooo!</p>" on my machine:

<p>H!e#e(e@e^<b>e-e=e@l-l(</b>o=o-o$o(!#</p>

I'm sorry, but I can't help you any further since this works on my PC.

@bharanikumarphp Jul 04.2008 — Hi, I have used your code for putting a special character between two characters. It's very useful.

But can you tell me where you are counting the characters? In which place?

Your code puts the special character after every single character:

u-h-h-f-r-h

but I am expecting it between words, like this:

ganesh-srinivasan-john

Can you give me an idea?

@andre4s_y Jul 04.2008 — Hi bharanikumarphp,

I think we (Znupi and I) are not counting the characters.

We are using a regular expression (regex), which is pattern matching: if the pattern matches, replace the match with something.

In your case, if I am not misunderstanding your problem, you want to separate each word with a separator.

I think this is easier than the original problem.

Forget the code above; this is simpler:
[code=php]
<?php
$text_ori = "ganesh srinivasan john";
//first way
$text = explode(" ",$text_ori);
$text = implode(chr(45),$text);
echo $text;
echo "<br />";
//second way
$text = preg_replace("/\b(\s+|&nbsp;)\b/",chr(45),$text_ori);
echo $text;
?>
[/code]

The output is:
[quote]
ganesh-srinivasan-john

ganesh-srinivasan-john
[/quote]

This assumes the original word separator is a space (" "), which in HTML may be &nbsp;.

The first way: use explode() and implode() to split the words into an array and join them again with the separator. Its weakness is that it does not handle a double (triple, and so on) space or the HTML entity (&nbsp;), or input that is just spaces.

The second way does better. It uses a regex.

If you want to replace the separator with a random character, replace 45 with a random function that picks a number in a range of ASCII codes, for example rand(33,64).

Hope this helps you.

We do more than just count.
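
A short sketch of the random-separator variant, assuming a different character is wanted for each gap. join_words() and $pick are hypothetical names; preg_replace_callback() is used because a plain replacement string would be evaluated only once.
[code=php]
<?php
// Hypothetical helper: replace every gap between words with the result of $pick().
function join_words($text, $pick)
{
    // Any run of whitespace and/or &nbsp; entities counts as a single gap.
    return preg_replace_callback(
        '/(?:\s|&nbsp;)+/',
        function ($m) use ($pick) {
            return $pick();
        },
        trim($text)
    );
}

// Fixed separator, equivalent to chr(45):
echo join_words('ganesh  srinivasan&nbsp;john', function () { return chr(45); }), PHP_EOL;
// ganesh-srinivasan-john

// Random separator, chosen again for every gap:
echo join_words('ganesh srinivasan john', function () { return chr(mt_rand(33, 64)); }), PHP_EOL;
[/code]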

@bharanikumarphp Jul 04.2008 — Below is my content.

1. I want to put a <--break--> after 1075 characters have been reached.

2. It must never break an HTML tag, for example <<--break-->table>.

3. If the position falls inside an HTML element, put the <--break--> after the element has closed, for example <table><tr><td>this is test</td></tr></table><--break-->


[code=php]
<I>Editor's note: Laser-deposition
welding is an important alternative to more conventional mold-repair techniques. It
is beginning to find acceptance in U.S. mold shops, having previously gained a
foothold in Europe. <!--author_start-->Richard Hendel<!--author_end-->,
product manager for Rofin-Sinar, describes what laser-deposition welding is and
how it works.</I> <br> <br><TABLE ALIGN="right" BORDER=0
CELLSPACING=0 CELLPADDING=10 width="299"> <TR> <TD><img
src="/images/2002/March/Tooling_RofinSW-Performance.jpg" width="279"
height="297"></TD> </TR> <TR> <TD><font size="1" face="Arial, Helvetica,
sans-serif"><!--abstract_start-->The StarWeld laser welding machine is used to
spot and seam weld high-grade steel alloys, copper, gold, silver, platinum, and
titanium, in a variety of combinations. Output power ranges from 20 to
500W.<!--abstract_end--> </font></TD> </TR> </TABLE>Laser deposition
welding technology is beginning to find its place in modification and repair of
molds. A typical application would be the repair of an injection mold constructed of
cold work steel, which is subject to heavy wear on the edges caused by
processing of glass-fiber-reinforced material. The chipped or rounded edge areas
can be laser-deposit welded to fill cracks, using a wire diameter of .4 mm. After
repair, the insert and the mold have a service life at least equal to that of the
original compon


[/code]

@andre4s_y Jul 04.2008 — I think this code can solve your problem:
[code=php]
//$text_ori = place your text here.
$text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
//split the text based on html tag and entity, it will produce array.
$counter = 1075; //number of character
$random = "[random]";//separator => in your case : <--break--> => in html entities &lt;--break--&gt;
for($i = 0; $i < count($text); $i++)
{//looping in array
if(!preg_match("/^</",$text[$i]))
{//array value is not html tag, only non html tag are counted
if(preg_match("/^&.+;$/",$text[$i]))
{// array value is html entity
//consider html entity as 1 character, change later
$counter--;
if($counter==0)
{
$text[$i] .= $random;
break;
}
}
else
{//array value is not contain html tag and entity.
$length = strlen($text[$i]);
$counter = $counter - $length;
if($counter<=0)
{
$length = $length + $counter;
$text[$i] = preg_replace("/^.{".$length."}/s","$0".$random,$text[$i]);
break;
}
}
}
}
$text = implode($text); //join array
echo htmlentities($text); //output it.. voila..
[/code]

In your case, I do count.

Hope this helps.

@bharanikumarphp Jul 05.2008 — In the output the <--break--> is not added. Can you check your code, please?

Thanks,
Regards,
B.S.Bharani Kumar

@andre4s_y Jul 05.2008 — Just change:

$random = "[random]";

to

$random = "&lt;--break--&gt;";

or

$random = "<--break-->";

@bharanikumarphp Jul 05.2008 — See the output I got. Here the <--break--> is inserted inside the <p> element:

<p> Lorem Ipsum has [COLOR="Red"]<--break-->[/COLOR]been the industry's standard dummy text</p>

In this situation I want it like this instead:

<p> Lorem Ipsum has been the industry's standard dummy text</p> [COLOR="Red"]<--break-->[/COLOR]

[code=php]
Lorem Ipsum is simply dummy text of the printing and typesetting industry.<p> Lorem Ipsum has <--break-->been the industry's standard dummy text</p> ever since the 1500s, when an unknown printer took a galley of type and scrambled <table><tr><td>it to make a type specimen book.</td></ttr></table> It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b> and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
[/code]

@bharanikumarphp Jul 05.2008 — What happened?

@bharanikumarphp Jul 05.2008 — Hi, I am waiting for your reply.

@andre4s_y Jul 05.2008 — First of all: why do you want to do that?

I think an XML parser function is more suitable for this case.

Anyway, if you want to use the same method as before, this is the patch:
[code=php]
<?php
$text_ori = 'Lorem Ipsum is simply dummy <img>text of the printing and typesetting industry.<p> Lorem Ipsum has been the industry\'s standard dummy text</p> ever since the 1500s, when an unknown printer took a galley of type and scrambled <table><tr><td>it to make a type specimen book.</td></ttr></table> It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b> and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.';
$text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
//split the text based on html tag, it will produce array.
$counter = 215; //number of character
$random = "[random]";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
$add_random = false; //flag to add randomness
$open_tag = 0; //variable to store how many opentag
//html tag is not counted, html entities is counted as 1 character, comment is not counted
for($i = 0; $i < count($text); $i++)
{//looping in array
if(!preg_match("/^</sU",$text[$i]))
{//array value is not html tag, only non html tag are counted
if(preg_match("/^&.+;$/",$text[$i]))
{// array value is html entity
//consider html entity as 1 character, change later
$counter--;
if($counter==0)
{
if(!$open_tag)
{
$text[$i] .= $random;
}
else
{
$add_random = true;
}
}
}
else
{//array value is not contain html tag and entity.
$length = strlen($text[$i]);
$counter = $counter - $length;
if($counter<=0)
{
$length = $length + $counter;
if(!$open_tag)
{
$text[$i] = preg_replace("/^.{".$length."}/s","$0".$random,$text[$i]);
}
else
{
$add_random = true;
}
}
}
}
elseif(preg_match("/^<.*\/>$/U",$text[$i]))
{//self closing tag.
//do nothing
}
elseif(preg_match("/^<\/\w+>/",$text[$i]))
{//closing tag
//turn off toggle
$open_tag--;
if($add_random&&!$open_tag)
{
$add_random = false;
$text[$i] .= $random;
}
}
else
{//opening tag
$open_tag++;
}
}
if($add_random)
{
echo "Cannot find closing tag. not valid xhtml or not complete ones.";
}
else
{
$text = implode($text); //join array
echo htmlentities($text); //output it
}
?>
[/code]

Please remember:

Characters inside HTML tags are not counted, an HTML entity counts as 1 character, and comments are not counted. It works best with XHTML, because it detects self-closing tags.
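
The counting rules above can be sketched as one small function. This is not the patch posted in this thread: insert_breaks() is a hypothetical helper, the list of void elements is an assumption, and a break that falls inside an open element is deferred and written once after the element closes.
[code=php]
<?php
// Hypothetical helper: insert $marker after every $every visible characters,
// never inside a tag and never inside an open element.
function insert_breaks($html, $every, $marker)
{
    $void = array('br', 'hr', 'img', 'input', 'meta', 'link'); // assumed list
    // Even indexes of $tokens are plain text, odd indexes are tags or entities.
    $tokens = preg_split('/(<[^>]*>|&#?\w+;)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $out = '';
    $count = 0;       // visible characters since the last counter reset
    $depth = 0;       // number of currently open elements
    $pending = false; // a marker is owed once $depth is back to 0

    foreach ($tokens as $i => $tok) {
        if ($i % 2 === 1 && $tok[0] === '<') {
            $out .= $tok; // tags and comments are copied and not counted
            if (preg_match('/^<(\/?)(\w+)[^>]*?(\/?)>$/', $tok, $m)
                && $m[3] === '' && !in_array(strtolower($m[2]), $void)) {
                $depth += ($m[1] === '/') ? -1 : 1;
                if ($depth < 0) {
                    $depth = 0; // ignore stray closing tags
                }
            }
            if ($depth === 0 && $pending) {
                $out .= $marker;
                $pending = false;
            }
            continue;
        }
        // An entity counts as one character; text is split into characters.
        $chars = ($i % 2 === 1) ? array($tok) : preg_split('//u', $tok, -1, PREG_SPLIT_NO_EMPTY);
        if ($chars === false) {
            $chars = str_split($tok); // input was not valid UTF-8
        }
        foreach ($chars as $ch) {
            $out .= $ch;
            if (++$count >= $every) {
                $count = 0;
                if ($depth === 0) {
                    $out .= $marker;
                } else {
                    $pending = true;
                }
            }
        }
    }
    return $out;
}

echo htmlentities(insert_breaks('Lorem <p>Ipsum has been</p> the standard', 10, '<--break-->'));
[/code]

Using a single $pending flag instead of a counter is a design choice: several owed breaks inside one element collapse into one marker after its closing tag.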

@bharanikumarphp Jul 05.2008 — Thanks a lot.

I changed

[COLOR="Red"]$counter = 100;[/COLOR] //number of character

to

[COLOR="Red"]$counter = 105;[/COLOR]

but I see no change. May I know the reason?

One more thing: only one break is added. I want the breaks to repeat, one after every 105 characters, like this:

[code=php]
Lorem Ipsum is simply dummy text of the printing and typesetting industry.<--break-->
<p> Lorem Ipsum has been the industry's standard dummy text</p><--break-->
ever since the 1500s, when an unknown printer took a galley of type and scrambled <--break-->
<table><tr><td>it to make a type specimen book.</td></ttr></table><--break-->

It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.<--break-->
<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b><--break-->
and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.<--break-->

[/code]


That is the output I am expecting.

Thanks a lot.

@andre4s_y Jul 05.2008 — I have edited it, but I think there is still something wrong. Wait.

@andre4s_y Jul 05.2008 — Maybe this does better, but it is long:
[code=php]
<?php
$text_ori = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.
<p> Lorem Ipsum has been the industry\'s standard dummy text</p>
ever since the 1500s, when an unknown printer took a galley of type and scrambled

<table><tr><td>it to make a type specimen book.</td></ttr></table>

It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>
and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.';
$text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
//split the text based on html tag, it will produce array.
$counter = 10; //number of character
$random = "<--break-->";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
$add_random = 0; //variable to store how many add randomness
$open_tag = 0; //variable to store how many opentag
//html tag is not counted, html entities is counted as 1 character, comment is not counted

for($i = 0; $i < count($text); $i++)
{//looping in array
if(!preg_match("/^</sU",$text[$i]))
{//array value is not html tag, only non html tag are counted
if(preg_match("/^&.+;$/",$text[$i]))
{// array value is html entity
//consider html entity as 1 character, change later
$counter--;
if($counter==0)
{
if(!$open_tag)
{
$text[$i] .= $random;
}
else
{
$add_random++;
}
}
}
else
{//array value is not contain html tag and entity.
$length = strlen($text[$i]);
$counter_temp = $counter - $length;
if($counter_temp<=0)
{
$how_much = floor($length / $counter);
if(!$open_tag)
{
$length_temp = $length + $counter_temp;
for ($j = $how_much; $j >= 0; $j--)
{
$temporary = $counter * $j;
$test = $length_temp + $temporary;
$text[$i] = preg_replace("/^.{".$test."}/s","$0".$random,$text[$i]);
}
}
else
{
$add_random += $how_much;
}
}
}
}
elseif(preg_match("/^<.*\/>$/U",$text[$i]))
{//self closing tag.
//do nothing
}
elseif(preg_match("/^<\/\w+>/",$text[$i]))
{//closing tag
//turn off toggle
$open_tag--;
if($add_random&&!$open_tag)
{
do
{
$add_random--;
$text[$i] .= $random;
} while ($add_random!=0);
}
}
else
{//opening tag
$open_tag++;
}
}
if($add_random)
{
echo "Cannot find closing tag. not valid xhtml or not complete ones.";
}
else
{
$text = implode($text); //join array
echo htmlentities($text); //output it
}
?>
[/code]

It will output:
[quote]

Lorem Ipsu<--break-->m is simpl<--break-->y dummy te<--break-->xt of the <--break-->printing a<--break-->nd typeset<--break-->ting indus<--break-->try. <p> Lorem Ipsum has been the industry's standard dummy text</p><--break--><--break--><--break--><--break--><--break--> ever si<--break-->nce the 15<--break-->00s, when <--break-->an unknown<--break--> printer t<--break-->ook a gall<--break-->ey of type<--break--> and scram<--break-->bled <table><tr><td>it to make a type specimen book.</td></ttr></table><--break--><--break--><--break--> It ha<--break-->s survived<--break--> not only <--break-->five centu<--break-->ries, but <--break-->also the l<--break-->eap into e<--break-->lectronic <--break-->typesettin<--break-->g, remaini<--break-->ng essenti<--break-->ally uncha<--break-->nged. <b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b><--break--><--break--><--break--><--break--><--break--><--break--><--break--><--break--><--break--><--break--> and mor<--break-->e recently<--break--> with desk<--break-->top publis<--break-->hing softw<--break-->are like A<--break-->ldus PageM<--break-->aker inclu<--break-->ding versi<--break-->ons of Lor<--break-->em Ipsum.
[/quote]

@bharanikumarphp Jul 05.2008 —

<--break--><--break--><--break--><--break--><--break--><--break--><--break--><--break--><--break--><--break-->

Why are there this many <--break-->? May I know?

@andre4s_y Jul 05.2008 — Sorry. For that example I tried various values of $counter, from 1 to 100, and the example above is from when I tried $counter = 10.

But when I tried $counter = 100, it failed again, so I patched it again and again (this is bad!).

I hope this is the last one:
[code=php]
<?php
$text_ori = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.
<p> Lorem Ipsum has been the industry\'s standard dummy text</p>
ever since the 1500s, when an unknown printer took a galley of type and scrambled

<table><tr><td>it to make a type specimen book.</td></ttr></table>

It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>
and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.';
$text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
//split the text based on html tag, it will produce array.
$counter = 105; //number of character, must greater than 0
$counter_temp = $counter;
$random = "<--break-->";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
$add_random = 0; //variable to store how many add randomness
$open_tag = 0; //variable to store how many opentag
//html tag is not counted, html entities is counted as 1 character, comment is not counted
for($i = 0; $i < count($text); $i++)
{//looping in array
if(!preg_match("/^</sU",$text[$i]))
{//array value is not html tag, only non html tag are counted, start count
if(preg_match("/^&.+;$/",$text[$i]))
{// array value is html entity
//consider html entity as 1 character, change later
$counter_temp--;
if($counter_temp==0)
{
if(!$open_tag)
{
$text[$i] .= $random;
}
else
{
$add_random++;
}
$counter_temp = $counter;
}
}
else
{//array value is not contain html tag and entity.
$length = strlen($text[$i]);
$counter_temp = $counter_temp - $length;
if($counter_temp<0)
{
$temporary = $length + $counter_temp;
$how_much = floor(($length - $temporary) / $counter);
if($open_tag)
{//add
$add_random+=$how_much;
$add_random++;
}
else
{//write
for($j = $how_much; $j >=0; $j--)
{
$test = $temporary + ($j * $counter);
$text[$i] = preg_replace("/^.{".$test."}/s","$0".$random,$text[$i]);
}
}
$counter_temp = $counter;
}
}
}
elseif(preg_match("/^<.*\/>$/U",$text[$i]))
{//self closing tag.
//do nothing
}
elseif(preg_match("/^<\/\w+>/",$text[$i]))
{//closing tag
//turn off toggle
$open_tag--;
if($add_random&&!$open_tag)
{
do
{
$add_random--;
$text[$i] .= $random;
} while ($add_random!=0);
}
}
else
{//opening tag
$open_tag++;
}
}
if($add_random)
{
echo "Cannot find closing tag. not valid xhtml or not complete ones.";
}
else
{
$text = implode($text); //join array
echo htmlentities($text); //output it
}
?>
[/code]

It will output:
[quote]

Lorem Ipsum is simply dummy text of the printing and typesetting industry. <p> Lorem Ipsum has been the industry's standard dummy text</p><--break--> ever since the 1500s, when an unknown printer took a galley of type and scrambled <table><tr><td>it to make a type specimen book.</td></ttr></table><--break--> It has survived not only five centuries, but also the leap into electronic typesetting, remaining es<--break-->sentially unchanged. <b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b> a<--break-->nd more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.<--break-->
[/quote]

$counter = 105, as you can see there.

If this fails again, then I think there must be a better way to do this.

@bharanikumarphp Jul 05.2008 — I got the expected output. I think this is enough for me.

Thanks a lot.

I need to test with different content and HTML tags, so I will tell you after 1 hour.

@bharanikumarphp Jul 05.2008 — This is the content, but it throws the error below:

[code=php]
Parse error: parse error, unexpected T_CONSTANT_ENCAPSED_STRING in C:Program Filesxampphtdocssourceameex_01.php on line 3
[/code]

[code=php]

$text_ori = "<p>Lorem Ipsum is simply dummy text of the printing and typesetting industry.</p>
<b>Lorem Ipsum has been the industry's standard dummy text ever since the 1500s,<b>
<u> when an unknown printer took a galley of type and scrambled it to make a type specimen book. </u>
<table><tr><td>It has survived not only five centuries,</td>
<td> but also the leap into electronic typesetting,</td>
<td> remaining essentially unchanged.</td></tr></table>
<a href=""> It was popularised </a>

<b>in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>
<lable> and more recently</lable>
<p> with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Why do we use it?</p>

<b>It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem </p>

<b>Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable

English.</b>


<p> Many desktop publishing packages and web page editors now use Lorem Ipsum as their default model text, and a search for 'lorem ipsum' will uncover many

web sites still in their infancy. Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like).
</p>
Where does it come from?

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000

years old. <ul>Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from

a Lorem Ipsum passage,</ul and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from

sections 1.10.32 and 1.10.>a33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the ";
[/code]

@andre4s_y Jul 06.2008 — This is off topic.

It is not the code that is faulty; your input is wrong. It is not a valid string variable: the double-quoted string contains unescaped double quotes.

Even once you have corrected that, it is still not valid HTML.

Please correct it yourself: first fix the string in $text_ori, then fix the HTML used in the input.
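
One way to avoid the quoting problem, sketched here as a suggestion rather than something from this thread, is nowdoc syntax (PHP 5.3 or later). Nothing inside a nowdoc is interpreted, so both kinds of quotes can appear unescaped. The sample text is shortened for illustration.
[code=php]
<?php
// Nowdoc: everything up to the closing identifier is taken literally.
// The closing identifier must start at the beginning of its line.
$text_ori = <<<'HTML'
<p>Lorem Ipsum has been the industry's standard dummy text.</p>
<a href="">It was popularised</a> in "de Finibus Bonorum et Malorum".
HTML;

echo htmlentities($text_ori);
[/code]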

@bharanikumarphp Jul 07.2008 — The counter is 50.

I gave the content below as input:

<code>

Lorem Ipsum is simply dummy text of the printing and typesetting industry.

<p> Lorem Ipsum has been the industry's standard dummy text</p>

ever since the 1500s, when an unknown printer took a galley of type and scrambled


<table><tr><td>it to make a type specimen book.</td></tr></table>

It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.

<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>

and more recently with desktop publishing<p> software like Aldus PageMaker</p> including versions of Lorem Ipsum.

</code>


For that I got output like this:

<code>

Lorem Ipsum is simply dummy text of the printing a<--break-->nd typesetting industry. <p> Lorem Ipsum has been the industry's standard dummy text</p><--break--> ever since the 1500s, when an unknown printer to<--break-->ok a galley of type and scrambled <table><tr><td>it to make a type specimen book.</td></tr></table> It has survive<--break-->d not only five centuries, but also the leap into <--break-->electronic typesetting, remaining essentially unch<--break-->anged. <b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>[COLOR="Red"]<--break--><--break--> [/COLOR]and more recently with desktop publishing<p> software like Aldus PageMaker</p><--break--> including versions of Lorem Ipsum.

</code>

@andre4s_y Jul 07.2008 — Sorry if I gave you very unreliable code.

Check this:
[code=php]
<?php
$text_ori = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.<p>Lorem Ipsum has been the industry\'s standard dummy text</p>ever since the 1500s, when an unknown printer took a galley of type and scrambled<table><tr><td>it to make a type specimen book.</td></tr></table>It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>and more recently with desktop publishing<p>software like Aldus PageMaker</p>including versions of Lorem Ipsum.';
$text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
//split the text based on html tag, it will produce array.
$counter = 50; //number of character, must greater than 0
$counter_temp = $counter;
$random = "<--break-->";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
$add_random = 0; //variable to store how many add randomness
$open_tag = 0; //variable to store how many opentag
//html tag is not counted, html entities is counted as 1 character, comment is not counted

for($i = 0; $i < count($text); $i++)
{//looping in array
if(!preg_match("/^</sU",$text[$i]))
{//array value is not html tag, only non html tag are counted, start count
if(preg_match("/^&.+;$/",$text[$i]))
{// array value is html entity
//consider html entity as 1 character, change later
$counter_temp--;
if($counter_temp==0)
{
if(!$open_tag)
{
$text[$i] .= $random;
}
else
{
$add_random++;
}
$counter_temp = $counter;
}
}
else
{//array value is not contain html tag and entity.
$length = strlen($text[$i]);
$counter_temp = $counter_temp - $length;
if($counter_temp<=0)
{
$temporary = $length + $counter_temp;
$how_much = floor(($length - $temporary) / $counter);
if($open_tag)
{//add
$add_random+=$how_much;
$add_random++;
}
else
{//write
for($j = $how_much; $j >=0; $j--)
{
$test = $temporary + ($j * $counter);
$text[$i] = preg_replace("/^.{".$test."}/s","$0".$random,$text[$i]);
}
}
do
{
$counter_temp +=$counter;
} while ($counter_temp<=0);
}
}
}
elseif(preg_match("/^<.*\/>$/U",$text[$i]))
{//self closing tag.
//do nothing
}
elseif(preg_match("/^<\/\w+>/",$text[$i]))
{//closing tag
//turn off toggle
$open_tag--;
if($add_random&&!$open_tag)
{
do
{
$add_random--;
$text[$i] .= $random;
} while ($add_random!=0);
}
}
else
{//opening tag
$open_tag++;
}
}
if($add_random)
{
echo "Cannot find closing tag. not valid xhtml or not complete ones.";
}
else
{
$text = implode($text); //join array
//$text = preg_replace("/(".preg_quote($random).")+/","$1",$text);
echo htmlentities($text); //output it
}
?>
[/code]

it will output :

Lorem Ipsum is simply dummy text of the printing a<--break-->nd typesetting industry.<p>Lorem Ipsum has been the industry's standard dummy text</p><--break-->ever since the 1500s,<--break--> when an unknown printer took a galley of type and<--break--> scrambled<table><tr><td>it to make a type specimen book.</td></tr></table>It has s<--break-->urvived not only five centuries, but also the leap<--break--> into electronic typesetting, remaining essentiall<--break-->y unchanged.<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b><--break--><--break-->and more recently with desktop publis<--break-->hing<p>software like Aldus PageMaker</p>including version<--break-->s of Lorem Ipsum.
[/quote]

The counter is fixed now.

There is a double <--break--> because the text between the previous <--break--> and the closing </b> is long enough to need 2 <--break-->, but the code cannot add a <--break--> inside the <b> tag, so both end up right after </b>.

If you are not satisfied with this behaviour, please uncomment this line:
[code=php]
//$text = preg_replace("/(".preg_quote($random).")+/","$1",$text);
[/code]

Then there will never be more than one <--break--> in a row. Please note that I put $text_ori on a single line, because if it spans multiple lines, the newlines and tabs are counted as characters.
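A minimal sketch of what that uncommented line does, assuming the separator is the literal string <--break-->. The helper name collapse_separators is hypothetical and not part of the posted script; it uses a callback so the separator is never interpreted as a replacement pattern.

```php
<?php
// Hypothetical helper (not in the posted script): collapse runs of the
// same separator into a single occurrence.
function collapse_separators(string $text, string $separator): string
{
    // preg_quote() escapes regex metacharacters in the separator;
    // '/' is passed because it is the pattern delimiter used below.
    $quoted = preg_quote($separator, '/');

    // {2,} only matches runs of two or more separators.
    return preg_replace_callback(
        '/(?:' . $quoted . '){2,}/',
        function () use ($separator) {
            return $separator; // returned literally, no $1-style expansion
        },
        $text
    );
}

echo collapse_separators('passages,</b><--break--><--break-->and more', '<--break-->');
```

The script's own line can stay as it is; this only isolates the idea so it can be tried on a short string.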
@bharanikumarphp Jul 07.2008 — How do I remove the check for matching opening and closing tags?

Because in my database table a lot of tags are not closed,

I want to avoid this restriction:
[code=php]
echo "Cannot find closing tag. not valid xhtml or not complete ones.";
[/code]


Thanks in advance
@bharanikumarphp Jul 07.2008 — This is my input:
[code=php]
$text_ori = 'gane.<p>sh<b>k</b>u</p><table><tr><td>marsrini</td></tr></table>hhvasan';
[/code]



I think in my output there are two breaks only 3 characters apart, although I set the counter value to 5.

Why is the output displayed like this?

This is my output:
[code=php]
gane.<--break--><p>sh<b>k</b>u</p><table><tr><td>marsrini</td></tr></table><--break-->hhv<--break-->asan
[/code]
@andre4s_y Jul 07.2008 — [QUOTE]How do I remove the check for matching opening and closing tags?[/QUOTE]

If you remove this check, it will break your own rule no. 3:

[quote]3. And also, if the cursor meets an html tag, then put the <--break--> after the html tag has closed, for example: <table><tr><td>this is test</td></tr></table><--break-->[/quote]

[quote]Because in my database table a lot of tags are not closed,[/quote]

The answer is simply to make it valid XHTML, isn't it?

In your last post, I think the output is correct. What is wrong? I do not understand.

g = 1, a = 2, n = 3, e = 4, . = 5; break after this

s = 1, h = 2, k = 3, u = 4, m = 5; break after this, but it is located inside the table tag, so the break goes after the table tag

a = 1, r = 2, s = 3, r = 4, i = 5; break after this, but it is also located inside the table tag, so the break goes after the table tag. If you uncomment //$text = preg_replace("/(".preg_quote($random).")+/","$1",$text); this second break is not output.

n = 1, i = 2, h = 3, h = 4, v = 5; break after this

a = 1, s = 2, a = 3, n = 4
[/quote]
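The count above can be reproduced with a short sketch. It is not the posted script: trace_breaks is a hypothetical helper that only reports after which character a break falls due and whether a tag is still open at that point. It assumes single-byte text and properly nested tags.

```php
<?php
// Hypothetical helper: list the characters after which a break falls due.
// Assumes single-byte text and well-formed, properly nested tags.
function trace_breaks(string $html, int $counter): array
{
    // Keep the tags as their own tokens so they can be skipped when counting.
    $tokens = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $due = [];
    $count = 0;
    $depth = 0; // number of tags currently open

    foreach ($tokens as $token) {
        if (preg_match('/^<[^>]+>$/', $token)) {
            if (preg_match('/^<\//', $token)) {
                $depth--;                         // closing tag
            } elseif (!preg_match('/\/>$/', $token)) {
                $depth++;                         // opening tag (self-closing tags are ignored)
            }
            continue;                             // tags are never counted
        }
        foreach (str_split($token) as $char) {
            $count++;
            if ($count === $counter) {
                $due[] = [$char, $depth > 0];     // true = break must wait for the tags to close
                $count = 0;
            }
        }
    }
    return $due;
}

$input = 'gane.<p>sh<b>k</b>u</p><table><tr><td>marsrini</td></tr></table>hhvasan';
foreach (trace_breaks($input, 5) as [$char, $inside]) {
    echo $char, ' => ', $inside ? 'deferred until the open tags close' : 'break right here', "\n";
}
```

For the input above it reports the characters ".", "m", "i" and "v", with "m" and "i" deferred, which matches the hand count.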
@bharanikumarphp Jul 07.2008 — Hi,

Thanks for the reply.

Everything is correct.

One thing: I am doing a site upgrade, and I am not going to insert the data anew.

There are already 10000 records in the database table, and their contents contain html tags.

The values were entered without closing the html tags.

For example, they used [COLOR="Red"]<p> some text [/COLOR]

but did not close it with </p>.

So in our program, if an html tag is not closed, it throws the error,

so the code is not yet fully sufficient for my problem.

Sorry to say this...

Here is a sample of my table content.

All records are like this:
[code=php]
<table width="350" cellpadding="5" cellspacing="0" border="0" align="right">
<tr>
<td><img src="/images/2002/November/tool_Fig1.gif" border="0" /></td>
</tr>
<tr>
<td align="center"><font size="1">Figure 1. These examples show the application of die design guidelines.</font></td>
</tr>
</table>
<i>Editor?s note: Chris Rauwendaal is a designer, consultant, and seminar instructor
who has written extensively on process engineering and extrusion topics.</i><br /><br />

Die design for extrusion can be rather complicated, since the size and shape
of the extruded product varies from that of the die flow channel. Multiple mechanisms
affect the size and shape changes in the extruded product, and these can be
controlled by die design.<br /><br />

<b>Basic Considerations</b><br />

The objective of an extrusion die is to distribute the polymer melt in the flow
channel such that the material exits from the die with a uniform velocity. The
actual distribution is determined by the flow properties of the polymer, the
flow channel geometry, the flow rate through the die, and the temperatures of
the die and the polymer melt. If the flow channel geometry is optimized for
one polymer under one set of conditions, a simple change in flow rate or in
temperature can make the geometry less than optimum. <br /><br />

Except for circular dies, it is essentially impossible to create a single flow
channel geometry that can be used for a wide range of polymers and operating
conditions. For this reason adjustment capabilities are often incorporated into
the die. These allow adjustment of the flow while the extruder is running. The
flow distribution can be changed mechanically, thermally, or both ways. Mechanical
adjustment devices include choker bars, restrictor bars, and valves.<br /><br />

Thermal adjustment involves changing the die temperature locally to adjust the
flow locally. Mechanical adjustment capabilities complicate the design of the
die but enhance its flexibility and controllability.<br /><br />

Some general rules are useful in die design:<ul>
<li>No dead spots in the flow channel.<br /><br />
<li>Steady increase in velocity along the flow channel.<br /><br />
<li>Easy assembly and disassembly.<br /><br />
<li>Land length about 10x land clearance.<br /><br />
<li>No abrupt changes in flow channel geometry.<br /><br />
<li>Small approach angles.</ul>

In die design, problems often occur because the product designer has little
or no appreciation for the impact of product design details on the ease or difficulty
of extrusion. In many cases, small design changes can drastically improve or
degrade the extrudability of the product. Some basic guidelines in profile design
minimize extrusion problems:<ul>

<li>Use generous internal and external radiuses on all corners; the smallest possible
radius is about .5 mm.<br /><br />
<li>Maintain uniform wall thickness (important!).<br /><br />
<li>Make walls no thicker than 4 mm.<br /><br />
<li>Make interior walls thinner than exterior walls (for cooling).<br /><br />
<li>Minimize the use of hollow sections.</ul>

Figure 1 illustrates applications of these guidelines to several different profiles.<br /><br />

<table width="200" cellpadding="5" cellspacing="0" border="0" align="right">
<tr>
<td><img src="/images/2002/November/tool_Fig2.gif" border="0" /></td>
</tr>
<tr>
<td align="center"><font size="1">Figure 2. This is an example of a partition in the flow channel of the die.</font></td>
</tr>
</table>
<b>Flow balancing.</b> Mechanical adjustment of the die flow channel can be done in
two basic ways. The length of the channel (land length) can be adjusted to make
sure the average flow velocity is uniform. The other method is to adjust the
height of the channel. <br /><br />

<table width="200" cellpadding="5" cellspacing="0" border="0" align="right">
<tr>
<td><img src="/images/2002/November/tool_Fig3.gif" border="0" /></td>
</tr>
<tr>
<td align="center"><font size="1">Figure 3. These two designs illustrate unbalanced and balanced channel height design.</font></td>
</tr>
</table>
<b>Balancing by channel height.</b> Balancing by land length does not always yield
satisfactory results. Another method is to balance by channel height, as shown
in Figure 3. This figure shows a U-shaped profile with circular sections. The
circular sections are thicker than the walls. Without balancing, the flow through
the circular section is substantially greater than the slit section.<br /><br />

If the flow channel is designed as in Figure 3a, flow in the thin sections would
be much slower than the thick sections; resulting in considerable distortion
of the extruded product. Figure 3b shows the same basic shape but with circular
pins mounted in the circular sections of the die such that the wall thickness
is uniform throughout. The flow velocities from die 3b will be more uniform
than those from die 3a, and, as a result, little distortion would occur in the
product extruded from die 3b.<br /><br />

<b>Size and shape changes.</b> The shape and the size of the extruded product are different
from that of the die flow channel. The extrudate can expand as it exits the
die; this is often called ?die swell?, even though the term ?extrudate
swell? is more appropriate. Extrudate swell occurs because of elastic
recovery of the plastic; however, swelling can also be affected by air entrapment
and foaming.<br /><br />

The extrudate decreases in size as a result of draw down and cooling. Draw down
occurs because the velocity at the take-up is higher than at the die exit. Draw
down is necessary to have a certain level of tension in the line to keep the
extrudate from sagging. In non-circular products draw down changes the shape
of the product because the plastic melt in corners flows more slowly than in
other regions of the die. As a result, material disappears from the corners
by draw down. This is why a square flow channel produces a bulged product (see
Figure 4). The shape change is also affected by the elastic recovery of the
plastic.<br /><br />

<table width="250" cellpadding="5" cellspacing="0" border="0" align="left">
<tr>
<td><img src="/images/2002/November/tool_Fig4.gif" border="0" /></td>
</tr>
<tr>
<td align="center"><font size="1">Figure 4. These illustrations show how a corrected die can produce a desired extrudate shape.</font></td>
</tr>
</table>
Cooling reduces the size of the product because the plastic density increases
as the temperature decreases; this is particularly true for semi-crystalline
plastics such as LDPE, PP, and HDPE. Shape changes can occur as a result of
nonuniform cooling. When the extrudate enters a water bath the outside layers
cool rapidly and solidify. As a result, a rigid, solid skin forms that grows
in thickness as the extrudate continues to cool along the water bath. On the
other hand, the inside material is still at high temperature, and as this material
cools it can form shrink voids if the outside layer is too rigid to deform.
It is also possible that the outside layer is pulled to the center if the thermal
contraction force is high enough to deform the outside layer. An example of
a shrink void is shown in Figure 5.<br /><br />

<table width="150" cellpadding="5" cellspacing="0" border="0" align="right">
<tr>
<td><img src="/images/2002/November/tool_Fig4.gif" border="0" /></td>
</tr>
<tr>
<td align="center"><font size="1">Figure 5. Rapid cooling can cause a shrink void in a part, as shown in the illustrations above.</font></td>
</tr>
</table>
Size and shape changes can also occur in the extruded product by relaxation.
This is the reduction of internal stresses due to changes in molecular configuration
over time. When the extrudate is stretched extensional stresses are introduced.
The stresses can relax over time. The relaxation of internal stresses can lead
to a reduction in length and an increase in cross sectional area. If the internal
stresses are not uniform, the relaxation can lead to warping of the extruded
product.<br /><br />

From these considerations it is clear that the process that produces size and
shape changes in an extruded product is rather complicated with several mechanisms
at work at the same time. As a result, it can be quite difficult to predict
what die geometry produces the desired product shape and dimensions. This is
why die design is one of the most challenging aspects of extrusion engineering.<br /><br />


<table border="1" cellpadding="5" bordercolor="#128a26" bgcolor="#eaf4eb">
<tr>
<td><b>CONTACT INFORMATION</b><br />
Rauwendaal Extrusion Engineering<br />
Los Altos Hills, CA<br />
Chris Rauwendaal<br />
(650) 948-6266; <a href="mailto:[email protected]">[email protected]</a></td>
</tr>
</table>
[/code]
@andre4s_y Jul 07.2008 — There are 3 things you can do:
[LIST=1]
[*]Edit all your database records.

    + : pages load faster and will be valid (X)HTML.

    - : a lot more effort.

    But maybe you want to avoid this option.

[*]Turn every possibly unclosed tag into a self-closing tag.

    Effort: parse the page manually using regex.

    + : not much effort.

    - : your page is still not valid (X)HTML and it will take longer to load.

[*]Make it valid on the fly.

    Effort:

    a. parse the page manually using regex, or

    b. try the Tidy functions: http://id2.php.net/manual/en/book.tidy.php

    + : your page will be valid (X)HTML.

    - : more effort, and the page will take longer to load.

[/LIST]

    Otherwise, I do not know. That is the limit of my knowledge.

    Maybe some other experts can help you.
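For option 3b, a minimal sketch of repairing a fragment with PHP's Tidy extension might look like the following. The helper name repair_fragment and the chosen options are assumptions for illustration; tidy_repair_string() is only available when the Tidy extension is installed, so the sketch falls back to returning the input unchanged when it is not.

```php
<?php
// Hypothetical helper: ask Tidy to close unclosed tags in an html fragment.
function repair_fragment(string $html): string
{
    if (!function_exists('tidy_repair_string')) {
        // Assumption: without the Tidy extension the input is returned untouched.
        return $html;
    }

    $config = [
        'show-body-only' => true, // return only the fragment, not a whole <html> document
        'output-xhtml'   => true, // write closed, XHTML-style markup
        'wrap'           => 0,    // do not re-wrap long lines
    ];

    $fixed = tidy_repair_string($html, $config, 'utf8');

    return $fixed === false ? $html : $fixed;
}

// Two <li> items that were never closed, as in the sample record above.
echo repair_fragment('<ul><li>No dead spots in the flow channel.<li>Small approach angles.</ul>');
```

Running the repaired markup through the break-inserting script afterwards keeps the open/close counting intact, at the cost of one extra pass per record.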

    In case you want to try option number 2, there are 2 ways to do it:
    [LIST=1]
[*]Replace each possibly unclosed tag with a self-closing tag before processing.

    Add this after the declaration of $text_ori:
    [code=php]
    //list of possible unclosed tag. based on your last post
    //you can add by yourself. example : add $unclosed_tag[]="p";
    $unclosed_tag[]="li";
    for($i = 0; $i < count($unclosed_tag); $i++)
    {
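    //\b keeps "li" from also matching e.g. <link>; [^>]* stays inside one tag and leaves already self-closed tags alone
    $text_ori=preg_replace("/(<".$unclosed_tag[$i]."\b[^>]*)(?<!\/)>/","$1 />",$text_ori);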
    }
    [/code]

[*]Make an exception inside the loop.

    Replace this code:
    [code=php]
    else
    {//opening tag
    $open_tag++;
    }
    [/code]

    with this one :
    [code=php]
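    else
    {//opening tag
    $may_be_unclosed = false;
    for($j = 0; $j < count($unclosed_tag); $j++)
    {
    if(preg_match("/^<".$unclosed_tag[$j]."\b[^>]*(?<!\/)>/",$text[$i]))
    $may_be_unclosed = true;
    }
    if(!$may_be_unclosed)
    $open_tag++; //count the tag once, and only if it is expected to be closed
    }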
    [/code]

    Remember to define $unclosed_tag as an array before using it.

[/LIST]

    But personally, I prefer option number 1. It will make your page valid (X)HTML.

    Why make it valid? Read it here.

    Note:

    //$text = preg_replace("/(".preg_quote($random)."){2,}/","$1",$text);

    will be faster than the earlier version:

    //$text = preg_replace("/(".preg_quote($random).")+/","$1",$text);

    because the earlier version matches every <--break-->,

    while the new one only matches runs of two or more <--break-->.

    (faulty again... X(( )
    @bharanikumarphp Jul 07.2008 — [code=php]
    <i>It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum..</i><br /><br />It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.<br /><br />
    [/code]


    I entered this content, but it throws the error:

    Cannot find closing tag. not valid xhtml or not complete ones.
    @andre4s_y Jul 07.2008 — Can you show your latest code?

    Because here it works fine.
    @bharanikumarphp Jul 07.2008 — [code=php]
    <?php
    $text_ori = '<i>It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum..</i><br /><br />It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.<br /><br />';

    $unclosed_tag[]="p";
    $unclosed_tag[]="li";
    $unclosed_tag[]="i";
    $unclosed_tag[]="ul";
    $unclosed_tag[]="b";
    $unclosed_tag[]="table";
    $unclosed_tag[]="tr";
    $unclosed_tag[]="td";
    $unclosed_tag[]="ul";
    $unclosed_tag[]="br";

    for($i = 0; $i < count($unclosed_tag); $i++)
    {
    $text_ori=preg_replace("/(<".$unclosed_tag[$i].".*)(?<!\/)>/U","$1 />",$text_ori);
    }

    $text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    //split the text based on html tag, it will produce array.
    $counter = 40; //number of character, must greater than 0
    $counter_temp = $counter;
    $random = "<--break-->";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
    $add_random = 0; //variable to store how many add randomness
    $open_tag = 0; //variable to store how many opentag
    //html tag is not counted, html entities is counted as 1 character, comment is not counted

    for($i = 0; $i < count($text); $i++)
    {//looping in array
    //echo $i;
    //print $text[$i];
    //echo $random."<br>";
    //echo $i."<br>";
    //echo $text."<br>";
    if(!preg_match("/^</sU",$text[$i]))
    {//array value is not html tag, only non html tag are counted, start count
    echo "am entering"."<br>";
    if(preg_match("/^&.+;$/",$text[$i]))
    {// array value is html entity
    echo "2 am entering"."<br>";
    //consider html entity as 1 character, change later
    $counter_temp--;
    echo "what is temp value".$counter_temp;
    if($counter_temp==0)
    {
    if(!$open_tag)
    {
    $text[$i] .= $random;

    }
    else
    {
    $add_random++;
    }
    $counter_temp = $counter;
    }
    }
    else
    {//array value is not contain html tag and entity.
    $length = strlen($text[$i]);
    $counter_temp = $counter_temp - $length;
    if($counter_temp<=0)
    {
    $temporary = $length + $counter_temp;
    $how_much = floor(($length - $temporary) / $counter);
    if($open_tag)
    {//add
    $add_random+=$how_much;
    $add_random++;
    }
    else
    {//write
    for($j = $how_much; $j >=0; $j--)
    {
    $test = $temporary + ($j * $counter);
    $text[$i] = preg_replace("/^.{".$test."}/s","$0".$random,$text[$i]);
    }
    }
    do
    {
    $counter_temp +=$counter;
    } while ($counter_temp<=0);
    }
    }
    }
    elseif(preg_match("/^<.*\/>$/U",$text[$i]))
    {//self closing tag.
    //do nothing

    }
    elseif(preg_match("/^<\/\w+>/",$text[$i]))
    {//closing tag
    //turn off toggle
    $open_tag--;
    if($add_random&&!$open_tag)
    {
    do
    {
    $add_random--;
    $text[$i] .= $random;
    } while ($add_random!=0);
    }
    }
    else
    {//opening tag
    for($j = 0; $j < count($unclosed_tag); $j++)
    {
    if(!preg_match("/(<".$unclosed_tag[$j].".*)(?<!\/)>/U",$text[$i]))
    $open_tag++;

    }
    }
    }
    if($add_random)
    {
    echo "Cannot find closing tag. not valid xhtml or not complete ones.";
    }
    else
    {
    $text = implode($text); //join array
    $text = preg_replace("/(".preg_quote($random).")+/","$1",$text);
    echo htmlentities($text); //output it
    }
    ?>
    [/code]
    @andre4s_y Jul 07.2008 — Hey, why did you list almost every tag as a possibly unclosed tag?

    OMG... I did not think you would do that.

    It is beyond my expectation: from your earlier example I thought only the <li> tag was possibly unclosed. All tags were closed except li, so I could add a patch.

    If you list almost all tags like that, just forget my option number 2,

    because it is only a patch. It cannot fully detect whether every tag is closed or not.

    (The Tidy functions do that better.)

    Or maybe others can help you better.

    I think this is off topic here. Please start a new thread, so other experts can help you.

    Note:

    If you want to go back, just undo my option number 2.

    Also, my option number 2 contains 2 alternatives; you cannot apply them together, or option number 2 becomes useless.
    @bharanikumarphp Jul 07.2008 — In my old content some closing tags are missing,

    like the one for <p>,

    so I want to skip those as well,

    and also the apostrophe, that is " [COLOR="Red"]'[/COLOR] ".

    OK, leave all of that.

    Sorry...

    Just tell me one thing:

    for future content insertion, shall I use this code?

    Will it work fine?
    [code=php]
    <?php
    $text_ori = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.<p>Lorem Ipsum has been the industry\'s standard dummy text</p>ever since the 1500s, when an unknown printer took a galley of type and scrambled<table><tr><td>it to make a type specimen book.</td></tr></table>It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>and more recently with desktop publishing<p>software like Aldus PageMaker</p>including versions of Lorem Ipsum.';
    $text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    //split the text based on html tag, it will produce array.
    $counter = 50; //number of character, must greater than 0
    $counter_temp = $counter;
    $random = "<--break-->";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
    $add_random = 0; //variable to store how many add randomness
    $open_tag = 0; //variable to store how many opentag
    //html tag is not counted, html entities is counted as 1 character, comment is not counted

    for($i = 0; $i < count($text); $i++)
    {//looping in array
    if(!preg_match("/^</sU",$text[$i]))
    {//array value is not html tag, only non html tag are counted, start count
    if(preg_match("/^&.+;$/",$text[$i]))
    {// array value is html entity
    //consider html entity as 1 character, change later
    $counter_temp--;
    if($counter_temp==0)
    {
    if(!$open_tag)
    {
    $text[$i] .= $random;
    }
    else
    {
    $add_random++;
    }
    $counter_temp = $counter;
    }
    }
    else
    {//array value is not contain html tag and entity.
    $length = strlen($text[$i]);
    $counter_temp = $counter_temp - $length;
    if($counter_temp<=0)
    {
    $temporary = $length + $counter_temp;
    $how_much = floor(($length - $temporary) / $counter);
    if($open_tag)
    {//add
    $add_random+=$how_much;
    $add_random++;
    }
    else
    {//write
    for($j = $how_much; $j >=0; $j--)
    {
    $test = $temporary + ($j * $counter);
    $text[$i] = preg_replace("/^.{".$test."}/s","$0".$random,$text[$i]);
    }
    }
    do
    {
    $counter_temp +=$counter;
    } while ($counter_temp<=0);
    }
    }
    }
    elseif(preg_match("/^<.*\/>$/U",$text[$i]))
    {//self closing tag.
    //do nothing
    }
    elseif(preg_match("/^<\/\w+>/",$text[$i]))
    {//closing tag
    //turn off toggle
    $open_tag--;
    if($add_random&&!$open_tag)
    {

    do
    {
    $add_random--;
    $text[$i] .= $random;
    } while ($add_random!=0);
    }
    }
    else
    {//opening tag
    $open_tag++;
    }
    }
    if($add_random)
    {
    echo "Cannot find closing tag. not valid xhtml or not complete ones.";
    }
    else
    {
    $text = implode($text); //join array
    $text = preg_replace("/(".preg_quote($random).")+/","$1",$text);
    echo htmlentities($text); //output it
    }
    ?>

    [/code]

    Because that is the option I plan to use.

    To solve the existing content problem,
    I am going to use this code:
    [code=php]
    <?php
    //note: the mysql_* functions exist only up to PHP 5; they were removed in PHP 7 (use mysqli or PDO there)
    mysql_connect("localhost","root","") or die("could not connect");
    mysql_select_db("testing") or die("could not select the database");
    $sql = "select * from testing";
    $rs = mysql_query($sql) or die("could not read");
    if(mysql_num_rows($rs)>0)
    {
    while($obj=mysql_fetch_object($rs))
    {
    $body_chunksplit = $obj->body;
    $nid=$obj->nid;

    if(strlen($body_chunksplit) >= 4500)
    {
    //chunk_split counts every byte, tags included, so a marker can land inside an html tag
    $body_content=chunk_split($body_chunksplit, 4500, "<--pagebreak-->");
    //drupal_set_message(wordcount($body_content));
    }
    else
    {
    $body_content=$body_chunksplit;
    }
    //escape the values instead of replacing quotes by hand
    $body_sql=mysql_real_escape_string($body_content);
    $nid_sql=mysql_real_escape_string($nid);
    mysql_query("UPDATE node_revisions SET body='$body_sql' where nid='$nid_sql'") or die("could not");
    }
    }
    ?>
    [/code]



    Here is the code. It is going to insert <--pagebreak--> into the existing content.

    Please walk through it.

    After the markers are inserted, I will edit the database.
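One thing to watch in that script: chunk_split() cuts after a fixed number of bytes, so a marker can end up in the middle of a tag. A rough sketch of an alternative that only places the marker between tokens, once enough text has been seen and no tag is open, could look like this. insert_pagebreaks is a hypothetical name, and the sketch assumes well-formed markup; with unclosed tags such as a bare <li> the depth never returns to zero and no further marker is inserted.

```php
<?php
// Hypothetical alternative to chunk_split(): put the marker only between
// tokens, after at least $limit text bytes, and only while no tag is open.
// Assumes well-formed, properly nested markup.
function insert_pagebreaks(string $html, int $limit, string $marker = '<--pagebreak-->'): string
{
    $tokens = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $out = '';
    $count = 0; // text bytes since the last marker
    $depth = 0; // number of tags currently open

    foreach ($tokens as $token) {
        $out .= $token; // every token is copied whole, so tags are never cut
        if (preg_match('/^<[^>]+>$/', $token)) {
            if (preg_match('/^<\//', $token)) {
                $depth = max(0, $depth - 1);      // closing tag
            } elseif (!preg_match('/\/>$/', $token)) {
                $depth++;                         // opening tag
            }
        } else {
            $count += strlen($token);             // only text is counted
        }
        if ($count >= $limit && $depth === 0) {
            $out .= $marker;
            $count = 0;
        }
    }
    return $out;
}

echo insert_pagebreaks('abc<b>defg</b>hi', 5, '|');
```

Pages produced this way can run longer than the limit, because the marker waits for the next point outside all tags.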
    @bharanikumarphp Jul 08.2008 — I want to delete some of my replies from this forum.

    How do I delete them?
    @bharanikumarphp Jul 08.2008 — [QUOTE]Hey, why did you list almost every tag as a possibly unclosed tag?

    OMG... I did not think you would do that.

    It is beyond my expectation: from your earlier example I thought only the <li> tag was possibly unclosed. All tags were closed except li, so I could add a patch.

    If you list almost all tags like that, just forget my option number 2,

    because it is only a patch. It cannot fully detect whether every tag is closed or not.

    (The Tidy functions do that better.)

    Or maybe others can help you better.

    I think this is off topic here. Please start a new thread, so other experts can help you.

    Note:

    If you want to go back, just undo my option number 2.

    Also, my option number 2 contains 2 alternatives; you cannot apply them together, or option number 2 becomes useless.[/QUOTE]

    OK.

    I want to skip the selected tags when they are not closed.

    Is that possible?

    The tags are: LI, TABLE, B
    @bharanikumarphp Jul 08.2008 — [QUOTE]Sorry if I gave you very unreliable code earlier.

    Check this:
    [code=php]
    <?php
    $text_ori = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.<p>Lorem Ipsum has been the industry\'s standard dummy text</p>ever since the 1500s, when an unknown printer took a galley of type and scrambled<table><tr><td>it to make a type specimen book.</td></tr></table>It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.<b> It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages,</b>and more recently with desktop publishing<p>software like Aldus PageMaker</p>including versions of Lorem Ipsum.';
    $text = preg_split("/(<.+>|&.+;)/sU",$text_ori,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
    //split the text based on html tag, it will produce array.
    $counter = 50; //number of character, must greater than 0
    $counter_temp = $counter;
    $random = "<--break-->";//separator => in your case : <--break--> => in html entities &lt;--break&gt;
    $add_random = 0; //variable to store how many add randomness
    $open_tag = 0; //variable to store how many opentag
    //html tag is not counted, html entities is counted as 1 character, comment is not counted

    for($i = 0; $i < count($text); $i++)
    {//looping in array
    if(!preg_match("/^</sU",$text[$i]))
    {//array value is not html tag, only non html tag are counted, start count
    if(preg_match("/^&.+;$/",$text[$i]))
    {// array value is html entity
    //consider html entity as 1 character, change later
    $counter_temp--;
    if($counter_temp==0)
    {
    if(!$open_tag)
    {
    $text[$i] .= $random;
    }
    else
    {
    $add_random++;
    }
    $counter_temp = $counter;
    }
    }
    else
    {//array value is not contain html tag and entity.
    $length = strlen($text[$i]);
    $counter_temp = $counter_temp - $length;
    if($counter_temp<=0)
    {
    $temporary = $length + $counter_temp;
    $how_much = floor(($length - $temporary) / $counter);
    if($open_tag)
    {//add
    $add_random+=$how_much;
    $add_random++;
    }
    else
    {//write
    for($j = $how_much; $j >=0; $j--)
    {
    $test = $temporary + ($j * $counter);
    $text[$i] = preg_replace("/^.{".$test."}/s","$0".$random,$text[$i]);
    }
    }
    do
    {
    $counter_temp +=$counter;
    } while ($counter_temp<=0);
    }
    }
    }
    elseif(preg_match("/^<.*\/>$/U",$text[$i]))
    {//self closing tag.
    //do nothing
    }
    elseif(preg_match("/^<\/\w+>/",$text[$i]))
    {//closing tag
    //turn off toggle
    $open_tag--;
    if($add_random&&!$open_tag)
    {
    do
    {
    $add_random--;
    $text[$i] .= $random;
    } while ($add_random!=0);
    }
    }
    else
    {//opening tag
    $open_tag++;
    }
    }
    if($add_random)
    {
    echo "Cannot find closing tag. not valid xhtml or not complete ones.";
    }
    else
    {
    $text = implode($text); //join array
    //$text = preg_replace("/(".preg_quote($random).")+/","$1",$text);
    echo htmlentities($text); //output it
    }
    ?>
    [/code]

    it will output :

    The counter is fixed now.

    There is a double <--break--> because the text between the previous <--break--> and the closing </b> is long enough to need 2 <--break-->, but the code cannot add a <--break--> inside the <b> tag, so both end up right after </b>.

    If you are not satisfied with this behaviour, please uncomment this line:
    [code=php]
    //$text = preg_replace("/(".preg_quote($random).")+/","$1",$text);
    [/code]

    Then there will never be more than one <--break--> in a row. Please note that I put $text_ori on a single line, because if it spans multiple lines, the newlines and tabs are counted as characters.[/QUOTE]

    I want to add one more condition to this:

    if a comma or full stop is nearby, then put the <--break--> after the comma or full stop of the sentence.
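One possible reading of this rule, as a minimal sketch: before writing a break at a computed offset, look a few characters to either side for a comma or full stop and move the break to just after the closest one. The function name snap_to_punctuation and the window size are assumptions for illustration; the sketch works on a plain text chunk (no tags) with single-byte characters.

```php
<?php
// Hypothetical helper: move a break offset to just after the nearest
// comma or full stop, if one lies within $window characters of it.
// Assumes a plain text chunk (no tags) and single-byte characters.
function snap_to_punctuation(string $text, int $pos, int $window = 10): int
{
    $best = $pos;                 // fall back to the original offset
    $bestDistance = $window + 1;  // anything farther away is ignored
    $start = max(0, $pos - $window - 1);
    $end = min(strlen($text), $pos + $window);

    for ($i = $start; $i < $end; $i++) {
        if ($text[$i] === ',' || $text[$i] === '.') {
            $candidate = $i + 1;  // break goes after the punctuation mark
            $distance = abs($candidate - $pos);
            if ($distance < $bestDistance) {
                $best = $candidate;
                $bestDistance = $distance;
            }
        }
    }
    return $best;
}

$chunk = 'one, two three. four';
$offset = snap_to_punctuation($chunk, 8, 5);
echo substr_replace($chunk, '<--break-->', $offset, 0);
```

In the posted script the offset would be the value of $test inside the write loop. Note that moving a break this way makes the distance between breaks uneven, so the counter would need to be adjusted by the same amount.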