I've looked at lookahead and lookaround on this site: [URL="http://www.regular-expressions.info/lookaround2.html"]http://www.regular-expressions.info/lookaround2.html[/URL]
I have this string:
[code=php]$englishpart = "
===synonyms====
*blahblahblah [[THIS IS RIGHT]] ijtir.
*usdhty [[this is wee right]] ksdhfiudf.
==bkshbdf===";[/code]
and I'm using this regex:
[code=php]/^=+synonyms=+.**.*[[([^nr])*]].*=+[a-z]*=+$/imsu[/code]
but what I'm getting is the ===synonyms==== heading, all of the *...[[THIS IS RIGHT]] lines, and also the bottom ==bkshbdf heading.
All I want is the words inside the [[ ]], so I want THIS IS RIGHT and this is wee right, but not the other text surrounding them, and not the [[ and ]] either.
This has something to do with subpatterns, right? How do you use them?
[code=php]
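// First match the whole synonyms section, from the ===synonyms==== heading to the next ==heading==.
if (preg_match("/\=+synonyms\=+[\s|\n]*(\*\w*\s*\[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+\={2}[a-z]+\={2}/mi", $englishpart, $result))
{
    // Then collect every [[...]] pair in that section; subpattern 1 is the text between the brackets.
    preg_match_all("/\[{2}(.*)\]{2}/i", $result[0], $result);
    //print_r($result);
    for ($i = 0; $i < count($result[1]); $i++)
    {
        echo $result[1][$i];
        // prints THIS IS RIGHT and this is wee right
    }
}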
[/code]
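For comparison, here is a minimal sketch that pulls out the bracketed words with a single preg_match_all call and one capturing subpattern. It assumes every [[...]] pair sits on one line and that it does not matter which section the pair appears in. The lazy `.*?` and the variable name $matches are my own choices, not part of the code above.

```php
<?php
// Sample input taken from the first post.
$englishpart = "
===synonyms====
*blahblahblah [[THIS IS RIGHT]] ijtir.
*usdhty [[this is wee right]] ksdhfiudf.
==bkshbdf===";

// \[\[ and \]\] match the literal brackets; (.*?) is the subpattern that
// captures the text between them (lazy, so it stops at the first ]]).
preg_match_all('/\[\[(.*?)\]\]/', $englishpart, $matches);

// $matches[1] holds what subpattern 1 captured for each match.
foreach ($matches[1] as $word) {
    echo $word, "\n";
}
```

This prints THIS IS RIGHT and this is wee right on separate lines. Unlike the two-step version, it would also pick up [[...]] pairs outside the synonyms section.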
[CODE]preg_match("/\=+synonyms\=+.*(\*\w*\s*\[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+\={2}[a-z]+\={2}/smi",$englishpart,$result)[/CODE]
Just want to clarify your code, tell me if I'm going wrong anywhere.
It's basically saying ====synonyms==== with anything after that, including newlines. But why do you need to escape the equals (=) signs? I did not have to before.
[/QUOTE]
Wouldn't this be acceptable: ^=+synonyms=+.** .*
[/QUOTE]
[CODE]
=+synonyms=+.*
[/CODE]
But this part I don't quite understand:
(\*\w*\s*\[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+
It starts off with a bracket, meaning it's a subpattern. Then comes \*: is that the part that matches the asterisk (*) I had in my previous pattern? Then \w* means word characters and \s* means whitespace. But as I said before, those things are meaningless, so can you just use a .* for all that, so it would be
(\*.*\[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+
then here: \[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+
[/QUOTE]
[CODE]
(\*.*\[\[.*\]\].*\.)+
[/CODE]
It's saying \[, which I understand to be the "[" bracket. The {2} is there to tell us that there are two square brackets ([[), right? But wouldn't it just be easier to say \[\[? That uses 4 characters while the version above uses 5.
[/QUOTE]
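A quick way to check that the two spellings behave the same is to run both against one line. This is only a sketch; the sample line is copied from the first post and the variable names are made up for the illustration.

```php
<?php
$line = "*blahblahblah [[THIS IS RIGHT]] ijtir.";

// Quantifier form: one escaped bracket repeated exactly twice.
preg_match('/\[{2}(.*)\]{2}/', $line, $withQuantifier);

// Spelled-out form: two escaped brackets in a row.
preg_match('/\[\[(.*)\]\]/', $line, $spelledOut);

var_dump($withQuantifier === $spelledOut); // bool(true)
echo $withQuantifier[1], "\n";             // THIS IS RIGHT
```

Both forms produce identical match arrays here, so the choice between them is a matter of style.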
Then you have [\w|\s]*, which is exactly the same as the [^nr]*, correct?
[/QUOTE]
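The two classes are not quite interchangeable, and a small test shows where they differ. This sketch assumes [^nr] was meant as [^\n\r] (anything except line breaks). The sample text is made up. Note that inside a character class the | is a literal pipe, not an "or".

```php
<?php
$text = "a|b c, d";

// Word characters, whitespace, or a literal "|" (the pipe is not "or" here).
preg_match('/^[\w|\s]*/', $text, $m1);

// The same class without the pipe stops at the "|".
preg_match('/^[\w\s]*/', $text, $m2);

// Anything except line breaks also accepts punctuation such as ",".
preg_match('/^[^\n\r]*/', $text, $m3);

echo $m1[0], "\n"; // a|b c
echo $m2[0], "\n"; // a
echo $m3[0], "\n"; // a|b c, d
```

So [\w|\s]* is the stricter of the two: it stops at punctuation, while [^\n\r]* runs to the end of the line.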
Then \.\n*\s*)+, which has a ".", a newline and whitespace. Is this part required? The previous (.*) would have handled it, right?
[/QUOTE]
Which is exactly what I said in the beginning, only that the backslashes were missing in my beginning code.
Right?
[/QUOTE]
But the
preg_match_all("/\[{2}(.*)\]{2}/i",$result[0],$result);
//print_r($result);
for($i=0; $i<count($result[1]);$i++)
{
echo $result[1][$i];
//print THIS IS RIGHT and this is wee right
}
This is the part that actually does the trick right?
[/QUOTE]
preg_match_all("/\[{2}(.*)\]{2}/i",$result[0],$result);
This part tells us to directly match what is inside the [[ and the ]], right?
[/QUOTE]
From the previous $result[0]. I don't quite understand why it should be $result[0]. Doesn't the previous preg_match only match once? So does that mean it would match only one of them, [[THIS IS RIGHT]], or would it match all of them, meaning it would get [[THIS IS RIGHT]] and also [[this is wee right]]?
[/QUOTE]
preg_match() returns the number of times the pattern matches. That will be either 0 (no match) or 1, because preg_match() stops searching after the first match. preg_match_all(), by contrast, continues until it reaches the end of the subject.
[/QUOTE]
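The difference in return values is easy to see side by side. A minimal sketch, with a made-up subject string and made-up variable names:

```php
<?php
$subject = "[[one]] and [[two]]";

// preg_match() stops after the first match, so it returns 0 or 1.
$first = preg_match('/\[\[(.*?)\]\]/', $subject, $single);

// preg_match_all() keeps scanning to the end of the subject.
$all = preg_match_all('/\[\[(.*?)\]\]/', $subject, $every);

echo $first, "\n";                 // 1
echo $all, "\n";                   // 2
echo $single[1], "\n";             // one
echo implode(", ", $every[1]), "\n"; // one, two
```

With preg_match() the third argument holds only the first match. With preg_match_all() it holds every match, grouped by subpattern.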
About $i<count($result[1]): count() counts the number of elements in an array, right? So $result[1] is an array? Wouldn't $result be the array? Or does $result[1] have more elements in it, making $result a multidimensional array? Where is this result coming from? The preg_match_all before it? If so, why are you using $result[1]?
I mean, $result[1] would be the second element of the matches ($result) from the preg_match_all, right? And it wouldn't come from the first preg_match. So why are you using [1]? Wouldn't that skip the first match of the preg_match_all? Are you saying that using $result[0] would cause an error, because $result[0] from the previous preg_match is already being used as the subject of the preg_match_all?
Therefore, should there be different names for the matches, like "outerresult" and "innerresult"?
[/QUOTE]
echo $result[1][$i];
//print THIS IS RIGHT and this is wee right
Then here it's saying echo $result[1][$i], meaning that $result[1] is an array and you're cycling through it to print out the elements, which prints THIS IS RIGHT and this is wee right.
Again, I still don't understand where $result[1] came from and how it became an array.
Are you saying it's because of the subpattern? I don't quite understand how that works.
Sorry for the long post.[/QUOTE]
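To see where $result[1] comes from, it helps to dump what preg_match_all fills in. In its default ordering, index 0 holds the full matches and index 1 holds whatever the first subpattern (the first pair of parentheses) captured. A small sketch with a shortened, made-up input:

```php
<?php
$text = "*a [[THIS IS RIGHT]] b.\n*c [[this is wee right]] d.";

// One capturing subpattern: the (.*) between the brackets.
$count = preg_match_all('/\[{2}(.*)\]{2}/i', $text, $result);

// $result[0] = every full match, brackets included.
// $result[1] = what subpattern 1 captured in each of those matches.
print_r($result);

echo $result[0][0], "\n"; // [[THIS IS RIGHT]]
echo $result[1][0], "\n"; // THIS IS RIGHT
echo $result[1][1], "\n"; // this is wee right
```

So [1] does not skip the first match. It selects the captured text instead of the whole match, and $result[1][$i] is the capture from match number $i.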
As you wish...
[CODE]preg_match("/\=+synonyms\=+.*(\*\w*\s*\[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+\={2}[a-z]+\={2}/smi",$englishpart,$result)[/CODE]
There are many alternatives for the pattern. I can make it flexible for you.
That \[ is my style.
I always escape any non-[a-zA-Z0-9] characters.
Play it safe..[/QUOTE]
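As a rough check of the "escape everything" habit: escaping a character that has no special meaning, such as =, does not change what the pattern matches. PHP's preg_quote() applies the same habit to a literal string automatically. The sample heading comes from the first post; the variable names are made up.

```php
<?php
$heading = "===synonyms====";

// Escaped and unescaped "=" match the same text.
preg_match('/\=+synonyms\=+/i', $heading, $escaped);
preg_match('/=+synonyms=+/i', $heading, $plain);
var_dump($escaped === $plain); // bool(true)

// preg_quote() escapes regex metacharacters (and the given delimiter).
echo preg_quote("[[x]]", '/'), "\n"; // \[\[x\]\]
```

Brackets are different: [ really is special in a pattern, so \[ is required there, not just a matter of style.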
But to make it more understandable, why don't I change the names for the matches, so the first one will be "result" and the second one will be "finalresult"?
How about that?
[/QUOTE]
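With the renaming applied, the two-step version might look like the sketch below. The patterns are the ones from the earlier code, written in single quotes so the backslashes reach the regex engine unchanged. Only the names $result and $finalresult come from the suggestion above.

```php
<?php
$englishpart = "
===synonyms====
*blahblahblah [[THIS IS RIGHT]] ijtir.
*usdhty [[this is wee right]] ksdhfiudf.
==bkshbdf===";

$outer = '/\=+synonyms\=+[\s|\n]*(\*\w*\s*\[{2}[\w|\s]*\]{2}\s*\w*\.\n*\s*)+\={2}[a-z]+\={2}/mi';
$inner = '/\[{2}(.*)\]{2}/i';

$finalresult = [[], []];
// Step 1: $result[0] is the whole synonyms section.
if (preg_match($outer, $englishpart, $result)) {
    // Step 2: collect every [[...]] pair inside that section.
    preg_match_all($inner, $result[0], $finalresult);
    foreach ($finalresult[1] as $word) {
        echo $word, "\n";
    }
}
```

Keeping the two arrays separate makes it clearer that $result[0] is the section text from the first call, while $finalresult[1] is the list of captures from the second.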
Man, that is pretty complicated. I'll have to reread this tomorrow. Thanks for all your work; you're the kind of person we need more of in the world.
[/QUOTE]