
[RESOLVED] Regex help needed

[code=php]if (!isset($slug)) {
    $slug = strtolower($title);
    //replaces a word with spaces on either sides
    $slug = preg_replace('/\s(a)|(an)|(and)|(are)|(as)|(at)|(be)|(by)|(com)|(for)|(from)|(i)|(in)|(is)|(it)|(of)|(on)|(or)|(that)|(the)|(this)|(to)|(was)|(what)|(when)|(where)|(who)|(will)|(with)|(you)|(your)\s/i',' ',$slug);
    //the last word in the title
    $slug = preg_replace('/\s(a)|(an)|(and)|(are)|(as)|(at)|(be)|(by)|(com)|(for)|(from)|(i)|(in)|(is)|(it)|(of)|(on)|(or)|(that)|(the)|(this)|(to)|(was)|(what)|(when)|(where)|(who)|(will)|(with)|(you)|(your)$/i','',$slug);
    //the first word in the title
    $slug = preg_replace('/^(a)|(an)|(and)|(are)|(as)|(at)|(be)|(by)|(com)|(for)|(from)|(i)|(in)|(is)|(it)|(of)|(on)|(or)|(that)|(the)|(this)|(to)|(was)|(what)|(when)|(where)|(who)|(will)|(with)|(you)|(your)\s/i','',$slug);
    $slug = preg_replace('/\s+/','-',$slug);
}
$slug = mysql_real_escape_string($slug); [/code]

The script is supposed to remove all stopwords from a string, but I can’t get it to work. The string “the profile I like is fun” becomes “-pr-le-l-ke-s-fun” after passing through this script, when it’s supposed to become “profile-like-fun”. Help?

@sohguanh (Jul 13, 2010): Have you tried printing the contents of $slug after each preg_replace call? I did, and this is the output I get:

first pass

pr le l ke s fun

second pass

pr le l ke s fun

third pass

pr le l ke s fun

fourth pass

-pr-le-l-ke-s-fun

It seems the second pass is not doing what you intended. You may want to look at that second preg_replace call.

Edit: Oops, there are four preg_replace calls, but the behavior seems to start from the second call onwards.
@NogDog (Jul 13, 2010): I would use the \b word boundary assertion, and maybe do something like:
[code=php]
$slug = 'the profile I like is fun';
$regexp = '/\b(a|an|and|are|as|at|be|by|com|for|from|i|in|is|it|of|on|or|that|'.
'the|this|to|was|what|when|where|who|will|with|you|your)\b/i';
$slug = preg_replace($regexp, '', $slug);
$slug = preg_replace('/\s+/', '-', trim($slug));
echo $slug;
[/code]

Of course, if someone were to enter "To be or not to be, that is the question." you would end up with "not-,-question.", which may not be what you want.
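One possible way to deal with the punctuation case is to turn every run of non-alphanumeric characters into a single hyphen, rather than only collapsing whitespace. This is only a sketch: the function name make_slug is made up for illustration, and it assumes plain ASCII titles (accented letters would be dropped, and "it's" would leave a stray "s" behind).

```php
<?php
// Hypothetical helper, not from the thread: stopword removal plus punctuation cleanup.
function make_slug(string $title): string
{
    $stopwords = 'a|an|and|are|as|at|be|by|com|for|from|i|in|is|it|of|on|or|that|'
               . 'the|this|to|was|what|when|where|who|will|with|you|your';

    $slug = strtolower($title);
    // \b keeps the stopwords from matching inside longer words such as "profile".
    $slug = preg_replace('/\b(' . $stopwords . ')\b/', '', $slug);
    // Any run of characters that are not a-z or 0-9 (spaces, commas, dots) becomes one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Drop hyphens left over at either end.
    return trim($slug, '-');
}

echo make_slug('To be or not to be, that is the question.'), "\n"; // not-question
echo make_slug('the profile I like is fun'), "\n";                 // profile-like-fun
```

A title made up entirely of stopwords comes back as an empty string here, so a real script would still need a fallback for that case.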
@sohguanh (Jul 13, 2010):
[QUOTE]Have you tried to print out contents of $slug after each preg_replace function call ? I did and below output is what I get.

first pass

pr le l ke s fun
[/QUOTE]


NogDog has already provided the solution, but I would like to explain why your first preg_replace is already not doing what you intended.

Take the string "the profile I like is fun":

"the" is removed because it is a stopword in your list.

Inside "profile", the "of" after "pr" is removed because it is a stopword in your list.

The "i" after "prof" is removed for the same reason.

So "profile" is left as "pr le", and the rest should be easy to work out yourself.

You really need to know how a regex engine works before you attempt to write a regular expression. Forming a regex is an "art" that I am still trying to learn to this day.
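For anyone who wants to reproduce the trace, here is a small sketch. It assumes the cause is alternation precedence: without a group around the whole list, the leading \s belongs only to the first alternative and the trailing \s only to the last, so every other stopword can match anywhere, even inside a word. The variable names are mine, and the per-word parentheses from the original pattern are left out because they do not change what matches.

```php
<?php
$title = 'the profile i like is fun';
$words = 'a|an|and|are|as|at|be|by|com|for|from|i|in|is|it|of|on|or|that|'
       . 'the|this|to|was|what|when|where|who|will|with|you|your';

// Parsed as:  \sa | an | and | ... | you | your\s
$ungrouped = '/\s' . $words . '\s/i';
// Parsed as:  \s (a | an | ... | your) \s
$grouped   = '/\s(' . $words . ')\s/i';

$ungroupedResult = preg_replace($ungrouped, ' ', $title);
// Collapse the leftover runs of spaces so the result is easier to read.
$ungroupedResult = trim(preg_replace('/\s+/', ' ', $ungroupedResult));
$groupedResult   = preg_replace($grouped, ' ', $title);

echo $ungroupedResult, "\n"; // pr le l ke s fun
echo $groupedResult, "\n";   // the profile like fun
```

The grouped version still leaves "the" at the start of the title, because nothing precedes it for \s to match. That is one reason the \b approach is more convenient than whitespace on both sides.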
@narutodude000 (author, Jul 13, 2010):
[code=php]if (!isset($slug)) {
    $regex = '/\b(a|an|and|are|as|at|be|by|com|for|from|i|in|is|it|of|on|or|that|the|this|to|was|what|when|where|who|will|with|you|your)\b/i';
    $slug = preg_replace($regex, '', $title);
}
//removes special character
$slug = preg_replace('/[^a-zA-Z0-9]/','',$slug);
$slug = trim($slug);
//if someone's title was something like "I was with you"
if ($slug == ''){
    $slug = $title;
}
$slug = preg_replace('/\s+/', '-', $slug);
$slug = strtolower($slug);[/code]

This doesn't work because \b matches the whitespace on either side of the stopword, so they get removed as well. So the slug became "profilelikefun".

Another question: does trim() remove one whitespace character from either end, or all of them?
@narutodude000 (author, Jul 13, 2010): Never mind, I got it.
@NogDog (Jul 13, 2010): [QUOTE]Never mind, I got it.[/QUOTE]

Just to clarify for anyone else who comes along to read this thread, the "\b" assertion does [I]not[/I] match the white-space character (or other characters that it considers as a boundary indicator). It [I]asserts[/I] that there is such a boundary condition there but does not include anything in the match. You can think of it as matching the zero-length point between the matched character and the white-space, punctuation, or start/end of string if that makes it easier to deal with.
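A quick way to check this is the sketch below, which uses a shortened stopword list and variable names of my own. It suggests that the spaces in the earlier attempt were removed by the [^a-zA-Z0-9] character class, not by \b. Adding \s to that class is just one possible fix. It also shows that trim() strips all leading and trailing whitespace, not just one character.

```php
<?php
$title = 'the profile I like is fun';
$regex = '/\b(i|is|the)\b/i'; // shortened list, for illustration only

// \b consumes nothing, so only the words disappear and the spaces survive.
$stripped = preg_replace($regex, '', $title);
var_dump($stripped); // string(19) " profile  like  fun"

// A space is not in a-zA-Z0-9, so this class deletes every space as well.
$noSpaces = preg_replace('/[^a-zA-Z0-9]/', '', $stripped);
var_dump($noSpaces); // string(14) "profilelikefun"

// Allowing \s in the class keeps the word separators.
$kept = trim(preg_replace('/[^a-zA-Z0-9\s]/', '', $stripped));
$slug = strtolower(preg_replace('/\s+/', '-', $kept));
var_dump($slug); // string(16) "profile-like-fun"

// trim() removes every leading and trailing whitespace character.
var_dump(trim("  \t x \n")); // string(1) "x"
```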