
When to use RegExp, and when not to

[RANT]
OK, I think we are all guilty of abusing regular expressions from time to time purely because they are quick to write, but I think it's important to take a few examples I have seen in use and point out the correct way to achieve the same effect in each situation.
[B]/^[a-z]*$/[/B]
Yes, there is a faster way to check that a string contains only the letters a-z, although it takes longer to type:

[code=php](strspn($STR, "abcdefghijklmnopqrstuvwxyz") == strlen($STR))[/code]

What's more, this check gets faster relative to the regular expression as the list of valid characters grows:

[code=php](strspn($STR, "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") == strlen($STR))[/code]

runs significantly quicker than the equivalent regular expression.
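If typing out the whole character list is the sticking point, the mask can be built once and reused. A minimal sketch, assuming PHP 7 or later; the helper name only_chars and the variable $alnum are made up for illustration and are not from any library:

```php
<?php
// Hypothetical helper (not part of PHP): true when $str consists only of
// bytes found in $allowed. An empty string passes, the same as /^[a-z]*$/.
function only_chars(string $str, string $allowed): bool
{
    return strspn($str, $allowed) === strlen($str);
}

// Build the a-z, A-Z, 0-9 mask once instead of typing every character.
$alnum = implode('', array_merge(range('a', 'z'), range('A', 'Z'), range(0, 9)));

var_dump(only_chars('abc123', $alnum));
var_dump(only_chars('abc-123', $alnum));
```

Building the mask with range() also avoids slips such as a doubled or missing character in a hand-typed list.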
[b]/^[^@]{1,63}@[^@]{1,255}$/[/b]
Very nice: test the length of two strings using a regular expression. Never mind that the strlen function was created for this very thing; we can write a regular expression for it, so it will all be fine. Quite why the writer never thought of splitting the string on the @ symbol and then making three length checks (one on the resulting array, one on each half), I do not know. It sounds like a longer check, but it STILL works out to be more efficient.
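The split-and-measure approach could look something like this. It is only a sketch: the function name plausible_address_lengths is invented here, and it checks lengths in bytes, which is roughly the rule the pattern above enforces (exactly one @, 1 to 63 bytes before it, 1 to 255 after it):

```php
<?php
// Hypothetical helper (name invented for this example): length checks
// only, it does not validate the address in any other way.
function plausible_address_lengths(string $address): bool
{
    $parts = explode('@', $address);
    if (count($parts) !== 2) {
        return false; // no "@" at all, or more than one
    }
    $local  = strlen($parts[0]);
    $domain = strlen($parts[1]);
    return $local >= 1 && $local <= 63
        && $domain >= 1 && $domain <= 255;
}

var_dump(plausible_address_lengths('user@example.com'));
var_dump(plausible_address_lengths('a@b@c'));
```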
[b]preg_replace("/ \/>/", ">", $str)[/b]
Sorry, but this is just silly. To remove the " /", just use a plain str_replace call.
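For comparison, here is the literal replacement next to a regular-expression version. The sample markup in $str is made up for the example, and the slash in the pattern is escaped so it is not read as the closing delimiter:

```php
<?php
$str = '<br /><img src="a.png" />';

// Literal search and replace; no pattern has to be compiled.
$plain = str_replace(' />', '>', $str);

// The regular-expression version of the same replacement.
$regex = preg_replace('/ \/>/', '>', $str);

var_dump($plain === $regex);
echo $plain, "\n";
```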
[/RANT]
There are tons more examples of this sort of abuse (being taught as if it should be done) online, but these are the worst offenders I could find (two of which I only found this morning; congratulations, roScripts).


6 Comment(s)

@MrCoder (Nov 05.2007): Nice rant, but are they really faster, and if so, by how much?
@scragar (author, Nov 05.2007): My results for strspn+strlen vs preg_match, each doing 1000 tests on strings that increase in length every time:
STRSPN:
time taken: 0.021502017974854

PREG:
time taken: 0.026623010635376

That's about a fifth faster, give or take, with a larger difference on long strings. I can provide more detail on the actual test if needed (I have a long page of results listing the string checked and the time taken for each test).
@TJ111 (Nov 05.2007): For kicks I profiled two scripts. Here are the results:
[code=php]
<?php
$STR = "test";
print (strspn($STR,"abcdefghijklmnopqrstuvwxyz") == strlen($STR)) ? "success" : "fail";

//this script took 0.029 ms to execute
?>[/code]


[code=php]
<?php
$str = "test";
print (preg_match('/^[a-z]*$/', $str)) ? "success" : "fail";

//this script took 0.368 ms to execute
//that's more than 10x slower
?>[/code]


I was rather surprised at how big a difference there was between such small scripts. I know the regexp engine was slower, but not by that much. Personally, I really only use regular expressions for validation or for changing complicated strings, otherwise I just use string functions.
@scragar (author, Nov 05.2007): Keep in mind that as the string gets larger, preg really starts to show its stuff. However, it was never able to catch up completely in my tests; it was always slightly behind, even with incredibly large strings (in excess of 5,000 characters).
@NogDog (Nov 05.2007):
[code=php]
if(ctype_alpha($string)) {
[/code]

...is functionally equivalent to...
[code=php]
if(preg_match('/^[a-z]+$/i', $string)) {
[/code]

...and is both shorter to type and faster to execute.
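One caveat, as far as I can tell: the two agree on plain letter strings, but without the D modifier `$` also matches just before a trailing newline, so the regex accepts "abc\n" while ctype_alpha rejects it. A quick sketch to see the edge cases side by side (the input list is made up for illustration):

```php
<?php
$inputs = ['abc', 'ABC', '', "abc\n", 'abc1'];

foreach ($inputs as $s) {
    printf(
        "%-8s ctype=%d regex=%d regex_D=%d\n",
        json_encode($s),                 // shows the newline as \n
        ctype_alpha($s),                 // letters only, false for ''
        preg_match('/^[a-z]+$/i', $s),   // $ may match before a final newline
        preg_match('/^[a-z]+$/iD', $s)   // D: $ matches only at the very end
    );
}
```

If trailing newlines matter for the input being validated, the D modifier (or ctype_alpha) closes that gap.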
@bokeh (Nov 10.2007): Regex is faster by miles.
[code=php]<?php

microtime(true); # initialize

$string='abcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyz';

$start = microtime(true);
$regex = '/^[a-z]+$/i';
for($i=0; $i<1000; $i++)
{
(preg_match($regex, $string));
}
echo "Regex method: ".(microtime(true)-$start)."<br>\n";

$start = microtime(true);
for($i=0; $i<1000; $i++)
{
(strspn($string,"abcdefghijklmnopqrstuvwxyz") == strlen($string)) ;
}
echo "Scrager method: ".(microtime(true)-$start)."<br>\n";

?>[/code]
Result:[CODE]Regex method: 0.003291130065918
Scrager method: 0.012798070907593[/CODE]
But if you change the string so it has a bad character early on, scragar's method doesn't seem quite so lazy:[code=php]$string='a-bcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyzabcdefgijklmnopqrstuvwxyz';

[/code]
Result:[CODE]Regex method: 0.0022270679473877
Scrager method: 0.0011720657348633[/CODE]
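Since the numbers in this thread swing with the input, anyone wanting to repeat the comparison could use a small harness along these lines. It is only a sketch: time_it is a made-up helper, the closure call adds the same overhead to both methods, and the timings will differ by machine and PHP version, so no results are claimed here.

```php
<?php
// Hypothetical timing helper: runs $fn $iterations times and returns the
// elapsed wall-clock time in seconds.
function time_it(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$letters = 'abcdefghijklmnopqrstuvwxyz';
$cases = [
    'all valid'      => str_repeat($letters, 10),
    'bad char early' => 'a-' . str_repeat($letters, 10),
];

foreach ($cases as $label => $string) {
    $regex = time_it(function () use ($string) {
        return preg_match('/^[a-z]+$/', $string);
    });
    $strspn = time_it(function () use ($string, $letters) {
        return strspn($string, $letters) === strlen($string);
    });
    printf("%-15s regex: %.6f s  strspn: %.6f s\n", $label, $regex, $strspn);
}
```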