
preg_replace("everything except ….")

Hi,

To create URLs, I want to remove everything from the segment (which is stored in the database) except alphanumeric characters, underscores and dashes.

I have a regex that matches the allowed characters (/^([-a-z0-9_ -])+$/i), but I think I need a regex that matches all the other characters ... or is there another solution?

Thanks!

Christophe


@Jona Sep 27.2010 — Hi,

You can use the caret (^) to negate a set (bracketed expression). The pattern below matches a string that, from beginning to end, contains no letters, numbers, underscores, space characters or dashes. (Note that I would recommend using \s rather than a space character to account for different charsets.)

[CODE]/^([^a-z0-9_ -])+$/i[/CODE]
@Christophe27 (author) Sep 27.2010 — Hi,

Hmm, not sure if I understand you ...

So you would use the following to turn "image()" into "image"?

[CODE]$clean_file_uri_segment = preg_replace("/^([^a-z0-9_ -])+$/i", $file_uri_segment);[/CODE]

Or am I missing something? By the way, you don't have to exclude spaces: the file_uri_segment is already free of spaces.

Christophe
@Jona Sep 27.2010 — If you simply want to remove non-word characters from a string, it would be easiest to use something like this.

[CODE]$clean_file_uri_segment = preg_replace('/\W*/', '', $file_uri_segment);[/CODE]
@Christophe27 (author) Sep 30.2010 — Your solution works when converting "image(1)" to "image1", but when I test and upload an image with a filename like "Capture(1)%£+;=mdùç&é'(§è!çà)´][.jpg" it makes the title "Capture(1)%£+;=mdùç&amp;é'(§è!çÃ*)´][.jpg" ....

Any solution?

Thanks for your help!

Christophe
@NogDog Sep 30.2010 — Use Jona's original idea, but without anchoring it to the start/end of the string:
[code=php]
$new = preg_replace('/[^a-z0-9_-]+/i', '', $string);
[/code]
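To illustrate why the anchors matter, here is a minimal sketch comparing the two patterns from this thread on the same input. The variable names are only for illustration.

```php
<?php
// Anchored, negated pattern from earlier in the thread: it only matches when
// the WHOLE string consists of unwanted characters.
$anchored = '/^([^a-z0-9_ -])+$/i';

// Unanchored version: matches every run of unwanted characters, wherever it occurs.
$unanchored = '/[^a-z0-9_-]+/i';

$input = 'image(1)';

// No match (the string contains letters and a digit), so nothing is replaced.
$anchoredResult = preg_replace($anchored, '', $input);     // 'image(1)'

// The "(" and ")" are matched as separate runs and removed.
$unanchoredResult = preg_replace($unanchored, '', $input); // 'image1'
```

With `^` and `$` in place, the pattern is a test of the entire string rather than a search for individual characters, which is why it leaves mixed input untouched.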
@Christophe27 (author) Sep 30.2010 — Your solution works, but when somebody includes a character like " or ' in the filename, the script fails ...

Also, the regex strips the . from the extension, so the new filename becomes imagejpg instead of image.jpg
@NogDog Sep 30.2010 —
[quote]Your solution works, but when somebody includes a character like " or ' in the filename, the script fails ...[/quote]

Define "fails".

[quote]Also, the regex strips the . from the extension, so the new filename becomes imagejpg instead of image.jpg[/quote]

So add the "." to the character class in the regexp.
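As a rough sketch of that suggestion, the pattern below is the unanchored one from this thread with a "." added to the allowed set. The sample file names are made up for illustration.

```php
<?php
// "." added to the set of allowed characters. Inside a character class the
// dot is literal, so it needs no escaping. The "-" stays last so it is not
// read as a range.
$pattern = '/[^a-z0-9_.-]+/i';

// The extension separator now survives.
$clean = preg_replace($pattern, '', 'image(1).jpg'); // 'image1.jpg'

// Quotes in the name are removed like any other disallowed character.
$cleanQuoted = preg_replace($pattern, '', "my \"best\" 'photo'.jpg"); // 'mybestphoto.jpg'
```

Note that this allows dots anywhere in the name, not only before the extension, so an input such as "()..jpg" would come out as "..jpg".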
@Christophe27 (author) Sep 30.2010 — I have tested some more and apparently it works. I must have made a mistake before ...

But now there is a new problem. What if the filename is something like ().jpg? During upload, preg_replace() will delete all the strange characters, including (), so the filename will become something like .jpg

Perhaps I have to generate an mt_rand() integer to replace the empty filename ... or, maybe better, replace all the strange characters with an underscore in the first place and always check whether the file already exists in the user's dir.

What do you suggest? The upload function will normally be heavily used on this kind of website.
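The second idea (underscores plus an existence check) could look roughly like the sketch below. `unique_file_name()` is a hypothetical helper, not something from this thread, and the numeric suffix scheme is an assumption.

```php
<?php
/**
 * Hypothetical helper: returns $name if no file of that name exists in $dir,
 * otherwise appends _1, _2, ... before the extension until the name is free.
 */
function unique_file_name(string $dir, string $name): string
{
    $info = pathinfo($name);
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';

    $candidate = $name;
    $i = 1;
    while (file_exists("$dir/$candidate")) {
        $candidate = $info['filename'] . '_' . $i . $ext;
        $i++;
    }
    return $candidate;
}

// Replace each run of disallowed characters with a single underscore
// instead of deleting it, so "().jpg" never ends up with an empty name.
$underscored = preg_replace('/[^a-z0-9_.-]+/i', '_', '().jpg'); // '_.jpg'
```

On a heavily used upload form, two requests could pass the `file_exists()` check at the same moment, so this is only a starting point and not safe against concurrent uploads.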
@NogDog Sep 30.2010 — I might simply generate a [url=http://php.net/uniqid]uniqid[/url]() to use as the file name for every file. You could use [url=http://php.net/pathinfo]pathinfo[/url]() to get the file suffix (if there is one) and append that to the generated file name. Then just use that name to store the file and save the name in the DB. (You could also store the original file name in another DB column if you think there's any possibility you would need to know it at some point.)
[code=php]
$fileName = uniqid('file_', true); // maybe use user's ID instead of hard-coded prefix?
// Read the suffix from the client-supplied name; tmp_name is a temporary path without the original extension
$info = pathinfo($_FILES['file']['name']);
$fileName .= (!empty($info['extension'])) ? "." . $info['extension'] : '';
move_uploaded_file($_FILES['file']['tmp_name'], "$uploadDir/$fileName");
[/code]
@Christophe27 (author) Oct 01.2010 — Yes, I was thinking the same thing (simply generate a uniqid for every file), but I would like to keep the original filename (without special characters, of course) for SEO. A filename like "bananas.jpg" is better than "dks5s2eki.jpg".

My idea is to strip off the extension, remove the special characters from the filename (without extension) with a regex, and then check whether the result is empty. If it is, generate a uniqid() as the filename, and then put it all back together.

But is there a function to strip the extension off a string? I know there are functions like explode(), but what if there are multiple dots in the filename?

[CODE]
$string = 'great.photo().jpg';
$piece = explode('.', $string);
// I would have three strings instead of two (filename and extension)
[/CODE]


So is there a function to split a string at the last dot so I would have ...

[CODE]
$piece['0'] // great.photo()
$piece['1'] // jpg

$clean_title = preg_replace('/[^a-z0-9_-]+/i', '', $piece['0']);
// No quotes around $clean_title: in single quotes the variable would not be interpolated
$new_filename = $clean_title . '.' . $piece['1'];
[/CODE]


If I would have this, my problem will be solved :-)

Christophe
@NogDog Oct 01.2010 — See the pathinfo() function. If using PHP 5.2.0 or later, the "filename" element of the returned array will be the file name without the suffix.
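Putting the pieces of this thread together, a minimal sketch might look like the following. `clean_upload_name()` is a hypothetical helper name, and cleaning the extension with a stricter pattern is an added assumption rather than something discussed above.

```php
<?php
/**
 * Hypothetical helper: keeps a readable, URL-safe version of the original
 * file name and falls back to uniqid() when nothing usable is left.
 */
function clean_upload_name(string $originalName): string
{
    // pathinfo() splits at the LAST dot, so "great.photo().jpg" yields
    // filename "great.photo()" and extension "jpg".
    $info = pathinfo($originalName);

    // Strip everything except letters, digits, underscores and dashes.
    $base = preg_replace('/[^a-z0-9_-]+/i', '', $info['filename']);

    // A name such as "().jpg" leaves nothing behind, so generate one.
    if ($base === '') {
        $base = uniqid('file_');
    }

    // Assumption: allow only letters and digits in the extension.
    $ext = isset($info['extension'])
        ? preg_replace('/[^a-z0-9]+/i', '', $info['extension'])
        : '';

    return $ext !== '' ? "$base.$ext" : $base;
}

$example = clean_upload_name('great.photo().jpg'); // 'greatphoto.jpg'
```

Two different uploads can still clean down to the same name, so a check against existing files in the target directory would be needed before saving.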
@Christophe27 (author) Oct 01.2010 — That looks like pretty much exactly what I need. Thanks!