I’m trying to write a script that takes a string and replaces some of the more common special characters that show up when someone copies and pastes text, such as curly quotes and em dashes.
Please note, I am aware of htmlspecialchars(), and this will not work for me. I do not want all HTML special characters to be replaced, just select characters. Therefore, I need a way to match them specifically.
Observe this simple test. The Unicode value for the left curly quote is 8220, and the PHP regex syntax for matching Unicode values is supposed to be \x{...}:
[code=php]
$str = "Here’s a quote: “My quote”";
echo (preg_match('/\x{8220}/u', $str)) ? 'match' : 'no match';
[/code]
With this simple test, I get “no match.” Am I misunderstanding something about how the literal character can be matched by its Unicode value? (I know I could simply use /“/, but I’d prefer a method that doesn’t require special characters in the code itself.)
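One possible explanation, offered as a sketch rather than a confirmed diagnosis: in PCRE the \x{...} escape takes hexadecimal digits, so decimal 8220 would need to be written as 201C. The snippet below assumes PHP 7 or later, because it uses "\u{...}" string escapes to keep special characters out of the source. The names $pattern and $clean are only illustrative.

[code=php]
<?php
// Test string built with code point escapes (PHP 7+), so the source
// file itself contains no curly quotes: U+201C and U+201D.
$str = "Here's a quote: \u{201C}My quote\u{201D}";

// \x{...} expects hex digits, so convert decimal 8220 to hex first.
$pattern = '/\x{' . dechex(8220) . '}/u';   // builds /\x{201c}/u

echo preg_match($pattern, $str) ? 'match' : 'no match';   // prints "match"

// The same escapes work in a character class for the replacement itself.
$clean = preg_replace('/[\x{201C}\x{201D}]/u', '"', $str);
echo "\n" . $clean;   // prints: Here's a quote: "My quote"
[/code]

The single-quoted pattern matters here: it passes the backslash through to the regex engine unchanged, and the /u modifier is what lets \x{...} accept values above FF.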