/    Sign up×
Community /Pin to ProfileBookmark

Database Character Code Issue

Hello everyone again.

With my last post a situation was solved with me extracting database info and printing it to an Excel file.

Now I’m trying to make sure my file records [B]match[/B] the queries. The queries for the “[U]event title[/U]” are being called from a form field in an html page.

So you enter in the name of the event title and it queries the database for records matching the title.

Take this event title for instance:

“Bringing In The New Year” @ Justin’s Restaurant & Lounge

It’s weird because I’m telling it to encode for UTF-8 and save to the database with:

[code=php]$event_title = utf8_encode($eventtitle);[/code]

But it’s still saving it to the database as ISO 8859-1 Encoding.

The query from the html form field has to have the [B]same character set[/B] as the input in the database [B]it’s looking for[/B] and this is where I’m running into the issue.

Any suggestions?

to post a comment
PHP

21 Comments(s)

Copy linkTweet thisAlerts:
@ChuckBauthorNov 21.2010 — This may help: http://www.charles-reace.com/blog/2008/10/03/utf8-in-php-and-mysql/[/QUOTE]

Thanks for your assistance.

I'm trying the suggestions in the article and I'm still running into issues.

[B]1st[/B], I dropped the table and created a new one with the suggested charset code:

[code=php]
CREATE TABLE example_table (id INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY, x1 varchar(30), x2 varchar(100)) [COLOR="Blue"]ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci[/COLOR]
[/code]


The mysql table shows that the fields are set to "utf8_unicode_ci"...

2nd, I set one of the form field attributes[I] (where someone enters their info)[/I] to [COLOR="Blue"]accept-charset="UTF-8"[/COLOR]


3rd, before the page enters the info into the database, mysql_query encodes the input again with:

[code=php]
$DB="name";
$linkID = @mysql_connect("localhost", "name", "password");
mysql_select_db($DB, $linkID);
mysql_query("SET NAMES 'utf8'");


$sql="INSERT INTO example_table (x1, x2, x3, x4) VALUES('$value1','$value2','$value3','$value4')";
mysql_query($sql, $linkID);
mysql_close($linkID);
[/code]


So when I observe the info in the database, it shouldn't have ISO-8859-1 characters.

But it still does.

I can't type those characters here to show you what I mean because this forum converts them.


it has " [COLOR="Blue"]ampersand#8220;[/COLOR] " ...etc inside of the input...frustrating..
Copy linkTweet thisAlerts:
@NogDogNov 22.2010 — Sounds as if something somewhere in the chain is applying htmlentities() to the inputs, maybe?
Copy linkTweet thisAlerts:
@ChuckBauthorNov 22.2010 — Sounds as if something somewhere in the chain is applying htmlentities() to the inputs, maybe?[/QUOTE]

lol....? you've assisted me several times since I've joined the 'webdeveloper' community...and it's pretty discouraging when someone of your experience says 'maybe?'....lol...

this means I'm going to either have to spend hours on trying to solve this or I just have to find another solution...

it's tough for me to see where the problem lies...the site is in Wordpress and I've checked the header, character coded the form field, character coded the sql table code..etc...I have 'utf-8' everywhere and still can't see why I get ISO whatever..
Copy linkTweet thisAlerts:
@NogDogNov 22.2010 — Well, if you are actually seeing...
[code=php]
“
[/code]
...in the data, then that's not indicating an incorrect character encoding, but that something has transformed it into a HTML character entity string.

If you are seeing the actual left-double-quote character, it sounds more like the crap you get if you copy-and-paste from a MS Word document, which is a pain in the butt and which I deal with this way: http://www.charles-reace.com/blog/2008/10/15/filtering-ms-word-text/
Copy linkTweet thisAlerts:
@ChuckBauthorNov 22.2010 — Well, if you are actually seeing...
[code=php]
“
[/code]
...in the data, then that's not indicating an incorrect character encoding, but that something has transformed it into a HTML character entity string.
[/QUOTE]



That's exactly what I'm seeing throughout the entire database input...

The problem is that if you enter a search in a form search field, the output [I](after you hit submit)[/I] to query the rows in the database isn't transformed to the [B]same[/B] HTML character entity string as was inserted into the database.

If it was, then it would find a match....

Here's the thing, this variable is what's checking for matching rows

[code=php]$eventtitle = htmlentities(sanitize($event));[/code]

with the MySQL line:

[code=php]$sqlstart="SELECT * FROM user_info WHERE nameofevent='".$eventtitle."' ORDER BY id";[/code]

If I can get $eventitle to match what's in the database (which is a string with html entities just like you said above), then it will work.

When I queried for a title that didn't have quotes in it or any other crazy characters, it pulled the rows and worked. Like for instance [CODE][I]New World Order[/I][/CODE], it would find New World Order with no problem and pull all of the info from that row.

This lets me know that there's nothing wrong with the sql query itself. It's just that $eventtitle isn't matching what's in the database when I have quotes and crazy character's involved.
Copy linkTweet thisAlerts:
@criterion9Nov 23.2010 — I'm curious about this part:

htmlentities(sanitize($event));
[/quote]

Can we see the code that puts the content into the DB as well as what the line above is doing?
Copy linkTweet thisAlerts:
@ChuckBauthorNov 30.2010 — I'm curious about this part:

Can we see the code that puts the content into the DB as well as what the line above is doing?[/QUOTE]


Appreciate the response.

Here's the code that puts info into the DB:

[code=php]
<?php $post = $wp_query->post; ?>

<?php

if (isset($_POST['eventfield'])){


$firstname = sanitize($_POST['firstname']);
$lastname = sanitize($_POST['lastname']);
$phonenumber = sanitize($_POST['phonenumber']);
$emailaddress = sanitize($_POST['emailaddy']);
$event_title = sanitize(single_post_title('',FALSE));
$date=date("l dS F Y h:i:s A");
$ipaddr=$_SERVER['REMOTE_ADDR'];




//moved DB connection higher
$DB="dbname";
$linkID = @mysql_connect("host info", "dbname", "password");
mysql_select_db($DB, $linkID);


$sql="INSERT INTO user_info (firstname, lastname, phonenumber, user_email, nameofevent, visit_time, ipaddr) VALUES('$firstname','$lastname','$phonenumber','$emailaddress','$event_title','$date','$ipaddr')";
mysql_query($sql, $linkID);
mysql_close($linkID);

}


?>
[/code]


As far as the code with the line in question, when I want to look for a specific title in the DB [I](that may include quotations or any other non alpha character)[/I], you enter that title into a form field. When you hit submit, that form field calls a function called [I]runCompanyData ($event)[/I], where $event is the title you are searching for.

Here's the function info:

[code=php]
<?php
function runCompanyData($event) {

require ('addtl_validation.php');


define(db_host, "hostinfo");
define(db_user, "userinfo");
define(db_pass, "password");
define(db_link, mysql_connect(db_host,db_user,db_pass));



define(db_name, "dbname");

mysql_select_db(db_name);


$eventtitle = htmlentities(sanitize($event));


$sqlstart="SELECT * FROM user_info WHERE nameofevent='".$eventtitle."' ORDER BY id";
$export = mysql_query($sqlstart);
$fields = mysql_num_fields($export);

for ($i = 0; $i < $fields; $i++) {
$header .= strtolower(mysql_field_name($export, $i)) . "t";
}

while($row = mysql_fetch_row($export)) {
$line = '';
foreach($row as $value) {

if ((!isset($value)) OR ($value == "")) {
$value = "t";
} else {
$value = str_replace('"', '""', $value);
$value = '"' . $value . '"' . "t";
}
$line .= $value;
}
$data .= trim($line)."n";
}
$data = str_replace("r","",$data);

if ($data == "") {
$data = "n(0) Records Found!n";

}

header("Content-type: application/x-msdownload");
header("Content-Disposition: attachment; filename=extraction.xls");
header("Pragma: no-cache");
header("Expires: 0");
print "$headern$data";

}
?>
[/code]


The sanitize function is:

[code=php]
function sanitize($value){
mysql_real_escape_string($value);
return $value;
}
[/code]


It works when the only non alpha character I include in the title is the '@' symbol.
Copy linkTweet thisAlerts:
@criterion9Dec 01.2010 — Are you sure you need to use htmlentities here?

$eventtitle = htmlentities(sanitize($event));
[/quote]
Copy linkTweet thisAlerts:
@ChuckBauthorDec 01.2010 — Are you sure you need to use htmlentities here?[/QUOTE]


The reason why I chose to use htmlentities here is because if you read the very last post from NogDog, he suggest that if I'm seeing

[code=php]&#8220;[/code]

then what the database is doing is creating an HTML entity in the "title" inserted into the DB.

So I figured, if I surround the title search [code=php]$event[/code], with htmlentities($event), then it will match what's being put into the database and give me the results.
Copy linkTweet thisAlerts:
@criterion9Dec 01.2010 — Those are MS word (and similar) specific characters (slanted quotes). You should be altering those to &quote; instead.
Copy linkTweet thisAlerts:
@ChuckBauthorDec 01.2010 — Those are MS word (and similar) specific characters (slanted quotes). You should be altering those to &quote; instead.[/QUOTE]

How would I do that for the input being inserted into the database?

The answer to that would also let me know how I need to modify the argument to query the DB.
Copy linkTweet thisAlerts:
@criterion9Dec 01.2010 — What WYSIWYG embedded editor are you using (ckeditor, telerik, etc)?
Copy linkTweet thisAlerts:
@ChuckBauthorDec 02.2010 — What WYSIWYG embedded editor are you using (ckeditor, telerik, etc)?[/QUOTE]


I'm using Dreamweaver...but only to manage written code....
Copy linkTweet thisAlerts:
@criterion9Dec 02.2010 — I meant what embedded editor are you using in the cms?
Copy linkTweet thisAlerts:
@ChuckBauthorDec 03.2010 — I meant what embedded editor are you using in the cms?[/QUOTE]

Please excuse my ignorance. To my knowledge I don't think I am using an embedded editor. The site is being created in Wordpress. How would I find out what embedded editor I'm using?
Copy linkTweet thisAlerts:
@ChuckBauthorDec 03.2010 — http://dianev.com/blog/2007/12/08/microsoft-word-making-weird-characters-in-your-blog-or-email/

http://www.velvetblues.com/web-development-blog/wordpress-problems-pasted-ms-word-content/

http://wordpress.org/support/topic/how-can-i-get-microsoft-word-text-into-my-rss-feeds

http://australiansearchengine.wordpress.com/2009/03/31/convert-microsoft-word-characters/

If I get time I'll try to pass on the MS character replacement php function I borrowed from Nogdog or Mindzai (though I can't remember which at the moment).[/QUOTE]


Very much appreciated...in the meantime, since it works w/o quotes, hashes..etc, I'll just make sure none of these are used...
Copy linkTweet thisAlerts:
@NogDogDec 03.2010 — ...

If I get time I'll try to pass on the MS character replacement php function I borrowed from Nogdog or Mindzai (though I can't remember which at the moment).[/QUOTE]


Do you mean this one? ?

http://www.charles-reace.com/blog/2008/10/15/filtering-ms-word-text/
Copy linkTweet thisAlerts:
@ChuckBauthorDec 03.2010 — Do you mean this one? ?

http://www.charles-reace.com/blog/2008/10/15/filtering-ms-word-text/[/QUOTE]


Appreciate it NogDog,

I can work with this and see what results I get.
Copy linkTweet thisAlerts:
@criterion9Dec 04.2010 — Do you mean this one? ?

http://www.charles-reace.com/blog/2008/10/15/filtering-ms-word-text/[/QUOTE]


Yep that is the one I adapted from. ?
×

Success!

Help @ChuckB spread the word by sharing this article on Twitter...

Tweet This
Sign in
Forgot password?
Sign in with TwitchSign in with GithubCreate Account
about: ({
version: 0.1.9 BETA 5.7,
whats_new: community page,
up_next: more Davinci•003 tasks,
coming_soon: events calendar,
social: @webDeveloperHQ
});

legal: ({
terms: of use,
privacy: policy
});
changelog: (
version: 0.1.9,
notes: added community page

version: 0.1.8,
notes: added Davinci•003

version: 0.1.7,
notes: upvote answers to bounties

version: 0.1.6,
notes: article editor refresh
)...
recent_tips: (
tipper: @AriseFacilitySolutions09,
tipped: article
amount: 1000 SATS,

tipper: @Yussuf4331,
tipped: article
amount: 1000 SATS,

tipper: @darkwebsites540,
tipped: article
amount: 10 SATS,
)...