0

I have saved strings in a MySQL database that contain symbol –. I would like to convert that to the correct symbol –. Do you have any idea what would be the easiest way to do it?

Tuomo B
  • 127
  • 2
  • 7
  • what language are you using to send it to mySql? is it Java..? and what is your mysql envoding is set to? refer this guide: http://stackoverflow.com/questions/10113355/correct-php-method-to-store-special-chars-in-mysql-db – Nomesh DeSilva Aug 08 '15 at 05:30
  • PHP. I tried basic REPLACE() via PHP but so far it didn't work out. MySQL doesn't recognize the symbols. – Tuomo B Aug 08 '15 at 05:33
  • replace will not work, but you need to set the proper encoding for your response before it sends to the mysql. if you refer to the link I gave, it has several solutions you can try out with PHP and how set the collation in mysql. again: http://stackoverflow.com/questions/10113355/correct-php-method-to-store-special-chars-in-mysql-db – Nomesh DeSilva Aug 08 '15 at 05:40
  • This will replace the symbol to zero in query : `select replace('ds–','–','0');` . use replace in MYSQL query instead of PHP – Abhishek Ginani Aug 08 '15 at 05:50
  • Encode and decode your string? Insert the decoded string value into the DB – James111 Aug 08 '15 at 06:04

1 Answers1

0

Don't look for the "easiest way"; you are likely continue to have trouble.

– id Mojibake for . SELECT HEX(col), col FROM ..., you should get E28093. If so, then I'll explain one fix. If, instead, you get C3A2E282ACE2809C, then you have a "double encoding", and it takes a different fix.

Answer these questions:

  • Were the bytes you in the client are correctly encoded in utf8? They probably were.
  • Did you connected with the default SET NAMES latin1 (or set_charset('latin1') or ...)? (It needed to be utf8; but simply changing it won't suffice.)
  • Is the column in the table was declared CHARACTER SET latin1. (Or possibly it was inherited from the table/database.) It should have been utf8. There are multiple ways of changing it; using the wrong method will only make things worse.

Plan A: Start over with the correct connection parameter and table definition.

Plan B: Answer the above questions, and I will help you fix the mess.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Actually I didn't get either E28093 or C3A2E282ACE2809C. But here are the answers: 1) Yes, the bytes in the client side are UTF-8. 2) Yes, I connected with the default. 3) The table has char set of latin1. – Tuomo B Aug 10 '15 at 09:34
  • What hex did you get? – Rick James Aug 10 '15 at 12:14
  • The symbol – is actually in a middle of a sentence so it's difficult to say how it was converted. What I did now was that I fixed the problem manually to the corrupted rows and fixed the code so that in the future the `–` symbol will get converted to `–` before inserting to the database. Thanks for help! – Tuomo B Aug 11 '15 at 08:35