Charset/encoding issues & utf-8

Posted: Mon Nov 11, 2013 7:20 pm
by Albert Wiersch
If you are getting a message about the character encoding, you may want to convert your document to UTF-8. UTF-8 is the recommended encoding for Internet documents.

NOTE: Always keep backups. Converting to a different encoding can occasionally corrupt characters, and if it does, you will want to revert to a backup copy of the document.

For example, if you use this:

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
and you want to convert to UTF-8, then choose File->Save with Encoding. In the dialog that appears, choose Unicode (UTF-8) as the encoding. We recommend you leave the "Use encoding signature" option (the byte order mark, or BOM) unchecked. Click OK.
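
If you would rather script the re-save than use the editor dialog, here is a minimal Python sketch of the same idea. The function name convert_to_utf8 and the ".bak" backup suffix are illustrative choices, not part of any tool mentioned here. It assumes you know the file's real current encoding. A file labeled iso-8859-1 that actually contains Windows-1252 characters such as curly quotes would need "cp1252" instead.

```python
import shutil
from pathlib import Path


def convert_to_utf8(path, source_encoding="iso-8859-1"):
    """Re-save a text file as UTF-8 without a BOM, keeping a backup copy.

    Hypothetical helper for illustration. Returns the backup path.
    """
    src = Path(path)
    backup = src.with_name(src.name + ".bak")

    # Back up first, so a bad conversion can always be undone.
    shutil.copy2(src, backup)

    # Decode with the OLD encoding, then encode with the new one.
    # The plain "utf-8" codec writes no encoding signature (BOM).
    text = src.read_bytes().decode(source_encoding)
    src.write_bytes(text.encode("utf-8"))
    return backup
```

Calling convert_to_utf8("page.html") would rewrite page.html in place and leave the original bytes in page.html.bak.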

Now change any references to the old encoding in your document to "utf-8" (case doesn't matter), which is the encoding you just saved the document with. For example, the above becomes this:

<meta http-equiv="content-type" content="text/html; charset=utf-8">
Or replace the entire meta tag above with one like this (valid in HTML5 and later):

<meta charset="utf-8"/>
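
If you have many documents, updating the declared charset can also be scripted. The sketch below is a rough regular-expression approach, not a full HTML parser, and the names CHARSET_RE and declare_utf8 are made up for this example. It rewrites the value of any charset=... it finds (skipping accept-charset attributes), so review the result by hand.

```python
import re

# Matches charset=VALUE with an optional opening quote. The lookbehind keeps
# attributes such as accept-charset from being touched.
CHARSET_RE = re.compile(
    r"""(?<![\w-])(charset\s*=\s*["']?)[A-Za-z0-9_.:-]+""",
    re.IGNORECASE,
)


def declare_utf8(html):
    """Return html with every declared charset value replaced by utf-8."""
    return CHARSET_RE.sub(r"\g<1>utf-8", html)
```

For example, passing the iso-8859-1 meta tag shown earlier through declare_utf8 should produce the utf-8 version of the same tag.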
If you notice any corruption, such as changed or doubled characters, then something went wrong in the conversion. Revert to your backup, figure out what went wrong, and try again.
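
A quick way to look for the most common kind of corruption is sketched below. It is only a heuristic, and find_suspect_lines is a hypothetical helper name. It flags lines containing the Unicode replacement character, or pairs like "Ã©" that typically appear when UTF-8 bytes were decoded as ISO-8859-1 and saved again. It can miss other kinds of damage and may occasionally flag legitimate text.

```python
import re

# A UTF-8 lead byte (0xC2-0xDF) followed by a continuation byte (0x80-0xBF),
# both misread as ISO-8859-1 characters. This is the classic "Ã©" pattern.
MOJIBAKE_RE = re.compile("[\u00c2-\u00df][\u0080-\u00bf]")


def find_suspect_lines(text):
    """Return (line_number, line) pairs that look like conversion damage."""
    suspects = []
    for number, line in enumerate(text.split("\n"), start=1):
        # U+FFFD is what decoders insert for bytes they could not decode.
        if "\ufffd" in line or MOJIBAKE_RE.search(line):
            suspects.append((number, line))
    return suspects
```

Run it on the converted document's text (read as UTF-8) and compare any flagged lines against your backup.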