Character Encodings

Encoding is a subject you do not take into consideration until you run into problems with your character sets. I had been programming for about 8 years before I really had to deal with encodings, when converting the www.pioneer.eu website from ISO-8859-1 to UTF-8.

I am not going to explain what encodings are and how to use them properly. There are plenty of resources on the web for that. I can recommend the introduction by Joel Spolsky: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

In western countries, browsers, web servers, databases and code engines (like the Java Virtual Machine) often use ISO-8859-1 as the default encoding. If you have even the slightest notion that your website might have higher requirements, now or in the future, you should use the UTF-8 encoding. Each component has to be configured explicitly to use UTF-8. However, this is easy to do compared with converting the encoding later.
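To make the difference concrete, here is a minimal sketch in Python. The sample text and the helper name latin1_to_utf8 are made up for illustration and have nothing to do with the actual Pioneer conversion. It shows what "higher requirements" can mean (the euro sign does not exist in ISO-8859-1), what a conversion does at the byte level, and what goes wrong when one component still reads UTF-8 data as ISO-8859-1.

```python
# Sample text as a component defaulting to ISO-8859-1 would store it.
original = "Café Zürich"
latin1_bytes = original.encode("iso-8859-1")


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    return data.decode("iso-8859-1").encode("utf-8")


utf8_bytes = latin1_to_utf8(latin1_bytes)

# Same text, different bytes: é and ü take one byte each in ISO-8859-1
# and two bytes each in UTF-8.
print(len(latin1_bytes), len(utf8_bytes))  # 11 13

# A component that was not reconfigured and still assumes ISO-8859-1
# turns every multi-byte UTF-8 sequence into two wrong characters.
mojibake = utf8_bytes.decode("iso-8859-1")
print(mojibake)  # CafÃ© ZÃ¼rich

# "Higher requirements": the euro sign is not part of ISO-8859-1 at all.
try:
    "€".encode("iso-8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
print(euro_fits_latin1)  # False
```

The garbled output in the middle is the typical symptom of a half-finished conversion: the data is already UTF-8, but one link in the chain (browser, server, database or runtime) still decodes it with the old default.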

Interested in the possibilities of UTF-8? Try the UTF-8 sampler page.
