Software internationalization
Saturday Apr 12, 2008
Migrating from Latin-1 to UTF-8
Here's an example: a simple web application that stores names and addresses in a database. Chances are, unless you've explicitly changed it, the web page itself declares no charset. Neither does your application server. Neither does your database. Without explicit settings, many applications default to Latin-1 (ISO-8859-1). So you'll be able to enter, store, retrieve, and display common Western European names, but you won't be able to handle Russian or Japanese or Chinese or, well, you get the idea.
So let's imagine you decide to convert from Latin-1 to UTF-8 to open up your application to the rest of the world's languages and scripts. What does that mean? What must you do? How do you start? There are several charset conversion points you'll need to resolve as you work through this migration.
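To make the Latin-1 limitation concrete, here is a minimal Python sketch. The helper names `can_encode` and `misread_utf8_as_latin1` and the sample names are made up for illustration; they are not part of the application described in this post. The first helper checks whether a string survives encoding in a given charset. The second shows what can happen mid-migration if one layer writes UTF-8 bytes while another still reads them as Latin-1.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text is representable in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def misread_utf8_as_latin1(text: str) -> str:
    """Simulate a layer that decodes UTF-8 bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # Hypothetical sample data: Western European, Russian, and Japanese names.
    for name in ["José", "Иван", "山田"]:
        print(name,
              "latin-1:", can_encode(name, "latin-1"),
              "utf-8:", can_encode(name, "utf-8"))

    # A mismatch between layers garbles even Western European names.
    print(misread_utf8_as_latin1("José"))
```

Only the first name encodes in Latin-1, while all three encode in UTF-8. The last line prints a garbled form of "José", the kind of corruption that appears when any single conversion point is left on the old charset.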
To help you get started, I've discussed the first four conversion points in the article Character Conversions from Browser to Database. Go ahead, take a look, but come back here and let me know what you think. I'll talk about some of the JavaScript issues in an upcoming blog post.
Posted at 10:26PM Apr 12, 2008 by John O'Conner in Unicode