Software internationalization
Wednesday Apr 23, 2008
Encoding URIs in JavaScript
The
The problem with this is that the escape mechanism is broken if you want to use UTF-8 as your document encoding. If you dynamically compose URL strings with parameters, any parameters that contain non-ASCII characters will not be escaped correctly. Instead of
Fortunately, JavaScript has resolved the problem, but the solution means you'll have to use another function. The
What's this mean for you? Maybe nothing if you're hopelessly attached to ISO-8859-1. However, if you're trying to reach a global market with your product, chances are very good that you've decided to use UTF-8 for your character set encoding. That's an excellent choice, but you'll have to manage the conversion points. In a nutshell, that simply means that you'll need to use UTF-8 from front to back consistently.
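To make the difference concrete, here is a minimal sketch that compares the legacy escape function with the standard encodeURIComponent function, which percent-encodes the UTF-8 bytes of each character. The helper name buildSearchUrl, the example.com address, and the parameter names are all hypothetical, chosen only for illustration.

```javascript
// Non-ASCII sample values, written with \u escapes so the example does not
// depend on the encoding of the source file itself.
const name = "Jos\u00e9";      // "José"
const city = "\u6771\u4eac";   // "東京" (Tokyo)

// Legacy behavior: escape() emits a single Latin-1 style byte for é and a
// non-standard %uXXXX sequence for characters above U+00FF.
const legacyName = escape(name);   // "Jos%E9"
const legacyCity = escape(city);   // "%u6771%u4EAC"

// UTF-8 aware behavior: every character becomes its percent-encoded UTF-8 bytes.
const utf8Name = encodeURIComponent(name);   // "Jos%C3%A9"
const utf8City = encodeURIComponent(city);   // "%E6%9D%B1%E4%BA%AC"

// Hypothetical helper: builds a query string from an object of parameters,
// encoding each key and value separately so "&" and "=" inside a value
// cannot break the URI.
function buildSearchUrl(base, params) {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(params[key]))
    .join("&");
  return base + "?" + query;
}

const url = buildSearchUrl("http://example.com/search", { name: name, city: city });
// "http://example.com/search?name=Jos%C3%A9&city=%E6%9D%B1%E4%BA%AC"
```

Use encodeURIComponent for individual parameter names and values, as above. encodeURI is meant for a complete URI and leaves reserved characters such as "&", "=", and "?" untouched, so it will not protect a parameter value that happens to contain them.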
Part of managing those conversion points is consistently providing well-formed URIs to your application server. If you use JavaScript to manipulate data or to create dynamic URIs in your application, make sure you toss aside that deprecated
Posted at 12:01AM Apr 23, 2008 by John O'Conner in JavaScript
Saturday Apr 12, 2008
Migrating from Latin-1 to UTF-8
Here's an example: a simple web application that stores names and addresses in a database. Chances are, if you haven't done anything explicit to change this, the web page itself will have no charset encoding associated with it. And neither will your application server. And neither will your database. And without explicit settings, many applications use Latin-1 as the default character set. So, you'll be able to enter, store, retrieve, and display common Western European names, but you won't be able to handle Russian or Japanese or Chinese or, well, you get the idea.
So let's imagine you decide to convert from Latin-1 to UTF-8 so that you open up your application to the rest of the world's languages and scripts. What does that mean? What must you do? How do you start?
Here are some of the charset conversion points you'll need to resolve as you migrate through this problem:
To help you get started, I've discussed the first 4 conversion points in the article Character Conversions from Browser to Database. Go ahead, take a look. But come back here to let me know what you think. I'll talk about some of the JavaScript issues in an upcoming blog.
Posted at 10:26PM Apr 12, 2008 by John O'Conner in Unicode
Thursday Apr 10, 2008
Updating timezone data in older Java VMs
Wouldn't it be nice if you could install the latest version of the JDK or JRE in your production environment? But maybe you just can't do that because of your company policy, testing cycles, or adoption process. Unfortunately, whether you can update or not, things change. Some things affect your application whether you want them to or not. For example, as timezone data changes, software must change to keep pace. Wouldn't it be great if you could update just the timezone data in your existing VM? That might be easier to get approved than replacing your complete JDK/JRE.
The Timezone Updater Tool allows you to update timezone data in older JDK/JREs without updating the entire JDK/JRE. You can learn more about this product from the online article Timezone Updater Tool.
Posted at 09:52PM Apr 10, 2008 by John O'Conner in Java
Tuesday Apr 08, 2008
Basic Definitions for a Unicode Discussion
If you want to communicate, defining confusing terms right up front is always a good first step. So I'll try to define some Unicode terms:
You can always see more terms by visiting the Unicode Glossary.
Posted at 11:40PM Apr 08, 2008 by John O'Conner in Unicode
Unicode 5.1 released this week
Posted at 12:33AM Apr 08, 2008 by John O'Conner in Unicode