Best practice: Use UTF-8 as your source code encoding

Software engineering teams have become more distributed in the last few years. It’s not uncommon to have programmers in multiple countries: perhaps one team in Belarus, another in Japan, and another in the U.S. These teams most likely speak different languages, and their host systems most likely use different character encodings by default. […]
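
As a rough illustration of why a shared source encoding matters, here is a minimal Python sketch. The sample text and the choice of Windows-1252 as the "other" default are assumptions made for the example, not details from the post. It saves text as UTF-8 and then reads the same bytes back under a different default encoding.

```python
# Hypothetical sample: a source-code comment containing a non-ASCII character.
source_text = "# café"

# One developer saves the file as UTF-8.
utf8_bytes = source_text.encode("utf-8")

# A teammate's tools assume Windows-1252 (an assumed default for this sketch)
# and decode the very same bytes differently.
misread = utf8_bytes.decode("cp1252")

# Decoding with the agreed-upon encoding recovers the original text.
round_trip = utf8_bytes.decode("utf-8")

print(misread)
print(round_trip)
```

The bytes on disk never change; only the reader's assumption does, which is why agreeing on one encoding for the whole team avoids this class of corruption.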


An Internationalization and Unicode Web Service?

Chances are that your favorite development platform already has internationalization support built in, and probably Unicode character set support too. For example, the Java and .NET platforms contain many APIs for formatting dates, numbers, etc. in locale-sensitive ways, and you can get Unicode character data easily as well. Unfortunately, the Unicode standard changes periodically, […]
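
To make the point about built-in character data concrete, here is a small sketch using Python's standard `unicodedata` module. Python is used here only for illustration; the post itself mentions Java and .NET. The module reports which version of the Unicode Character Database it was built against, and that version differs from one Python release to another.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this Python build.
# The value depends on the Python release, so it is printed rather than assumed.
ucd_version = unicodedata.unidata_version

ch = "\u00e9"  # the character é

# Look up the character's official name and general category.
name = unicodedata.name(ch)
category = unicodedata.category(ch)

print("Unicode data version:", ucd_version)
print(name, category)
```

A platform's bundled data can only describe characters that existed when that platform version shipped, so characters added in later Unicode versions may be reported as unassigned.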


Unicode support doesn’t mean your application is internationalized

Over the years, I’ve helped many organizations internationalize their software products. One of the most common misunderstandings concerns how Unicode will help their product. Customers sometimes mistakenly believe that Unicode support will be sufficient to internationalize their products. Sometimes they believe that Unicode “support” is a single, yes-no, on-off ability, when instead Unicode support is […]


What is Unicode?

Unicode is a character set standard. It assigns a unique number to every character used around the globe, regardless of written or spoken language, computing platform, or application. Unicode includes the characters of other, more limited character sets. Prior to Unicode, those smaller character sets assigned character values differently from each other. […]
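
The "unique number" for each character can be inspected directly. Here is a minimal Python sketch; the helper name `code_point_label` and the sample characters are hypothetical, chosen only for illustration. It prints each character's number in the conventional `U+XXXX` notation.

```python
def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character."""
    return f"U+{ord(ch):04X}"

# Sample characters from three different scripts (arbitrary choices).
samples = ["A", "é", "日"]

for ch in samples:
    # ord() gives the character's Unicode number; chr() maps it back.
    number = ord(ch)
    assert chr(number) == ch
    print(ch, code_point_label(ch))
```

The same character gets the same number on every platform, which is the property the older, mutually incompatible character sets lacked.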
