Tag Archives: Unicode

Attending IUC 34 and career longevity

After a few years away from the internationalization crowd, I’m attending the Internationalization and Unicode Conference again this year. It’s great to see old friends and to make new ones. Some things are new, including some new people. However, many things are old or definitely older. What’s old? Well, for one, the problems. It’s… Read More »

These characters aren’t exotic!

Recently I had the opportunity to sign up for health benefits with a third-party site that manages these things for my employer. Sites that collect data often limit the set of characters that you can use in each field. That’s reasonable for numeric fields, date fields, etc. After all, you don’t want invalid data… Read More »

An Internationalization and Unicode Web Service?

Chances are that your favorite development platform already has internationalization support built into it. Unicode character set support is probably there too. For example, the Java and .NET platforms contain lots of APIs for formatting dates, numbers, etc. in locale-sensitive ways. And you can get Unicode character data easily too. Unfortunately, the Unicode standard changes periodically,… Read More »

Unicode support doesn’t mean your application is internationalized

Over the years, I’ve helped many organizations internationalize their software products. One of the most common misunderstandings is how Unicode will help their product. Customers sometimes mistakenly believe that Unicode support will be sufficient to internationalize their products. Sometimes they believe that Unicode “support” is a single, yes-no, on-off ability, when instead Unicode support is… Read More »

What is Unicode?

Unicode is a character set standard. This particular standard assigns a unique number to every character used around the globe, regardless of written or spoken language, computing platform, or application. Unicode includes all the characters from other, more limited character sets. Prior to Unicode, smaller character sets assigned character values differently from each other.… Read More »
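
To make the "unique number" idea concrete, here is a minimal Python sketch. The helper name `describe` is my own, chosen for illustration, and the two legacy encodings are just one convenient pair to compare. It prints the Unicode code point and official name of a few characters, then shows one byte value decoding to different characters under two older character sets.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character.

    `describe` is a hypothetical helper written for this illustration.
    """
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Each character has exactly one Unicode number, whatever the platform.
for ch in "Aé€":
    print(describe(ch))

# Before Unicode, the same byte value meant different things in
# different character sets. Here one byte is read two ways.
raw = b"\xe9"
as_latin1 = raw.decode("latin-1")   # Western European reading
as_cp1251 = raw.decode("cp1251")    # Cyrillic (Windows-1251) reading
print(describe(as_latin1))
print(describe(as_cp1251))
```

The last two lines print different names for the same byte, which is the ambiguity that a single shared numbering removes.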

Character Conversion points

You’d think this sort of problem would be resolved by now, but it’s not. It’s still almost impossible to quickly and easily migrate an application from the all-too-common default Latin-1 character encoding to UTF-8. The problem isn’t that UTF-8 can’t handle the conversion. No, that’s definitely not it. UTF-8 can represent any Latin-1… Read More »
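
As a small illustration of why the bytes themselves are the easy part, here is a Python sketch. The function name `latin1_to_utf8` and the sample word are assumptions made for this example, not code from any particular application. It re-encodes Latin-1 bytes as UTF-8, then shows what happens when a reader of the data still assumes the old encoding.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (hypothetical helper).

    Every possible byte is valid Latin-1, so the decode step never fails.
    """
    return data.decode("latin-1").encode("utf-8")


original = "café".encode("latin-1")    # 4 bytes: the é is one byte
converted = latin1_to_utf8(original)   # 5 bytes: the é is now two bytes

print(original, len(original))
print(converted, len(converted))

# A component that still assumes Latin-1 misreads the UTF-8 bytes.
misread = converted.decode("latin-1")
print(misread)

# A component that assumes UTF-8 rejects unconverted Latin-1 bytes.
try:
    original.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(rejected)
```

The conversion itself is one line. The trouble in this sketch comes from every place that reads or writes the bytes with a different assumption about their encoding, and from byte lengths that no longer match character counts.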