Posts Tagged ‘i18n’

Standard Charsets in Java 7

February 1st, 2013 joconner No comments

Once in a while I poke my nose through the release notes of new Java releases. It’s not a particularly rewarding activity, but this time I did find something interesting. Oddly enough, it was interesting for what it did NOT say. I was surprised, so I thought you might want to know about a new class that is now available and quietly overlooked in any release notes.

Character sets have their own class representation in Java: Charset. You can use the Charset class to identify a character set for encoding or decoding. To create a Charset object, you use a factory method: Charset.forName(String charset). The uncomfortable trick to using this method is that you must be prepared to catch an exception if the JRE doesn’t actually supply the requested character set. Bummer.

I’ve always wondered why the JDK allows a random string as the parameter. I suppose it was for convenience…to allow the JDK to be updated over time with new charset support without having to change any API or enumeration. That’s understandable. But not really knowing what minimal set of character sets is supported in a particular JDK is somewhat…unnerving…especially to an engineer just trying to get his/her work finished.

The JDK documentation was always clear on what character sets you could absolutely depend on to be present. That was helpful and much needed. At least an observant developer could depend on that. However, the JDK now provides a more robust and useful way to identify which charsets are minimally supported. Java 7 provides a new class: java.nio.charset.StandardCharsets.

StandardCharsets does one thing. It lets you know what set of character sets is minimally supported in your JDK. The set is probably unchanged from Java 6 or Java 5 or even earlier. However, now you don’t have to read the documentation as carefully; the standard set is given to you. The Standardcharsets class explicitly enumerates the normal set for you.

Rocket science? No. But this welcome addition to the JDK was a long time in coming, and I’m glad to have found it.

VN:F [1.9.22_1171]
Rating: 5.0/5 (1 vote cast)
VN:F [1.9.22_1171]
Rating: +1 (from 1 vote)
Categories: Java Tags: , , ,

Still Can’t Use Apostrophes? Really?

December 10th, 2012 joconner No comments

Answer this for me. Why in the world are we still preventing very common characters from name fields in online forms, in bank account applications, in insurance forms…tax returns? Why? 

In 2012, many companies have adopted Unicode in their backend databases. But what’s wrong with their development teams that prevent them from allowing customers to spell their names correctly in their application’s user interface? I live in California. We have LOTS of hyphenated names, names with accents, names with apostrophes. There really is no excuse for preventing users from spelling their names online in the the same way that they spell them on paper.

At this point I’m just irritated. At one point I thought I could just tell people how to fix these things. Then I thought I could occasionally blog about it — thinking the word would get out slowly. Well, I suppose if it is working at all, the message is getting out slower than anticipated. I never had delusions that an i18n blog would be generally popular with the masses. This isn’t a soap opera or Hollywood expositor after all. However, you might thing that common sense would just spread, that it would simply be absorbed across the web. It ‘aint so.

Look, if you are a software developer and have ANY influence on how your company provides its input or signup forms online, can you do me a favor? Can you remember that some people have names that actually have an apostrophe or hyphen or n-with-an-accent-grave? You can easily parse these fields; you can check against sql attacks etc that use interesting characters to turn databases into mush. We have the technology people. Let’s consider what might happen if we use it.

All the best,

John O’Conner (note the apostrophe)

VN:F [1.9.22_1171]
Rating: 4.0/5 (1 vote cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)

Just another suggestion for language selection lists

June 1st, 2012 joconner No comments

Recently I was asked to fill in an online questionnaire. As I began the form, an entire window was shown to me, and it had a single UI item on it, a language selection list. Of course, I had to click on the list, wondering what wonderful choices I might have for a user interface. Surprisingly, the list revealed a single entry: English.


If you sometimes read my blog, you’ll know that I’m particularly interested in language lists at the moment. However, if you are interested in providing a list, I do have one additional suggestion…if you don’t have any choices, you probably shouldn’t use a selection list. It’s just too much of a bummer when nothing else is available and just doesn’t make much sense. Maybe disable it until your UI does provide additional languages.

That’s it for today. 

VN:F [1.9.22_1171]
Rating: 1.0/5 (1 vote cast)
VN:F [1.9.22_1171]
Rating: +1 (from 3 votes)
Categories: Globalization, Web Tags:

Java 7 Locale Changes

September 13th, 2011 joconner No comments

I admit that I haven’t been particularly active in the i18n or Java standards bodies in the last few years. Pardon me, but I had a wild rumpus in a startup for almost 3 years, then joined Yahoo for a couple years, and BAM! … during that time, the Unicode Consortium added emoji, and the Oracle Java folks overhauled the once relatively simple java.util.Locale class among other things. That’ll teach me to turn my back on those folks.

You can almost forgive the Unicode people for their addition of emoji… I just know that Ken Lunde had something to do with it but haven’t been able to pin the whole issue on him yet. :) [Just kidding people, just kidding Ken!]

BUT more interesting to me right now…have you seen java.util.Locale lately. What the heck happened there? Java 7 introduces Locale.Builder, Locale.Category, and a Locale.forLanguageTag method. With Java 7′s support of BCP 47 (the best practices for language coding), the Locale got a beefy new Builder inner class to help with…well, to help with building BCP 47 compliant locales. More on this in a different post. With the BCP 47 support, I suppose you might need the forLanguageTag method too. Now here’s what stumps me though — that Category class!

As far as I can tell, this category is to help you differentiate a locale instance for two different purposes: user interface language and locale-sensitive data formats. The new categories are the following:


Do we really need a separate Category class to make that explicit?

Locale has another new method: setDefault(Locale.Category, Locale newLocale). I can’t wait to play with this a bit more. I suspect that this will set the default resource bundles (DISPLAY) for UI language and the default FORMAT for things like DateFormat, etc. However, I can’t understand why these two categories are actually needed in the platform libraries. Convenience maybe? You could always use two different Locale instances for this in the past, but I suppose this makes that easier? I’ll have to play around with these to find out more.

Come back again in a couple days after I look into this new Category and how it affects internationalization. I’m having fun discovering some of the new functionality and am eager to let you know what I find.

VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)
Categories: Java Tags: ,

IUC 34 submission was declined

September 19th, 2010 joconner No comments

Well, the IUC committee made its decision. Although I was initially a little disappointed, the committee declined my session proposal. That’s probably the best; the article was a rehash of an older subject that I already presented long ago. I refreshed the original article and thought I could recycle it. Ha…the IUC wasn’t fooled. The said “NO!”

That’ll teach me. Next time I’ll provide all original content, and I already know what I’ll propose. Timezone selection for web apps. OK, that’s easy you say. Easy peasy. Well, it’s really not. It turns out that it’s pretty easy to pick a locale and time format for a user on the network. But what time zone should you select when displaying time?

That question isn’t always easy to answer. Lots of factors come into play including the locale of the user, the location of the event, the primary location of the site. Which to choose?

I’ll get this written up if you think it’s an interesting subject. Let me know.

VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)
Categories: Java, Unicode Tags: , ,

Internationalization Consulting career at EMC?

July 1st, 2010 joconner No comments

Looking for a new position? EMC is looking for an Internationalization Consultant! I can’t take it, but maybe you can.


The Internationalization Consultant is responsible for participating in planning and discovery sessions, providing consultation on I18N requirements and optimal business/engineering process design, overall status assessments, and contributing to the campaign for education and awareness of I18N standards for product design and development throughout the enterprise.

More Information

For more information, you should check out the EMC job listing directly. Or search their job listings for requisition id #55885BR.

VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)
Categories: jobs Tags: ,

Timezone selection for web apps

June 23rd, 2010 joconner No comments

Web applications have an unusual problem regarding timezones and formatted, user-viewable dates for a few reasons:

  • several timezones are usually involved, and the determination of which should be used isn’t particularly clear
  • date formatting code for even a modest list of locales isn’t available in browsers by default

Timezones Everywhere

Desktop applications have it pretty simple. Their users work on a host, and that host has a configured timezone. Windows and Mac users have convenient access to those settings in their “System Properties” or “Control Panel”. If a desktop application needs to display the date or create a time, it has access to the local host timezone. Usually there’s no question…this is the correct timezone for an application to use. If a user temporarily relocates to a different timezone, the user is responsible for changing that setting in his system settings.

Web applications have potentially several timezones to juggle as they manipulate time:

  • client timezone
  • server timezone
  • event location timezone
  • site timezone
  • user account timezone

The client timezone is the computer’s system timezone — the timezone set within system properties or the control panel. This timezone is controlled on the local desktop or host environment. JavaScript in a browser has indirect access to this host provided timezone, and that timezone is available as the default — and usually the only — timezone available to a browser-confined application.

The server timezone is the timezone of the physical server on which your application runs. In a global, world-wide application, this server timezone is probably the least relevant of all the timezones because a server can be almost anywhere in the world and is most likely not representative of the user majority.

The event location timezone is the timezone of an actual event; it is the physical event’s location. For example, world cup soccer matches are held in South Africa this year. It might be tempting to use the event’s timezone when publishing soccer match times. Unfortunately, if I’m a Brazilian fan trying to determine a game time, the South African timezone and time display may not be useful and may be confusing. Here’s an example from a Google search page that shows game times for a U.S. English-speaking user:


The site timezone is the timezone associated with the web app’s domain or site. Let’s use world cup soccer again. Yahoo has dedicated sports sites for various locales/regions around the globe. For example, is primarily the U.S. site, but it has navigational links to allow a user to go to other regional versions of the experience. If you look at a similar schedule of existing or upcoming games, you’ll notice that Yahoo chooses to display times using the site’s timezone.

One view of upcoming games looks like this on the U.S. English site. Notice that it uses Eastern Daylight Time for the site’s timezone:


A different look from the Brazilian site shows this. Note that the times use the BRT timezone:

Screen shot 2010-06-23 at 1.09.45 PM.png

The point here is simple: you have lots of choices for displaying time and time zones to your users. Making the choice is difficult. Once you’ve determined which zone to use, the technical issues aren’t nearly as hard to solve.

In this specific blog, I’ll not solve the technical aspects of displaying time in the proper time zone and format, but I will leave you with my personal suggestion. In general, I think a web site should use as much information as it has available to customize information and to present it clearly to its users. For times and dates, it makes sense to me to present that information in the timezone and format that the user is accustomed to seeing. If your site knows that I’m a U.S. English speaker, maybe you should at least display times using a U.S. timezone. Of course, some regions have multiple zones, and that becomes a new problem. However, in the case of the U.S, which has multiple zones, a California resident is probably more likely to know that EDT is a 3 hour offset. It is doubtful that most California residents would know immediately that South African time is …. well …. some other offset. You see, the South African timezone just doesn’t help me much if I intend to actually watch or record the event. EDT is at least a bit more familiar.

There you have it…some ideas about timezones and their display. Use what your site knows about users to display information. The more localized in formats, the better in my opinion.

VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)
Categories: Web Tags: ,

Understanding Locale in the Java Platform

November 14th, 2009 joconner No comments

Language and geographic environment are two important influences on our culture. They create the system in which we interpret other people and events in our life. They also affect, even define, proper form for presenting ourselves and our thoughts to others. To communicate
effectively with another person, we must consider and use that person’s culture, language, and environment.

Similarly, a software system should respect its users’ language and geographic region to be effective. Language and region form a locale, which represents the target setting and context for localized software. The Java platform uses java.util.Locale objects to represent locales. This article describes the Locale object and its implications for programs written for the Java platform.

Have a look. It’s an older article, but still perfectly valid and useful: Understanding Locale in the Java Platform.

VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)
Categories: Java Tags: , ,

An Internationalization and Unicode Web Service?

July 7th, 2009 joconner No comments

Chances are that your favorite development platform already has internationalization support built into it. And probably Unicode charset support is there too. For example, Java and .NET platforms contain lots of APIs for formatting dates, numbers, etc in locale-sensitive ways. And you can get Unicode character data easily too. Unfortunately, the Unicode standard changes periodically, typically much more frequently than your development platform. You applications probably do NOT have full awareness of the latest Unicode database properties. Maybe that’s not important to you? But maybe it is? What do you do in that case? What can you do? You wait…and wait a little more. You wait until your platform’s authors update their Unicode support. That can be a very long time.

Do you think there’s a need for a web service to provide up-to-date Uncode character property information? If there were such a thing, would you use it?

Maybe Unicode character data isn’t exciting enough for you. So how about this? What if a web service existed that could provide human readable dates, times, and numbers in locale-sensitive formats? Sure those APIs exist in Java and .NET…but again, those platforms aren’t always up to date with the latest locale data and formats. What if your application could use use a web service to retrieve those formats?

Oh, and we haven’t mentioned calendar support yet. With Java, you get Gregorian, a Japanese Imperial calendar, a Thai one….and maybe that’s it. Those are important of course. But what if you need something more exotic? A Hebrew or Islamic or … any number of other calendars in use today and the past. What if a web service existed that could provide that information to your application. Would you use it?

No. I don’t have this web service today. But I wonder if others see a need for it. If so, drop me a note, let me know your ideas. If you don’t think it’s needed, let me know that too. I’m interested in arguments for and against such a service.

VN:F [1.9.22_1171]
Rating: 0.0/5 (0 votes cast)
VN:F [1.9.22_1171]
Rating: 0 (from 0 votes)
Categories: Unicode Tags: ,

Discovering resource bundles at runtime

March 6th, 2009 joconner No comments


Yesterday a friend asked me a question about Java resource bundles: how can I get my application to discover resource bundles dynamically?

It seemed like a simple question. I answered in my typical fashon: Well, everytime you need a new bundle, just add that bundle’s jar or directory to your classpath and run the application. Or if it’s just a single properties file, add that file to your existing classpath, maybe drop it into the same directory as your other properties files.

My friend wasn’t content with that. He had done some homework before asking me. He said, “My application is in a jar file. I launch it on the command line like this, java -jar MyApp.jar. When I wanted to add new resource bundle jar files, I tried this: java -cp Hello_ja.jar -jar MyApp.jar. It didn’t work. The app doesn’t see the additional Hello_ja.jar file.”

I explained that when you use java -jar MyApp.jar, the jar’s own classpath settings defined by its manifest file will override anything on the command line that is defined by the cp option. The cp option is worthless for setting a classpath in this situation.

Of course, he still needed a solution. Hmmm…how could I help him?

It turns out that you actually can add classes to your classpath dynamically at runtime. Your application can discover new jar files and use the functionality it finds in them. The ClassLoader class can help you. The ClassLoader class can add class directories, jar files, and even lists of jar files to a classpath. Then when you need a class from that directory or jar file, you can use ClassLoader to load the class.

Let’s look at how this might work for loading resource bundles. In my examples, I’m going to create an application that displays three greetings in a panel. I’ll package the application as a jar file. The application will load its resource bundles from jar files that I’ll place into a “plugins” subdirectory directly under the application jar. The big deal about this is that the application will discover the jar files that are in the plugins subdirectory each time it launches. So, if you want to add another localized set of bundles, just drop the new jar into the plugins directory.

First, let’s start with the Greetings application. It’s simple just to demonstrate the point. It loads resources from a standard resource bundle and puts the resource strings in a dialog. Although the ResourceBundle for messages is just a standard bundle, notice that I’ve provided a class loader. The class loader has been instructed to add jar files to the classpath, specifically the jars found in the plugins subdirectory. I’ll show you the ResourceLoader class later. For now, here’s part of the basic dialog-based app:

package com.joconner.hello;

import com.joconner.resplugin.ResourceLoader;
import java.util.Locale;
import java.util.ResourceBundle;

 * @author JOConner
public class GreetingDialog extends javax.swing.JDialog {
    /** A return status code - returned if Cancel button has been pressed */
    public static final int RET_CANCEL = 0;
    /** A return status code - returned if OK button has been pressed */
    public static final int RET_OK = 1;

    private ResourceBundle res = null;

    /** Creates new form GreetingDialog */
    public GreetingDialog(java.awt.Frame parent, boolean modal) {
        super(parent, modal);
        ClassLoader resourceLoader = ResourceLoader.createForDirectory("plugins/");
        res = ResourceBundle.getBundle("com.joconner.hello.resources.messages",
                Locale.getDefault(), resourceLoader);

    /** @return the return status of this dialog - one of RET_OK or RET_CANCEL */
    public int getReturnStatus() {
        return returnStatus;

    private void initComponents() {
        tfMorning = new javax.swing.JTextField();
        tfAfternoon = new javax.swing.JTextField();
        tfEvening = new javax.swing.JTextField();
        addWindowListener(new java.awt.event.WindowAdapter() {
            public void windowClosing(java.awt.event.WindowEvent evt) {

As you can see, the above dialog doesn’t do anything surprising with the bundles; it retrieves a bundle, pulls strings from it, and uses those strings in 3 text fields. However, notice that I’ve instructed the ResourceBundle class to use a special class loader. The class loader code is here:

public class ResourceLoader {

    private ResourceLoader() {}
    private ResourceLoader(File dir) { = dir;


     * Create a ResourceLoader for a specific file system location.
     * All JAR files and subdirectories in the location will be added
     * to the classpath
    public static ClassLoader createForDirectory(File dir) {
        ResourceLoader loader = null;
        if (dir.isDirectory()) {
            loader = new ResourceLoader(dir);
        return loader.getClassLoader();

    public static ClassLoader createForDirectory(String dir) {
        File f = new File(dir);
        return createForDirectory(f);

    private File[] addJarsToPath() {
        File[] jarFiles = directory.listFiles();

        List urlList = new ArrayList();
        for(File f: jarFiles) {
            try {
            } catch (MalformedURLException ex) {
                Logger.getLogger(ResourceLoader.class.getName()).log(Level.SEVERE, null, ex);
        ClassLoader parentLoader = Thread.currentThread().getContextClassLoader();
        URL[] urls = new URL[urlList.size()];
        urls = urlList.toArray(urls);
        URLClassLoader classLoader = new URLClassLoader(urls, parentLoader);
        this.loader = classLoader;

        return jarFiles;

    public ClassLoader getClassLoader() {
        return this.loader;

    private File directory;
    private ClassLoader loader;

This ClassLoader puts all jar files that are in my application-defined “plugins” subdirectory onto the classpath. See the addJarsToPath method for the exact wa to do this. Of course, this particular implementation isn’t robust. If you were doing this in an important application, you might want to confirm that a jar file really was a resource jar file….maybe your jar has a particular manifest file entry to provide very basic assurance that it is what you think it is.

Finally, just create your ResourceBundles as you normally would. Create jar files for each set of localized bundles for a specific language. Then drop them into a “plugins” directory immediately under your application’s main directory…the place your applications sits on the file system.

Now you can add new localized bundles to your application without having to recompile or jar your primary application. You don’t even have to change its classpath since new “plugins” jar files will automatically be added to the classpath for resource bundle creation.

There you go. Enjoy.

VN:F [1.9.22_1171]
Rating: 4.8/5 (4 votes cast)
VN:F [1.9.22_1171]
Rating: +2 (from 2 votes)
Categories: Java Tags: ,