Author Archives: joconner

Managing Translatable UI Text in JavaScript with RequireJS

Old world map

Internationalizing a web application includes the task of extracting user interface text for translations. When text is separated from the rest of your application business logic, you are able to localize and translate it easier. Although the JavaScript language and browser environment don’t prescribe any particular method for creating externalized text resources, many libraries exist to help this effort. One such library is RequireJS. The library includes an i18n plugin that helps you organize your text resources and load them at runtime depending on the needed language. The goal of this article is to describe how to use the RequireJS i18n plugin on a simple, single page application.

This sample application has a single html file index-before.html that contains text headers and other UI elements. We will extract the text, put it into a separate resource file, translate that file, and use the translatable files at runtime as needed by the customer’s language.

NOTE: This is not a RequireJS tutorial. The article assumes that you have some RequireJS knowledge.

Setting Up Your Environment

Download the js-localization project on Github. The project contains the source code for this article, allowing you to see code both before and after using the i18n plugin.

The project’s base directory is js-localization. Two HTML files are in this directory, index-before.html and index-after.html, which are the before- and after-internationalization files. The scripts subdirectory holds all JavaScript libraries for the project. All 3rd-party libraries, including RequireJS, are in scripts/libs. This application’s primary application file is scripts/main.js. Externalized text bundles are in scripts/nls.

Understanding the Original Index File

The original file looks like this:

    <meta charset="UTF-8">
    <title>Localization with RequireJS</title>
    <link href="styles/quotes.css" rel="stylesheet"/>
    <h1>Famous Quotation</h1>
    I love deadlines. I like the whooshing sound they make as they fly by.
    Douglas Adams

This file contains the following translatable items that we will extract for translation:

  • The title element contents
  • The h1 element contents
  • Two p element contents

Enabling Your HTML File

Include the RequireJS core library in your HTML file with a script element within the head section:

<script src="scripts/libs/require.js" data-main="scripts/main" charset="UTF-8"></script>

This script loads the require JavaScript file and also tells RequireJS to load your main module. The main module is the application’s starting point.

Creating I18n Text Modules

We need to pull out the text and put it into a separate text resource bundle. We will create text resource bundles in the scripts/nls subdirectory. RequireJS will look for resource bundles within the nls sudirectory unless you configure it otherwise. For our needs, we’ll create scripts/nls/text.js, put all exported text there, and provide a key for each string.

The updated HTML file maintains the same core document structure that is in the original file. However, we’ve removed the UI text. The HTML file, found in index-after.html now looks like this without text:

    <meta charset="UTF-8">
    <title id="titleTxt"></title>
    <link href="styles/quotes.css" rel="stylesheet"/>
    <script src="scripts/libs/require.js" data-main="scripts/main" charset="utf-8"></script>
    <h1 id="headerTxt"></h1>
    <p id="quoteTxt"></p>
    <p id="authorTxt"></p>

Where is the text? It’s in scripts/nls/text.js:

    "root": {
        "titleTxt": "Localization with RequireJS",
        "headerTxt": "Famous Quotations",
        "quoteTxt": "I love deadlines. I like the whooshing sound they make as they fly by.",
        "authorTxt": "Douglas Adams"

In general, each string in the original HTML file should be extracted into one or more nls resource bundles, which are really just JavaScript files. The sole purpose of a resource bundle is to define localizable resources. The files that define your primary set of language key-values are called root bundles. As part of a root bundle, a JavaScript file should define a root object that contains content for your application’s base language. This application’s root language is used when no target language can be found that matches the requested language.

Every piece of localizable text should have a key. The key is used to extract the original and translated text from the bundles. For example, headerTxt contains the label for “Famous Quotations”.

Adding Translated Text

Now that you’ve separated text into one or more resource bundles, you can send those files away for translation. For each target language, you will create a subdirectory in the nls directory. In this example, I used Google Translate to translate the text.js content into Japanese and Spanish. The nls subdirectories that contain translations must be named using standard BCP-47 language tags. The sub-directories for Japanese and Spanish are nls/js and nls/es respectively. Because there is only one source file, there will be only one file in each translation subdirectory.

Informing the Library

You must inform the RequireJS library about the available translations. In each source file that contains translations, you must add a language tag that matches the translation subdirectory name. We then have to update the nls/text.js file to notify the library for both Japanese and Spanish like this:

    "root": {
        "titleTxt": "Localization with RequireJS",
        "headerTxt": "Famous Quotations",
        "quoteTxt": "I love deadlines. I like the whooshing sound they make as they fly by.",
        "authorTxt": "Douglas Adams"
    "es": true,
    "ja": true

For each translation, you should include an additional language tag key in the root bundle. Since our root bundle nls/text.js has been translated into both Japanese and Spanish, we include those language tags and set their value to true, indicating that the translation exists.

Configuring RequireJS

RequireJS determines your customer’s language in one of two ways:

1. It uses your browser's language setting via the `navigator.language` or `navigator.languages` object.
2. It uses your customized, explicit configuration.

The navigator.language object is generally available across all browsers and represents the preferred language that is typically configurable in your browser’s language settings. This may be a reasonable default language, but I don’t recommend that a professional, consumer-facing application rely on this setting.

Instead, you should explicitly configure RequireJS to use a language that you select for the customer, with navigator.language as a backup perhaps. Your RequireJS configuration should be in the primary application JavaScript file. In our case, the scripts/main.js file contains this configuration as well as our application code.

If you set the i18n.locale configuration option for the i18n plugin, RequireJS will use that setting as your application’s language. By setting the value of this field, you control what language RequireJS will attempt to use. Set the language/locale option in the main.js file like this:

    config: {
        i18n: {
            locale: "en"

In an actual application, you will not hard-code this locale setting. Instead, you will determine your customer’s language another way, perhaps using navigator.language as a default.

Accessing the Resource Bundle

Once things are configured, using the resource bundle is easy. You just have to include it as a module in your main.js file. Then, you access each key using the name you give the module.

define(["jquery", "i18n!nls/text"],  function($, text) {

    // pull the text from the bundle and insert it into the document


In the above case, I’ve required two modules: jquery and “i18n!nls/text”. When you require a module using a plugin, you must append a “!” to the plugin name. After the plugin name, append the root resource bundle path. In this case, even though we have three languages, we point RequireJS to the root bundle. The i18n plugin will read the root bundle and discover the additional supported translations.

Since the code uses text as the module name, we can retrieve the text values by simply referencing the keys in the bundle. For example, if we want the quoteTxt, we reference it with text.quoteTxt in our code. The above code uses this technique to populate all the UI text in our simple HTML file.

Demonstrating the Plugin

We’ve setup the environment, configured the plugin, translated a file, and have modified our application so that it pulls translated text from the bundles. Now let’s see this work. You shouldn’t need any additional files or tools. Just point your browser to the index-after.html file on your local drive. If you’ve not changed anything, you should see the following English content:

Quote en

Now if you update the main.js file and change the i18n.locale setting to ja, you will see the next image. Remember, this is not a professional translation and is only used for an example.

Quote ja


Although JavaScript has no predefined framework for providing translatable resources, it is reasonably easy to use a library like the RequireJS i18n plugin to help manage UI text strings. Interestingly, the Dojo libraries work similarly to RequireJS. So, if you’re using Dojo, you will manage translations in your application in much the same way.

One of the most interesting parts of JavaScript internationalization is the question of how to determine the user’s preferred language. I’ve written about this before, so instead of handling that question here, I’ll refer you to Language Signals on the Web.

Good luck in your internationalization efforts. Like anything else, the hardest part is just getting started. Hopefully this article makes that first step easier.

Do You Know What Countries are in Western Europe?

Different people and organizations define geographical regions in almost the same way, but there are differences. Consider the different regions of Europe for example.

What countries do YOU think are in Western Europe? Are you sure one or two of those aren’t in Southern Europe? How do you decide?

According to the United Nations, nine (9) countries define “Western Europe”. Can you name them? I’ve included an image below to help.

Western Europe No Names

You can learn more about how the UN defines world macro regions with their M.49 document.

Java and BCP 47 Language Tags

Since Java 7, Java’s Locale object has been updated to take on the role of a language tag as identified in RFC 5646 and BCP 47 specs. The newer language tag support gives developers the ability to be more precise in identifying language resources. The new Locale, used as a language tag, can be used to identify languages in this general form:


Of course, you can continue to think of locale as a lang_region_variant identifier, but Java now uses the RFC 5646 spec to enhance the Locale class to support language, script, broader regions, and even newer extensions if needed. And if the rules for generating this string of identifiers seems intimidating, you can use the Locale.Builder class to build up the tag without worries of misforming it.

The primary language identifier is the almost the same item you’ve always known; it’s an ISO 639 2-letter code or 3-letter code. The spec recommends using the shortest id possible.

The script is new. You can now add a proper script identifier that specifies the writing system used for the language. People can use multiple writing systems to write languages. For example, Japanese speakers/writers can use 3 or more different scripts for Japanese: kanji, hiragana, katakana, and even “romaji” or Latin script. Serbian is another language often written in either Latin or Cyrillic characters.

The region identifier was once limited to 2-letter ISO 3166 codes, but now you can also use the United Nations 3-digit macro geographical region codes in the region portion of a language tag. A macro geographical region identifies a larger region that comprises more than one country. For example, the UN currently defines Eastern Europe to be macro region 151 and includes 10 countries within it.

Eastern Europe 151

Finally, you can use variant, extension, and privateuse sub-tags to provide even more context for a language tag. See RFC 5646 for more details on these. I suggest that you also use the Locale.Builder class to assist if you need to use this level of detail.

Take a look at the Locale documentation for all the details on using these new features. They definitely give you much more control of how you identify and use language resources in your internationalized applications.

Getting started as an Android developer

Android is the most accessible and least expensive mobile platform for developers to get started with. Although you may eventually want to publish your apps on the Google Play Store, which requires a minimal fee, getting and using the developer tools is free.

Here’s how to get started:
1. Download the Android Studio.
2. Turn on the developer options on your device.
3. Head over to the Getting Started site to begin.

Downloading the Android Studio IDE

Google has partnered with JetBrains to create an integrated development environment (IDE) that’s customized for Android development. Of course you can continue to use whatever environment you like as long as you have the SDK, but this IDE makes getting started simple.

Download the Android Studio.

Enable Developer Options

Although the Android Studio has great emulators, you should also test your apps on actual devices. Enable those devices for debugging and developer support by tapping 7 times on your device’s “Build number” label. Yes, you heard right. Use these instructions to find this on your device.

The Getting Started Guide

The best information for getting started is the Android site itself. It has great tips for new users and veterans. I recommend starting on the developer’s training site.

JavaScript Internationalization Libraries

7564199138 f613b0fc16 m
Creating a browser-based web application requires a combination of HTML, CSS, JavaScript, and backend services. Creating an internationalized application requires additional techniques and libraries to provide and manage localized data formats and resources. This document focuses just on front-end JavaScript needs.

Depending on your application, your internationalization needs will be different. However, common concerns include these issues:

  • date, time, number, and currency formatting
  • text localization
  • pluralization and complex messages
  • sorting and collation

More demanding use-cases may even require these:

  • phone number formatting
  • alternative calendars

Finally, considering that EcmaScript now includes an internationalization API that is already supported in Chrome and Firefox, I think the following is important:

In my opinion, using a single library that supports all of the above is preferable to using multiple libraries. However, one solution doesn’t always work for every situation and a single solution doesn’t even seem to exist. You may have to use two or more different libraries to get all the functionality you need. For example, libraries that provide general number formatting support typically don’t provide phone number support.

Considering all the features above, you might consider these JavaScript internationalization libraries:

EcmaScript Intl Library

EcmaScript, the language specification for JavaScript, has defined a new Internationalization API. This API is already implemented in Chrome and Firefox browsers, and may be partially implemented in other browsers as well. The Internationalization API defines a new namespace Intl that provides objects for the following:

  • collation
  • number formatting
  • date/time formatting

The new Intl library is extremely important because it means that you get some internationalization support without loading an external library. Additionally, implementations support the CLDR data, which is the best-practice for format patterns.

Unfortunately, this API is not available in all browser-supplied JavaScript implementations, Safari and Internet Explorer being notable holdouts. Additionally, many mobile browsers do not yet provide the APIs.

If you need to use a separate library, you should favor those that provide a polyfill for this API when possible. This will make future transitions smoother.


The Intl.js library is a polyfill for the EcmaScript Intl library. It provides the number and date/time formatting APIs. However, it does not provide the collation API. Because it uses CLDR data, you can feel comfortable that the actual formats created by this library will be the same as–or very similar to–the EcmaScript Intl library. However, because collation isn’t available, you obviously have to look elsewhere if that is needed. If you don’t need collation, this may be a good option for basic data formatting.

JQuery Globalize

TheJQuery Globalize library now provides CLDR support. It also provides the number, date, and time formatting. However, it gives you a couple additional features above Intl.js that make it worth considering:

  • message translation API
  • pluralization

Message translation is a localization API for common message strings. The library defines a file format for translatable message strings and gives you API for retrieving those strings after translation.

Pluralization is a feature that lets you accommodate the differences in word choice that are needed when word forms depend on their count. For example, the pluralization library let’s you conveniently handle word choice for 0, 1, or n number of mouse or mice. Both Polish and Russian, for example, have several plural forms for 2, 3, 4, 5, or even more instances of a particular noun.


The Format.js library was recently released by Yahoo. It builds upon the Intl.js library and adds support for the following:

  • support for template libraries like Handlebars, React, and Dust
  • cache support for Intl format objects
  • ICU message syntax for pluralization, gender, and other types of message variability
  • relative time (5 min ago, 2 hours ago, etc)

The set of APIs is modular. If you only need the relative time support, you can load just that library instead of other items.

Dojo I18n

If you are already using Dojo, Dojo’s own internationalization libraries are an obvious choice. With support for string resource bundles, date and time formatting, and number and currency formatting, Dojo’s library has a lot to offer. This library’s additional benefit is its support for the CLDR patterns.

Phone Number Library

The libphonenumber library solves a very specific need for phone number parsing and formatting. If your application needs this, this well-supported library from Google should handle your common use-cases. In addition to JavaScript, the library exists for Java, C++, and other languages. The library helps you parse, format, and validate phone numbers for many countries of the world. Surprisingly, the library also helps you determine the type of a phone number, for example, fixed-line, mobile, toll-free, etc.


No single JavaScript library exists for all internationalization needs. However, you should be able to use one of either Intl.js, Globalize.js, Format.js, or Dojo for basic data formatting. For phone numbers, the libphonenumber library seems to be the only real choice for now, but it seems to be well supported and adequate for many use cases. Unfortunately, I haven’t found a widespread collation solution yet. However, given the typical size and download-time constraints for JavaScript applications, especially in mobile settings, sorting on the server side before transmitting sortable data may be a better solution for now.

Good luck in your own internationalization work in JavaScript. If you haven’t already adopted a set of support libraries, consider some of the ones mentioned above.

LocalDate in Java 8

Halloween 2014 calThe java.time.LocalDate class is new in Java 8. Inspired by the Joda Time library, LocalDate represents a date as it might be used from a wall calendar. It is not a singular instant in time like java.util.Date. You might use a LocalDate to represent a birthday, the start of a school year, or an anniversary. LocalDate text representations are familiar. They look like “2014-10-09″ or “October 9, 2014″ or other similar user-friendly text.

How do I create a LocalDate?

Create a LocalDate with any of its several static creation methods. I won’t cover all the ways here but will show you three methods, one of which requires close attention.

Give it to me now!

Want a LocalDate now? Here’s how you do it (assume I’m in Los Angeles CA):

LocalDate todayInLA =;

LocalDate will provide a text string formatted in ISO YYYY-MM-dd format if you call its toString method:

String formattedDate = todayInLA.toString();

This will create text like this:


You should know this fact about now(). It uses your computer’s default time zone to retrieve now’s date. So, a computer in Los Angeles (USA) and a computer in Dublin (Ireland) could execute this same method at the same time and produce two different calendar dates. It’s only reasonable they should. After all, they would be in different time zones.

If you want to be specific about the time zone used when creating the local date, use the over-loaded method with a time zone:

ZoneId zoneDublin = ZoneId.of("Europe/Dublin");
LocalDate todayInDublin =;

Executed at exactly the same time as the previous now method, this time zone-aware method could return the following:


I want a specific date

You can be more specific about your date too. You can ask for a date for a specific year, month, and day. This creates a LocalDate that is the same regardless of time zone. For example, let’s create a local date for Halloween (celebrated on October 31 for those areas that celebrate the holiday):

LocalDate halloween = LocalDate.of(2014, 10, 31);

The toString method produces this:


Can I format the LocaleDate differently?

Of course you can! You’ll need the format method for that. The format method uses a DateTimeFormatter that understands locale-sensitive preferences for printed dates.

You can learn about the format method and more from the LocalDate Javadocs.

JavaScript file encodings

All text files have a character encoding regardless of whether you explicitly declare it. JavaScript files are no exception. This article describes both how and why you should declare an encoding when importing script files into an HTML document.

JavaScript’s Character Model

A JavaScript engine’s internal character set is Unicode. The Ecmascript 5.1 Standard standard says that all strings are encoded in 16-bit code units described by UTF-16. Once inside the JavaScript interpreter, all characters and strings are stored and accessed as UTF-16 code units. However, before being processed by the JavaScript engine, a JavaScript file’s charset can be anything, not necessarily a Unicode encoding.

Character Encoding Conversion

When you import a JavaScript file into an HTML document, by default he browser uses the document’s charset to convert the JavaScript file into the interpreter’s encoding (UTF-16). You can also use an explicit charset when importing a file. When an HTML file charset and a JavaScript file charset are different, you will most likely see conversion mistakes. The results are mangled, incorrect characters.

Conversion Problems

I created a simple demonstration of the potential problem. The demo has 5 files:

  • jsencoding.html — base HTML file, UTF-8 charset
  • stringmgr.js — a basic string resource mgr, UTF-8 charset
  • resource.js — an English JavaScript resource file containing the word family, UTF-8 charset
  • resource_es.js — a Spanish file containing the word girl, ISO-8859-1 charset
  • resource_ja.js — a Japanese file containing the word baseball, SHIFT-JIS charset

In the base HTML file, I’ve imported 3 JavaScript resource files using the following import statements:

    <script src="resource.js"></script>
    <script src="resource_es.js"></script>
    <script src="resource_ja.js"></script>


The image shows how the text resources have been converted incorrectly. The browser imported the Spanish JavaScript file using the HTML file’s UTF-8 encoding even though the file is stored using ISO-8859-1. The Japanese resource script is stored as SHIFT-JIS and doesn’t convert correctly either.

After updating the import statements, we see a better result:

    <script src="resource.js" charset="UTF-8"></script>
    <script src="resource_es.js charset="ISO-8859-1"></script>
    <script src="resource_ja.js" charset="SHIFT-JIS"></script>

Correct conversions


To avoid charset conversion problems when importing JavaScript files and JavaScript resources, you should include the file charset. An even better practice is to use UTF-8 as your charset in all files, which minimizes these conversion problems significantly.

You can checkout the code for this article on my github account here:
I18n Examples

Best practices for character sets

CharsampleYou may not understand every language, but that doesn’t mean your applications can’t. Regardless of your customer’s language choice, your application should be able to process, transfer, and store their data. Even if you don’t provide a localized user interface, your application should allow your customer to enter text in their own language and in their own script. For example, my word processor is localized into English, but it allows me to enter text in a variety of scripts and languages.

How is that possible? The most basic requirement for this ability is to use a single character set internally. If you want to handle all scripts, your only choice is the Unicode character set.

Rule #1:
Use Unicode as your character set.

Unicode has several possible encodings, including UTF-32, UTF-16, UTF-16LE, UTF-16BE, and UTF-8. These encodings transform Unicode code points into code units. Code points are the values between 0 and 0x10FFFF, the range of integers that are allocated for character definitions. An encoding transforms code points into code units, which are used to serialize a character for storage or transmission. A single code point becomes 1 or more code units during encoding.

Although all the encodings are well-defined, UTF-8 is the easiest to use primarily because of its code unit size. UTF-8 has 8-bit (byte) code units that are immune to the common memory design issues involving byte ordering. I recommend that you use UTF-8 everywhere possible in your system. You’ll avoid mistakes in determining little endian and big endian layouts with the other encodings.

Rule #2:
Use UTF-8 everywhere possible

OK, so we’ve got the rules for character set and encoding choice. Now you have to implement those rules.

Complex systems typically have many points of failure for textual data loss. Those points are usually hand-off points across systems:
1. File export and import.
2. Outbound and inbound HTTP request data and parameters
3. Database connections and schemas

Each of these deserves its own discussion. Moreover, each has specific implementation details for different products. Unfortunately I can’t cover any of them adequately in this particular post. However, I’ll try to touch on these subjects in a future update. If you have questions about any of them, let me know. I’ll use your suggestions to help me decide which to address first.

For now you have my own best practices for character set choice when creating any system. Good luck!

//John O.

Using Jersey (JAX-RS) with Spring

Jersey spring
Spring does a great job for what it does — dependency injection, transaction management, database query simplification, and plenty of other things. Jersey, the reference JAX-RS implementation, is also excellent for what it does — it simplifies the creation of RESTful web services by providing a framework for defining resources, accessing them, and creating their service representations. Wanting to use them both in the same project is only natural! This article shows you how to use both Spring and Jersey. It uses Spring to manage bean creation and scope, and it uses Jersey for resource creation and path mapping.

First, set up your dependencies to include both Spring and Jersey. Make sure you have the following important dependencies:
1. The spring framework
2. The Jersey-Spring integration project that includes a custom servlet that tells Jersey to let Spring handle all the bean creation and injection
3. The Jersey JAX-RS framework itself

Assuming you use Maven, your dependencies look something like this:





You’ll notice that I excluded a lot of Spring items from the Jersey integration dependency. That’s because our project explicitly asks to use a specific Spring version, and I don’t want the versions that Jersey imports.

Next, you have to update your web.xml file to use the Jersey servlet for Spring support. In this web.xml file, I don’t even use a DispatcherServlet at all. Instead, any url mapping will be dispatched and handled by the Jersey annotations. Here’s the web.xml:

    <display-name>Spring - Jersey Integration</display-name>


        <servlet-name>Spring Container</servlet-name>

        <servlet-name>Spring Container</servlet-name>

Finally, create your Spring beans in an applicationContext.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns=""

    <bean id="helloworld" class="" scope="session"/>

The SpringJerseyBean is this:



public class SpringJerseyBean {
    private int count = 0;

    public String sayHello() {
        return "Hello world: " + count++;

Now you have everything you need to create a Jersey service using any of Spring’s additional features. In this configuration, Spring instantiates a helloWorld bean in session scope. Of course, you can change that to any of Spring’s other bean scopes too.

The absolute minimum you should know about internationalization


Internationalization is a design and engineering task that prepares your software product to be localized. It doesn’t create a localized product; instead, it puts your product in a state that allows localization. The goal of internationalization should be a single code base that can be used as-is to create multiple localized versions of your product.

This article provides a high-level description of some issues you must resolve during internationalization. This is not  a comprehensive list:

  • Character sets
  • Resource externalization
  • User interface design
  • Data formats
  • Sorting

Character Sets

Your application will most likely manage, manipulate, store, and display information. Much of the information will be user-readable text. One property of text is it’s character set.

If you want a global-ready application, the choice of character set is simple: Unicode. Unicode allows you to manage text in practically any script without losing data due to character conversion problems. Regardless of the default  character set of the underlying host OS, your application should convert text to Unicode for internal manipulation. Additionally, your application should transmit and store text as Unicode. Doing anything else is unnecessarily complicated and completely unnecessary in any modern operating environment.

Unicode has several possible encodings, including UTF-16 and UTF-8. My experience is that developers rarely get to use just one of these. However, they are BOTH Unicode. Their only significant difference is how a specific code point is encoded in code units. Unless you have a well-understood reason for doing otherwise, I suggest you store and transmit text in UTF-8. Your specific programming language may require you to use UTF-16 for text operations. When displaying text to your user, you might use UTF-16 in a desktop application. When rendering HTML views, you can typically use UTF-16 or UTF-8. I suggest you use UTF-8 everywhere possible.

Resource Externalization

A resource is any text label, message, graphic image, video, audio, or other application file that you intend to present to the user. Instead of hard-coding these resources into your application code, you should extract them into external files that can be used at run-time. By extracting user-facing resources into resource files, you make translation and localization easier. Practically every programming environment provides a mechanism for creating external resource files. 

User Interface Design

User interface layout is often affected by the length of text labels, fields and other visable text. When designing layouts, remember that field and label sizes will increase for some language translations. Design your user interface with the largest label and field lengths in mind. Additionally, follow the typical rules for avoiding culturally sensitive images, hand gestures, and body parts. Also, avoid concatenating shorter pieces of text to build up larger sentences. When translated, the concatenated text rarely has correct syntax or meaning.

Some languages are written from right-to-left. If targeting those languages, remember that the entire layout of page components is often arranged from right-to-left. You may need to create a “reversible” layout that can accommodate those languages and cultures.

Data Formats

Numbers and dates have different formats around the world. Digit separators, currency symbols, and date field orders are all part of the many differences that you’ll need to consider. Fortunately, you don’t have to discover the correct formats and standards for every culture. Many programming environments already provide libraries to format numbers, currencies, and dates using the Common Locale Data Repository (CLDR) formats. 

The main point I want to share about formats is this: separate concerns for data formats by storing and manipulating data in a canonical, non-localized form and apply localized formats only in the “view” layer of your application.


Languages have sorting rules. Those rules help you find names or products in long lists. Dictionaries, phone books, and product catalogs use linguistic sorting to help people find information quickly. When presenting long lists to your users, your application should use those sorting rules as well. Learn and use the sorting or collation libraries in your programming language or technology environment.


Internationalization is an effort to create products that can be translated and localized for many languages and cultures. Creating an internationalized product requires that you consider and plan for a variety of common technical issues. A few of those issues are character set choice, user-interface design, data formats and sorting. You rarely have to solve those issues yourself; you can often find and use existing libraries for this purpose.

More Resources