Java and BCP 47 Language Tags

Since Java 7, Java’s Locale object has been updated to take on the role of a language tag as defined in the RFC 5646 and BCP 47 specs. The newer language tag support gives developers the ability to be more precise in identifying language resources. The new Locale, used as a language tag, can identify languages in this general form:

language-script-region-variant-extension-privateuse

Of course, you can continue to think of locale as a lang_region_variant identifier, but Java now uses the RFC 5646 spec to enhance the Locale class to support language, script, broader regions, and even newer extensions if needed. And if the rules for generating this string of identifiers seem intimidating, you can use the Locale.Builder class to build up the tag without worrying about malforming it.
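Here is a minimal sketch of the builder approach. The class name LocaleBuilderSketch and the helper buildTag are my own names for illustration; the Locale.Builder calls are the standard java.util API. The builder checks each subtag as you set it and throws IllformedLocaleException when one does not fit the rules.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

public class LocaleBuilderSketch {
    // Builds a language tag from its parts; the builder validates each subtag.
    static String buildTag(String language, String script, String region) {
        Locale locale = new Locale.Builder()
                .setLanguage(language)
                .setScript(script)
                .setRegion(region)
                .build();
        return locale.toLanguageTag();
    }

    public static void main(String[] args) {
        // Serbian, written in Latin script, as used in Serbia
        System.out.println(buildTag("sr", "Latn", "RS")); // sr-Latn-RS

        try {
            // "Serbia" is neither a 2-letter code nor a 3-digit code
            buildTag("sr", "Latn", "Serbia");
        } catch (IllformedLocaleException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```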

The primary language identifier is almost the same item you’ve always known; it’s an ISO 639 2-letter code or 3-letter code. The spec recommends using the shortest id possible.

The script is new. You can now add a proper script identifier that specifies the writing system used for the language. People can use multiple writing systems to write languages. For example, Japanese speakers/writers can use 3 or more different scripts for Japanese: kanji, hiragana, katakana, and even “romaji” or Latin script. Serbian is another language often written in either Latin or Cyrillic characters.

The region identifier was once limited to 2-letter ISO 3166 codes, but now you can also use the United Nations 3-digit macro geographical region codes in the region portion of a language tag. A macro geographical region identifies a larger region that comprises more than one country. For example, the UN currently defines Eastern Europe to be macro region 151 and includes 10 countries within it.
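To see the script and region subtags in action, here is a small sketch that parses two tags and reads the pieces back. The class and helper names are mine, and pairing Russian with region 151 is only an illustrative choice; any language could be combined with a macro region code.

```java
import java.util.Locale;

public class ScriptAndRegionSketch {
    // Returns the script subtag of a language tag, or "" if none is present.
    static String scriptOf(String tag) {
        return Locale.forLanguageTag(tag).getScript();
    }

    // Returns the region subtag: a 2-letter country code or a 3-digit UN code.
    static String regionOf(String tag) {
        return Locale.forLanguageTag(tag).getCountry();
    }

    public static void main(String[] args) {
        // Serbian written in Cyrillic, as used in Serbia
        System.out.println(scriptOf("sr-Cyrl-RS")); // Cyrl
        System.out.println(regionOf("sr-Cyrl-RS")); // RS

        // Russian as used in the Eastern Europe macro region
        System.out.println(regionOf("ru-151"));     // 151
    }
}
```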

Eastern Europe 151

Finally, you can use variant, extension, and privateuse sub-tags to provide even more context for a language tag. See RFC 5646 for more details on these. I suggest that you also use the Locale.Builder class to assist if you need to use this level of detail.
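As a rough sketch of that level of detail, the following builds a Japanese locale that asks for the Japanese imperial calendar through the Unicode locale extension and adds a private-use subtag. The "myapp" private-use value and the class and method names are made up for this example.

```java
import java.util.Locale;

public class ExtensionSketch {
    // Builds ja-JP with a Unicode calendar keyword and a private-use subtag.
    static Locale japaneseCalendarLocale() {
        return new Locale.Builder()
                .setLanguage("ja")
                .setRegion("JP")
                .setUnicodeLocaleKeyword("ca", "japanese")            // "u" extension
                .setExtension(Locale.PRIVATE_USE_EXTENSION, "myapp") // "x" subtag, hypothetical value
                .build();
    }

    public static void main(String[] args) {
        Locale locale = japaneseCalendarLocale();
        System.out.println(locale.toLanguageTag());                           // ja-JP-u-ca-japanese-x-myapp
        System.out.println(locale.getUnicodeLocaleType("ca"));                // japanese
        System.out.println(locale.getExtension(Locale.PRIVATE_USE_EXTENSION)); // myapp
    }
}
```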

Take a look at the Locale documentation for all the details on using these new features. They definitely give you much more control of how you identify and use language resources in your internationalized applications.

LocalDate in Java 8

The java.time.LocalDate class is new in Java 8. Inspired by the Joda Time library, LocalDate represents a date as it might be used from a wall calendar. It is not a single instant in time like java.util.Date. You might use a LocalDate to represent a birthday, the start of a school year, or an anniversary. LocalDate text representations are familiar. They look like “2014-10-09” or “October 9, 2014” or other similar user-friendly text.

How do I create a LocalDate?

Create a LocalDate with any of its several static creation methods. I won’t cover all the ways here but will show you three methods, one of which requires close attention.

Give it to me now!

Want a LocalDate now? Here’s how you do it (assume I’m in Los Angeles CA):

LocalDate todayInLA = LocalDate.now();

LocalDate will provide a text string in the ISO 8601 yyyy-MM-dd format if you call its toString method:

String formattedDate = todayInLA.toString();

This will create text like this:


You should know this fact about now(). It uses your computer’s default time zone to retrieve now’s date. So, a computer in Los Angeles (USA) and a computer in Dublin (Ireland) could execute this same method at the same time and produce two different calendar dates. It’s only reasonable they should. After all, they would be in different time zones.

If you want to be specific about the time zone used when creating the local date, use the over-loaded method with a time zone:

ZoneId zoneDublin = ZoneId.of("Europe/Dublin");
LocalDate todayInDublin = LocalDate.now(zoneDublin);

Executed at exactly the same time as the previous now method, this time zone-aware method could return the following:
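Because the real output depends on when you run the code, here is a deterministic sketch of the same idea. It pins the clock to one fixed instant (a value I picked for illustration) and asks for the local date in both zones. The class and helper names are mine; LocalDate.now(Clock) and Clock.fixed are part of java.time.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

public class SameInstantTwoDates {
    // Returns the calendar date that the given instant falls on in the named zone.
    static LocalDate dateAt(Instant instant, String zoneName) {
        ZoneId zone = ZoneId.of(zoneName);
        return LocalDate.now(Clock.fixed(instant, zone));
    }

    public static void main(String[] args) {
        // 2 a.m. UTC on Halloween 2014; still the evening of October 30 in Los Angeles
        Instant instant = Instant.parse("2014-10-31T02:00:00Z");

        System.out.println(dateAt(instant, "America/Los_Angeles")); // 2014-10-30
        System.out.println(dateAt(instant, "Europe/Dublin"));       // 2014-10-31
    }
}
```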


I want a specific date

You can be more specific about your date too. You can ask for a date for a specific year, month, and day. This creates a LocalDate that is the same regardless of time zone. For example, let’s create a local date for Halloween (celebrated on October 31 for those areas that celebrate the holiday):

LocalDate halloween = LocalDate.of(2014, 10, 31);

The toString method produces this:

2014-10-31

Can I format the LocalDate differently?

Of course you can! You’ll need the format method for that. The format method uses a DateTimeFormatter that understands locale-sensitive preferences for printed dates.
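Here is a short sketch of the format method. It shows a locale-sensitive style, whose exact text comes from the JDK's locale data, and an explicit pattern, which is predictable. The class name and the formatWithPattern helper are my own; the formatter calls are from java.time.format.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class LocalDateFormatSketch {
    // Formats a date with an explicit pattern, using the locale for month names.
    static String formatWithPattern(LocalDate date, String pattern, Locale locale) {
        return date.format(DateTimeFormatter.ofPattern(pattern, locale));
    }

    public static void main(String[] args) {
        LocalDate halloween = LocalDate.of(2014, 10, 31);

        // Locale-sensitive style; the JDK's locale data decides the exact layout
        DateTimeFormatter longUs =
                DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).withLocale(Locale.US);
        System.out.println(halloween.format(longUs));

        // Explicit patterns; only the month name changes with the locale
        System.out.println(formatWithPattern(halloween, "d MMMM uuuu", Locale.UK));     // 31 October 2014
        System.out.println(formatWithPattern(halloween, "d MMMM uuuu", Locale.FRANCE)); // 31 octobre 2014
    }
}
```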

You can learn about the format method and more from the LocalDate Javadocs.

Using Jersey (JAX-RS) with Spring

Spring does a great job for what it does — dependency injection, transaction management, database query simplification, and plenty of other things. Jersey, the reference JAX-RS implementation, is also excellent for what it does — it simplifies the creation of RESTful web services by providing a framework for defining resources, accessing them, and creating their service representations. Wanting to use them both in the same project is only natural! This article shows you how to use both Spring and Jersey. It uses Spring to manage bean creation and scope, and it uses Jersey for resource creation and path mapping.

First, set up your dependencies to include both Spring and Jersey. Make sure you have the following important dependencies:
1. The Spring framework
2. The Jersey-Spring integration project that includes a custom servlet that tells Jersey to let Spring handle all the bean creation and injection
3. The Jersey JAX-RS framework itself

Assuming you use Maven, your dependencies look something like this:





You’ll notice that I excluded a lot of Spring items from the Jersey integration dependency. That’s because our project explicitly asks to use a specific Spring version, and I don’t want the versions that Jersey imports.

Next, you have to update your web.xml file to use the Jersey servlet for Spring support. In this web.xml file, I don’t even use a DispatcherServlet at all. Instead, any url mapping will be dispatched and handled by the Jersey annotations. Here’s the web.xml:

    <display-name>Spring - Jersey Integration</display-name>


        <servlet-name>Spring Container</servlet-name>

        <servlet-name>Spring Container</servlet-name>

Finally, create your Spring beans in an applicationContext.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns=""

    <bean id="helloworld" class="" scope="session"/>

The SpringJerseyBean is this:



public class SpringJerseyBean {
    private int count = 0;

    public String sayHello() {
        return "Hello world: " + count++;
    }
}

Now you have everything you need to create a Jersey service using any of Spring’s additional features. In this configuration, Spring instantiates a helloworld bean in session scope. Of course, you can change that to any of Spring’s other bean scopes too.

Enumerating Android Calendars

Android APIs allow you to query information about calendars in your system. Your application can perform typical create, read, update, and delete (CRUD) operations on calendars using a combination of several classes.

To retrieve calendar data, you’ll use the following classes:

  • Context
  • ContentResolver
  • Cursor

Android security requires that you announce your application’s intentions for calendar access. You indicate this in the application’s manifest file. The following manifest entry tells the Android platform that your application will read calendar information:

<uses-permission android:name="android.permission.READ_CALENDAR"/>

Make sure that the <uses-permission> is immediately outside the <application> tag. If you do not put this permission indicator in your manifest file, your application will throw security exceptions. More importantly, it won’t be able to access calendar information.

Why do you need three classes (Context, ContentResolver, and Cursor) to retrieve calendar information? First, a cursor is used to iterate through calendar information. Second, the Cursor is provided by a ContentResolver. Finally, you need a Context to retrieve a content resolver.

Within an Activity class, which represents a user-interface view, you can get a content resolver easily with the getContentResolver method. An Activity is a subclass of Context. That’s simple enough. However, if you want to separate concerns, you may want to create a calendar service to isolate calendar details from the rest of your application. As a separate, non-Activity, class, your CalendarService (implementation left to the reader) may not have access to a context. So you may need to provide a resolver or context from your Activity when instantiating a CalendarService instance. 

Here’s how you retrieve the content resolver:

import android.content.ContentResolver;

// if you are calling from within an Activity
ContentResolver resolver = getContentResolver();

// if you are calling from elsewhere with access to a Context
ContentResolver resolver = context.getContentResolver();

Once you have the resolver, you can then query it for the exact data items needed. Calendars have a lot of information including name, time zone, and colors. Tell the resolver exactly what you want by declaring a projection. A projection is simply a String array that indicates the fields that you want to extract from a calendar row in Android’s databases.

The following code shows how to perform the query:

import android.database.Cursor;
import android.provider.CalendarContract;

// the second column is an assumption; the original listing is truncated here
String[] projection = {CalendarContract.Calendars._ID,
    CalendarContract.Calendars.CALENDAR_DISPLAY_NAME};
String selection = String.format("%s = 1", CalendarContract.Calendars.VISIBLE);
Cursor c = resolver.query(CalendarContract.Calendars.CONTENT_URI,
    projection, selection, null, null);
while (c.moveToNext()) {
    // the cursor, c, contains all the projection data items
    // access the cursor’s contents by column index as declared in
    // your projection
    long id = c.getLong(0);
    String name = c.getString(1);
    ...
}
c.close();

This particular example simply iterates over the calendar meta-data, not actual events. You’ll need an additional query for that.

Comparison of the Instant and Date Classes


Java 8 has a new java.time package, and one of its new classes is Instant. The best counterpart to this in past platforms is the java.util.Date class.

There are a couple of notable differences between Date and Instant:

  • Date has very few useful methods, and Instant provides many.
  • Instant provides finer time granularity and a longer timeline.

Most of Date’s methods have been deprecated. Date manipulation and formatting have been delegated to the Calendar and DateFormat classes. In comparison, the Instant class allows you to perform some very basic functionality directly. You can add seconds and milliseconds, for example. You can parse and generate ISO 8601 date strings with Instant as well. ISO 8601 dates have a consistent form across all locales and look like this: 2014-08-12T14:51:53Z. Most of the Instant methods are purely for convenience. You can do similar things with Date using the Calendar and DateFormat classes.
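A quick sketch of those convenience methods follows. The class name and the shift helper are mine; parse, plusSeconds, plusMillis, and toString are Instant's own methods.

```java
import java.time.Instant;

public class InstantBasics {
    // Parses an ISO 8601 instant and moves it forward by seconds and milliseconds.
    static Instant shift(String isoText, long seconds, long millis) {
        return Instant.parse(isoText).plusSeconds(seconds).plusMillis(millis);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2014-08-12T14:51:53Z");

        // toString always produces ISO 8601 text in UTC
        System.out.println(start.plusSeconds(7));                   // 2014-08-12T14:52:00Z
        System.out.println(shift("2014-08-12T14:51:53Z", 7, 500));  // 2014-08-12T14:52:00.500Z
    }
}
```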

Both Date and Instant have the same epoch (1970-01-01T00:00:00Z), but Instant can represent a much longer timeline. Date’s internal structure uses a long to represent milliseconds from the epoch. Instant, however, uses a long to represent seconds from epoch AND an int to represent nanoseconds of that second. That certainly means you don’t have to worry about date rollover problems in the near future.
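The difference in granularity is easy to see in a small sketch. The class and helper names are mine. Converting an Instant to a Date keeps only whole milliseconds, so the extra nanoseconds are dropped.

```java
import java.time.Instant;
import java.util.Date;

public class EpochComparison {
    // Converts an Instant to a Date and returns the Date's millisecond count.
    static long toDateMillis(Instant instant) {
        return Date.from(instant).getTime();
    }

    public static void main(String[] args) {
        // Both classes count from the same epoch
        System.out.println(new Date(0L).toInstant().equals(Instant.EPOCH)); // true

        // One second plus 123,456,789 nanoseconds after the epoch
        Instant fine = Instant.ofEpochSecond(1L, 123_456_789L);
        System.out.println(fine.getEpochSecond()); // 1
        System.out.println(fine.getNano());        // 123456789

        // Date can only hold the first three fractional digits
        System.out.println(toDateMillis(fine));    // 1123
    }
}
```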

The differences between Date and Instant are relatively minor, but these classes really are the starting point of a more thorough discussion of the java.time package. Expect more details in the near future.

The New Date and Time API in Java 8

It’s no secret that developers have been unsatisfied with the existing Date and Calendar classes of previous Java versions. I’ve heard complaints that the Calendar API is difficult to understand, lacks needed features, and even causes unexpected concurrency bugs. As a result, developers sometimes migrated to the popular Joda Time library, which apparently satisfied their needs.

I’ve always suspected that the standard Date and Calendar API would be updated (or replaced), but I can’t help being a little surprised to see the new java.time package in Java 8. I’m not so surprised that it exists but that it is so comprehensive…and that it seems so familiar. If you’re one of those who moved to Joda Time, you’ll feel a sense of déjà vu. The new Java 8 library looks a lot like Joda Time. After a little snooping, now I understand why. The new Date and Time API was created by Stephen Colebourne, the author of Joda Time. Of course, he worked with Oracle and others within the umbrella of the JSR 310 proposal, but this is Joda Time in many ways.

On my first browse of the new API, I noted a couple of simple thoughts: the API is feature-rich and complete, and it’s still complex.

Time, dates, and date manipulations are not simple, and no API is going to change that. However, I think that this new API does a great job of making things less complicated than before. If you haven’t looked at it yet, please check it out. Let me know what you think. I’ll do the same and share how to use the APIs in upcoming blogs.


Standard Charsets in Java 7

Once in a while I poke my nose through the release notes of new Java releases. It’s not a particularly rewarding activity, but this time I did find something interesting. Oddly enough, it was interesting for what it did NOT say. I was surprised, so I thought you might want to know about a new class that is now available and quietly overlooked in any release notes.

Character sets have their own class representation in Java: Charset. You can use the Charset class to identify a character set for encoding or decoding. To create a Charset object, you use a factory method: Charset.forName(String charset). The uncomfortable trick to using this method is that you must be prepared to catch an exception if the JRE doesn’t actually supply the requested character set. Bummer.
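Here is a sketch of that defensive style. The class name and the lookup helper are mine. Note that both exceptions are unchecked, so the compiler will not remind you to handle them.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetLookup {
    // Returns the named charset, or null when the name is illegal
    // or the running JRE does not supply that charset.
    static Charset lookup(String name) {
        try {
            return Charset.forName(name);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("UTF-8"));           // UTF-8
        System.out.println(lookup("no-such-charset")); // null
    }
}
```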

I’ve always wondered why the JDK allows a random string as the parameter. I suppose it was for convenience…to allow the JDK to be updated over time with new charset support without having to change any API or enumeration. That’s understandable. But not really knowing what minimal set of character sets is supported in a particular JDK is somewhat…unnerving…especially to an engineer just trying to get his/her work finished.

The JDK documentation was always clear on what character sets you could absolutely depend on to be present. That was helpful and much needed. At least an observant developer could depend on that. However, the JDK now provides a more robust and useful way to identify which charsets are minimally supported. Java 7 provides a new class: java.nio.charset.StandardCharsets.

StandardCharsets does one thing. It lets you know what set of character sets is minimally supported in your JDK. The set is probably unchanged from Java 6 or Java 5 or even earlier. However, now you don’t have to read the documentation as carefully; the standard set is given to you. The StandardCharsets class explicitly enumerates the normal set for you.
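A minimal sketch of using the constants, with no lookup by name and no exception handling. The class name and the utf8Bytes helper are mine; the six constants are the ones the class declares.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StandardCharsetsSketch {
    // Encodes text as UTF-8 without a name lookup.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The charsets every Java platform implementation must support
        Charset[] standard = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16
        };
        for (Charset cs : standard) {
            System.out.println(cs.name());
        }

        // Two CJK characters take three bytes each in UTF-8
        System.out.println(utf8Bytes("\u5BB6\u65CF").length); // 6
    }
}
```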

Rocket science? No. But this welcome addition to the JDK was a long time in coming, and I’m glad to have found it.

Encoding Unicode Characters When UTF-8 Is Not An Option

The other day I suggested that you use UTF-8 to encode your Java source code files. I still think that’s a best practice. If you can do that, you owe it to yourself to follow that advice.

But what if you can’t store text as UTF-8? Perhaps your repository won’t allow it. Or maybe you simply can’t standardize on UTF-8 across the groups. What then? In that case, you should use ASCII to encode your text files. It’s not an optimal solution. However, I can help you get more interesting Unicode characters into your Java source files despite the actual file encoding limitation.

The trick is to use the native2ascii tool to convert your non-ASCII Unicode characters to a \uXXXX encoding. After editing and creating a file containing UTF-8 text like this:

String interestingText = "家族";

You would instead run the native2ascii tool on the file to produce an ASCII file that encodes the non-ASCII characters in \u-encoded notation like this:

String interestingText="\u5BB6\u65CF";

In your compiled code, the result is the same. Given the correct font, the characters will display properly. U+5BB6 and U+65CF are the code points for “家族”. Using this type of \u-encoding, we’ve solved the problem of getting the non-ASCII characters into your text file and repository. Simply save the converted, \u-encoded file instead of the original, non-ASCII file.
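If you want to convince yourself that the compiled result really is the same, here is a small sketch. The toAsciiEscapes helper is my own rough imitation of the conversion for characters in the Basic Multilingual Plane; it is not the tool's actual implementation.

```java
public class UnicodeEscapeDemo {
    // Hypothetical helper that imitates the conversion: ASCII characters pass
    // through, and every other char becomes a backslash-u escape with four hex digits.
    static String toAsciiEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (ch < 128) {
                sb.append(ch);
            } else {
                sb.append(String.format("\\u%04X", (int) ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The escaped literal, exactly as it would appear in the ASCII source file
        String escaped = "\u5BB6\u65CF";

        // The same two characters built directly from their code points
        String fromCodePoints = new String(new int[] {0x5BB6, 0x65CF}, 0, 2);

        System.out.println(escaped.equals(fromCodePoints)); // true
        System.out.println(escaped.length());               // 2
        System.out.println(toAsciiEscapes("x=" + escaped));
    }
}
```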

The native2ascii tool is part of your Java Development Kit (JDK). You will use it like this:

native2ascii -encoding UTF-8 <inputfile> <outputfile>

There you have it…an option for getting Unicode characters into your Java files without actually using UTF-8 encoding.


Best practice: Use UTF-8 as your source code encoding


Software engineering teams have become more distributed in the last few years. It’s not uncommon to have programmers in multiple countries, maybe a team in Belarus and others in Japan and in the U.S. Each of these teams most likely speaks different languages, and most likely their host systems use different character encodings by default. That means that everyone’s source code editor creates files in different encodings too. You can imagine how mixed up and munged a shared source code repository might become when teams save, edit and re-save source files in multiple charset encodings. It happens, and it has happened to me when working with remote teams.

Here’s an example. You create a test file containing ASCII text. Overnight, your Japanese colleagues edit the file, adding a new test with the following line in it:

String example = "Fight 文字化け!";

They save the file and submit it to the repository using Shift-JIS or some other common legacy encoding. You pick up the file the next day, add a couple lines to it, save it, and BAM! Data loss. Your editor creates garbage characters because it attempts to save the file in the ISO-8859-1 encoding. Instead of the correct Japanese text from above, your file now contains the text “Fight ?????” Not cool, not fun. And you’ve most likely broken the test as well.
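You can reproduce this kind of loss in a few lines. This sketch simulates a save and reload by encoding the text to bytes and decoding it again. The class and helper names are mine, and the Japanese characters are written as escapes so the example itself stays ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossySaveSketch {
    // Simulates saving text in a charset and reading it back in the same charset.
    static String saveAndReload(String text, Charset charset) {
        byte[] saved = text.getBytes(charset); // unmappable characters become '?'
        return new String(saved, charset);
    }

    public static void main(String[] args) {
        // "Fight " followed by the four Japanese characters and "!"
        String original = "Fight \u6587\u5B57\u5316\u3051!";

        // ISO-8859-1 has no Japanese characters, so they are lost for good
        System.out.println(saveAndReload(original, StandardCharsets.ISO_8859_1)); // Fight ????!

        // UTF-8 can represent every character, so the round trip is lossless
        System.out.println(saveAndReload(original, StandardCharsets.UTF_8).equals(original)); // true
    }
}
```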

How can you avoid these charset mismatches? The answer is to use a common charset across all your teams. The answer is to use Unicode, and more specifically, to use UTF-8. The reason is simple. I won’t try hard to defend this. It just seems obvious. Unicode is a superset of all other commonly used character sets. UTF-8, a specific encoding of Unicode, is backward compatible with ASCII character encoding, and all programming language source keywords and syntax (that I know) are composed of ASCII text. UTF-8 as a common charset encoding will allow all of your teams to share files, use characters that make sense for their tests or other source files, and never lose data again because of charset encoding mismatches.
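The backward compatibility claim is easy to check. In this sketch (the class and helper names are mine), pure ASCII text produces exactly the same bytes in both encodings, while text with non-ASCII characters does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiCompatibilitySketch {
    // True when the text encodes to identical bytes in US-ASCII and UTF-8.
    static boolean sameBytesInAsciiAndUtf8(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Keywords and syntax are plain ASCII, so existing files are already valid UTF-8
        System.out.println(sameBytesInAsciiAndUtf8("public static void main")); // true

        // Non-ASCII characters need multi-byte UTF-8 sequences that ASCII cannot express
        System.out.println(sameBytesInAsciiAndUtf8("Fight \u6587\u5B57\u5316\u3051!")); // false
    }
}
```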

If your editor has a setting for file encodings, use it and choose UTF-8. Train your team to use it too. Set up your Ant scripts and other build tools to use the UTF-8 encoding for compilations. You might have to explicitly tell your Java compiler that source files are in UTF-8, but the change is worth making. By the way, the javac command line argument you need is simply “-encoding UTF-8”.

Recently I was using NetBeans 7 in a project and discovered that it defaults to UTF-8. Nice! I was pleasantly surprised.

Regardless of your editor, look into this. Find out how to set your file encodings to UTF-8. You’ll definitely benefit from this in a distributed team environment in which people use different encodings. Standardize on this and make it part of your team’s best practices.

“Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the United States and other countries.”