Author Archives: joconner

LocalDate in Java 8

The java.time.LocalDate class is new in Java 8. Inspired by the Joda Time library, LocalDate represents a date as you might read it from a wall calendar. It is not a single instant in time like java.util.Date. You might use a LocalDate to represent a birthday, the start of a school year, or an anniversary. LocalDate text representations are familiar. They look like “2014-10-09” or “October 9, 2014” or other similar user-friendly text.

How do I create a LocalDate?

Create a LocalDate with any of its several static creation methods. I won’t cover all the ways here but will show you three methods, one of which requires close attention.

Give it to me now!

Want a LocalDate now? Here’s how you do it (assume I’m in Los Angeles CA):

LocalDate todayInLA = LocalDate.now();

LocalDate will provide a text string in the ISO 8601 yyyy-MM-dd format if you call its toString method:

String formattedDate = todayInLA.toString();

This will create text like this:

2014-10-09

You should know this fact about now(). It uses your computer’s default time zone to retrieve now’s date. So, a computer in Los Angeles (USA) and a computer in Dublin (Ireland) could execute this same method at the same time and produce two different calendar dates. It’s only reasonable they should. After all, they would be in different time zones.

If you want to be specific about the time zone used when creating the local date, use the overloaded method that takes a time zone:

ZoneId zoneDublin = ZoneId.of("Europe/Dublin");
LocalDate todayInDublin = LocalDate.now(zoneDublin);

Executed at exactly the same time as the previous now method, this time zone-aware method could return the following:

2014-10-10
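
To see the effect without waiting for the right moment, you can pin the clock. The following is a minimal sketch, not part of the original example: the class name TodayByZone and the sample instant are my own assumptions, and it uses the LocalDate.now(Clock) overload with a fixed clock so both zones see exactly the same instant.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

class TodayByZone {
    // Returns the calendar date on which the given instant falls in the given zone.
    static LocalDate dateAt(Instant instant, String zoneId) {
        Clock fixed = Clock.fixed(instant, ZoneId.of(zoneId));
        return LocalDate.now(fixed);
    }

    public static void main(String[] args) {
        // Hypothetical sample instant: late evening UTC on October 9, 2014
        Instant moment = Instant.parse("2014-10-09T23:30:00Z");
        System.out.println(dateAt(moment, "America/Los_Angeles")); // 2014-10-09
        System.out.println(dateAt(moment, "Europe/Dublin"));       // 2014-10-10
    }
}
```

A fixed clock is also handy in unit tests, where calling now() with the system clock would make results depend on when and where the test runs.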

I want a specific date

You can be more specific about your date too. You can ask for a date for a specific year, month, and day. This creates a LocalDate that is the same regardless of time zone. For example, let’s create a local date for Halloween (celebrated on October 31 for those areas that celebrate the holiday):

LocalDate halloween = LocalDate.of(2014, 10, 31);

The toString method produces this:

2014-10-31

Can I format the LocalDate differently?

Of course you can! You’ll need the format method for that. The format method uses a DateTimeFormatter that understands locale-sensitive preferences for printed dates.
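
Here is a short sketch of what that might look like. The class and method names are hypothetical, chosen only for illustration. It shows two options: an explicit pattern, and a locale-sensitive style whose exact output depends on the locale data shipped with your Java runtime.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

class FormatLocalDate {
    // Formats with an explicit pattern; month names come from the given locale.
    static String withPattern(LocalDate date, Locale locale) {
        return date.format(DateTimeFormatter.ofPattern("MMMM d, yyyy", locale));
    }

    // Lets the locale decide field order and punctuation.
    static String localized(LocalDate date, Locale locale) {
        DateTimeFormatter formatter =
            DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).withLocale(locale);
        return date.format(formatter);
    }

    public static void main(String[] args) {
        LocalDate halloween = LocalDate.of(2014, 10, 31);
        System.out.println(withPattern(halloween, Locale.US)); // October 31, 2014
        // Output here varies with the runtime's locale data, so none is shown.
        System.out.println(localized(halloween, Locale.FRANCE));
    }
}
```

When the text is meant for users, the localized style is usually the safer choice because you don't have to guess each locale's preferred field order.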

You can learn about the format method and more from the LocalDate Javadocs.

JavaScript file encodings

All text files have a character encoding regardless of whether you explicitly declare it. JavaScript files are no exception. This article describes both how and why you should declare an encoding when importing script files into an HTML document.

JavaScript’s Character Model

A JavaScript engine’s internal character set is Unicode. The ECMAScript 5.1 standard says that all strings are encoded in 16-bit code units described by UTF-16. Once inside the JavaScript interpreter, all characters and strings are stored and accessed as UTF-16 code units. However, before being processed by the JavaScript engine, a JavaScript file’s charset can be anything, not necessarily a Unicode encoding.

Character Encoding Conversion

When you import a JavaScript file into an HTML document, by default the browser uses the document’s charset to convert the JavaScript file into the interpreter’s encoding (UTF-16). You can also use an explicit charset when importing a file. When an HTML file charset and a JavaScript file charset are different, you will most likely see conversion mistakes. The results are mangled, incorrect characters.

Conversion Problems

I created a simple demonstration of the potential problem. The demo has 5 files:

  • jsencoding.html — base HTML file, UTF-8 charset
  • stringmgr.js — a basic string resource mgr, UTF-8 charset
  • resource.js — an English JavaScript resource file containing the word family, UTF-8 charset
  • resource_es.js — a Spanish file containing the word girl, ISO-8859-1 charset
  • resource_ja.js — a Japanese file containing the word baseball, SHIFT-JIS charset

In the base HTML file, I’ve imported 3 JavaScript resource files using the following import statements:

    <script src="resource.js"></script>
    <script src="resource_es.js"></script>
    <script src="resource_ja.js"></script>

Mojibake

The image shows how the text resources have been converted incorrectly. The browser imported the Spanish JavaScript file using the HTML file’s UTF-8 encoding even though the file is stored using ISO-8859-1. The Japanese resource script is stored as SHIFT-JIS and doesn’t convert correctly either.
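
The same kind of mismatch can be reproduced outside the browser. This is only a sketch in Java, not the demo code itself: the class name is hypothetical, and I'm assuming the Spanish word is niña, written with a Unicode escape so the source file's own encoding doesn't matter. It stores the text as ISO-8859-1 bytes and then decodes those bytes with the wrong and the right charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatch {
    // Decodes raw bytes using the charset the reader *believes* the file uses.
    static String decode(byte[] bytes, Charset assumed) {
        return new String(bytes, assumed);
    }

    public static void main(String[] args) {
        String original = "ni\u00F1a"; // the word with n-tilde
        byte[] stored = original.getBytes(StandardCharsets.ISO_8859_1);

        // Wrong guess: the 0xF1 byte is not valid UTF-8 on its own,
        // so the decoder substitutes a replacement character.
        String mangled = decode(stored, StandardCharsets.UTF_8);
        // Right guess: the text survives intact.
        String intact = decode(stored, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(mangled)); // false
        System.out.println(original.equals(intact));  // true
    }
}
```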

After updating the import statements, we see a better result:

    <script src="resource.js" charset="UTF-8"></script>
    <script src="resource_es.js" charset="ISO-8859-1"></script>
    <script src="resource_ja.js" charset="SHIFT-JIS"></script>

Correct conversions

Recommendations

To avoid charset conversion problems when importing JavaScript files and JavaScript resources, you should include the file charset. An even better practice is to use UTF-8 as your charset in all files, which minimizes these conversion problems significantly.

You can check out the code for this article on my GitHub account here:
I18n Examples

Best practices for character sets

You may not understand every language, but that doesn’t mean your applications can’t. Regardless of your customer’s language choice, your application should be able to process, transfer, and store their data. Even if you don’t provide a localized user interface, your application should allow your customer to enter text in their own language and in their own script. For example, my word processor is localized into English, but it allows me to enter text in a variety of scripts and languages.

How is that possible? The most basic requirement for this ability is to use a single character set internally. If you want to handle all scripts, your only choice is the Unicode character set.

Rule #1:
Use Unicode as your character set.

Unicode has several possible encodings, including UTF-32, UTF-16, UTF-16LE, UTF-16BE, and UTF-8. These encodings transform Unicode code points into code units. Code points are the values between 0 and 0x10FFFF, the range of integers that are allocated for character definitions. An encoding transforms code points into code units, which are used to serialize a character for storage or transmission. A single code point becomes 1 or more code units during encoding.
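
To make the code point versus code unit distinction concrete, here is a small Java sketch. The class and method names are hypothetical. It counts how many code units a string needs in UTF-8 and UTF-16; the number of code points is also the number of UTF-32 code units.

```java
import java.nio.charset.StandardCharsets;

class CodeUnits {
    // Number of Unicode code points in the string (also its UTF-32 code unit count).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of 8-bit code units (bytes) when encoded as UTF-8.
    static int utf8Units(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of 16-bit code units; Java strings are already UTF-16 internally.
    static int utf16Units(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",                                    // U+0041
            "\u00E9",                               // U+00E9, e with acute accent
            "\u65E5",                               // U+65E5, a CJK ideograph
            new String(Character.toChars(0x1F600))  // U+1F600, outside the BMP
        };
        for (String s : samples) {
            System.out.printf("U+%04X: %d code point, %d UTF-8 units, %d UTF-16 units%n",
                s.codePointAt(0), codePoints(s), utf8Units(s), utf16Units(s));
        }
    }
}
```

Each sample is a single code point, yet it takes 1, 2, 3, and 4 UTF-8 code units respectively, and the last one takes two UTF-16 code units.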

Although all the encodings are well-defined, UTF-8 is the easiest to use primarily because of its code unit size. UTF-8 has 8-bit (byte) code units that are immune to the common memory design issues involving byte ordering. I recommend that you use UTF-8 everywhere possible in your system. You’ll avoid mistakes in determining little endian and big endian layouts with the other encodings.

Rule #2:
Use UTF-8 everywhere possible

OK, so we’ve got the rules for character set and encoding choice. Now you have to implement those rules.

Complex systems typically have many points of failure for textual data loss. Those points are usually hand-off points across systems:
1. File export and import.
2. Outbound and inbound HTTP request data and parameters
3. Database connections and schemas

Each of these deserves its own discussion. Moreover, each has specific implementation details for different products. Unfortunately I can’t cover any of them adequately in this particular post. However, I’ll try to touch on these subjects in a future update. If you have questions about any of them, let me know. I’ll use your suggestions to help me decide which to address first.

For now you have my own best practices for character set choice when creating any system. Good luck!

//John O.

Using Jersey (JAX-RS) with Spring

Spring does a great job for what it does — dependency injection, transaction management, database query simplification, and plenty of other things. Jersey, the reference JAX-RS implementation, is also excellent for what it does — it simplifies the creation of RESTful web services by providing a framework for defining resources, accessing them, and creating their service representations. Wanting to use them both in the same project is only natural! This article shows you how to use both Spring and Jersey. It uses Spring to manage bean creation and scope, and it uses Jersey for resource creation and path mapping.

First, set up your dependencies to include both Spring and Jersey. Make sure you have the following important dependencies:
1. The Spring framework
2. The Jersey-Spring integration project that includes a custom servlet that tells Jersey to let Spring handle all the bean creation and injection
3. The Jersey JAX-RS framework itself

Assuming you use Maven, your dependencies look something like this:

    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-webmvc</artifactId>
        <version>4.0.3.RELEASE</version>
    </dependency>

    <dependency>
        <groupId>com.sun.jersey.contribs</groupId>
        <artifactId>jersey-spring</artifactId>
        <version>1.18</version>
        <exclusions>
            <exclusion>
                <groupId>org.springframework</groupId>
                <artifactId>spring-web</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework</groupId>
                <artifactId>spring-aop</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework</groupId>
                <artifactId>spring-asm</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework</groupId>
                <artifactId>spring-beans</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework</groupId>
                <artifactId>spring-context</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.springframework</groupId>
                <artifactId>spring-core</artifactId>
            </exclusion>
        </exclusions>

    </dependency>
    <dependency>
        <groupId>com.sun.jersey</groupId>
        <artifactId>jersey-servlet</artifactId>
        <version>1.18</version>
    </dependency>

    <dependency>
        <groupId>javax.servlet</groupId>
        <artifactId>servlet-api</artifactId>
        <version>2.5</version>
        <scope>provided</scope>
    </dependency>

You’ll notice that I excluded a lot of Spring items from the Jersey integration dependency. That’s because our project explicitly asks to use a specific Spring version, and I don’t want the versions that Jersey imports.

Next, you have to update your web.xml file to use the Jersey servlet for Spring support. In this web.xml file, I don’t use a DispatcherServlet at all. Instead, any URL mapping will be dispatched and handled by the Jersey annotations. Here’s the web.xml:

<web-app>
    <display-name>Spring - Jersey Integration</display-name>

    <listener>
        <listener-class>org.springframework.web.context.ContextLoaderListener</listener-class>
    </listener>
    <listener>
        <listener-class>org.springframework.web.context.request.RequestContextListener</listener-class>
    </listener>

    <servlet>
        <servlet-name>Spring Container</servlet-name>
        <servlet-class>com.sun.jersey.spi.spring.container.servlet.SpringServlet</servlet-class>
        <load-on-startup>1</load-on-startup>
    </servlet>

    <servlet-mapping>
        <servlet-name>Spring Container</servlet-name>
        <url-pattern>/*</url-pattern>
    </servlet-mapping>
</web-app>

Finally, create your Spring beans in an applicationContext.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="helloworld" class="com.joconner.ws.SpringJerseyBean" scope="session"/>
</beans>

The SpringJerseyBean is this:

package com.joconner.ws;

import javax.ws.rs.GET;
import javax.ws.rs.Path;

@Path("/hello")
public class SpringJerseyBean {
    private int count = 0;

    @GET
    public String sayHello() {
        return "Hello world: " + count++;
    }
}

Now you have everything you need to create a Jersey service using any of Spring’s additional features. In this configuration, Spring instantiates a helloworld bean in session scope. Of course, you can change that to any of Spring’s other bean scopes too.

The absolute minimum you should know about internationalization


Internationalization is a design and engineering task that prepares your software product to be localized. It doesn’t create a localized product; instead, it puts your product in a state that allows localization. The goal of internationalization should be a single code base that can be used as-is to create multiple localized versions of your product.

This article provides a high-level description of some issues you must resolve during internationalization. This is not a comprehensive list:

  • Character sets
  • Resource externalization
  • User interface design
  • Data formats
  • Sorting

Character Sets

Your application will most likely manage, manipulate, store, and display information. Much of the information will be user-readable text. One property of text is its character set.

If you want a global-ready application, the choice of character set is simple: Unicode. Unicode allows you to manage text in practically any script without losing data due to character conversion problems. Regardless of the default character set of the underlying host OS, your application should convert text to Unicode for internal manipulation. Additionally, your application should transmit and store text as Unicode. Doing anything else is needlessly complicated in any modern operating environment.

Unicode has several possible encodings, including UTF-16 and UTF-8. My experience is that developers rarely get to use just one of these. However, they are BOTH Unicode. Their only significant difference is how a specific code point is encoded in code units. Unless you have a well-understood reason for doing otherwise, I suggest you store and transmit text in UTF-8. Your specific programming language may require you to use UTF-16 for text operations. When displaying text to your user, you might use UTF-16 in a desktop application. When rendering HTML views, you can typically use UTF-16 or UTF-8. I suggest you use UTF-8 everywhere possible.

Resource Externalization

A resource is any text label, message, graphic image, video, audio, or other application file that you intend to present to the user. Instead of hard-coding these resources into your application code, you should extract them into external files that can be used at run-time. By extracting user-facing resources into resource files, you make translation and localization easier. Practically every programming environment provides a mechanism for creating external resource files. 
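
In Java, one such mechanism is the ResourceBundle family. The following is a minimal sketch rather than a recommended project layout. The class name, the key name, and the message text are all hypothetical, and the bundle contents are read from in-memory strings only to keep the example self-contained; normally they would live in .properties files, one per language.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class ResourceDemo {
    // Loads a bundle from .properties-style text (a stand-in for an external file).
    static ResourceBundle load(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Looks up the externalized pattern and fills in its placeholder.
    static String welcome(ResourceBundle bundle, Locale locale, String userName) {
        MessageFormat format = new MessageFormat(bundle.getString("welcome"), locale);
        return format.format(new Object[] {userName});
    }

    public static void main(String[] args) {
        ResourceBundle english = load("welcome=Welcome, {0}!");
        ResourceBundle spanish = load("welcome=Hola, {0}!");
        System.out.println(welcome(english, Locale.ENGLISH, "Ana")); // Welcome, Ana!
        System.out.println(welcome(spanish, new Locale("es"), "Ana")); // Hola, Ana!
    }
}
```

The application code never mentions the message text itself, only the key. A translator can change the wording, or move the placeholder, without touching the code.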

User Interface Design

User interface layout is often affected by the length of text labels, fields, and other visible text. When designing layouts, remember that field and label sizes will increase for some language translations. Design your user interface with the largest label and field lengths in mind. Additionally, follow the typical rules for avoiding culturally sensitive images, hand gestures, and body parts. Also, avoid concatenating shorter pieces of text to build up larger sentences. When translated, the concatenated text rarely has correct syntax or meaning.

Some languages are written from right-to-left. If targeting those languages, remember that the entire layout of page components is often arranged from right-to-left. You may need to create a “reversible” layout that can accommodate those languages and cultures.

Data Formats

Numbers and dates have different formats around the world. Digit separators, currency symbols, and date field orders are all part of the many differences that you’ll need to consider. Fortunately, you don’t have to discover the correct formats and standards for every culture. Many programming environments already provide libraries to format numbers, currencies, and dates using the Common Locale Data Repository (CLDR) formats. 

The main point I want to share about formats is this: separate concerns for data formats by storing and manipulating data in a canonical, non-localized form and apply localized formats only in the “view” layer of your application.
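
As a rough illustration of that separation in Java, the sketch below keeps the data in canonical types (a double and a LocalDate) and applies a locale only at the moment of display. The class and method names are hypothetical. The date output depends on the locale data in your runtime, so I haven't shown it.

```java
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

class LocalizedView {
    // Formats a canonical numeric value for display in the given locale.
    static String number(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats a canonical date for display in the given locale.
    static String date(LocalDate value, Locale locale) {
        return value.format(
            DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM).withLocale(locale));
    }

    public static void main(String[] args) {
        double amount = 1234567.5;                 // stored and computed as a plain number
        LocalDate due = LocalDate.of(2014, 10, 31); // stored as a plain date

        System.out.println(number(amount, Locale.US));      // 1,234,567.5
        System.out.println(number(amount, Locale.GERMANY)); // 1.234.567,5
        System.out.println(date(due, Locale.US));
        System.out.println(date(due, Locale.GERMANY));
    }
}
```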

Sorting

Languages have sorting rules. Those rules help you find names or products in long lists. Dictionaries, phone books, and product catalogs use linguistic sorting to help people find information quickly. When presenting long lists to your users, your application should use those sorting rules as well. Learn and use the sorting or collation libraries in your programming language or technology environment.
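
In Java, for instance, the java.text.Collator class provides locale-aware comparison. The sketch below uses a hypothetical class name and word list, and contrasts plain String ordering, which compares raw UTF-16 code unit values, with a French collator.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class LinguisticSort {
    // Sorts by raw code unit values, which is rarely what a reader expects.
    static List<String> naive(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        copy.sort(Comparator.<String>naturalOrder());
        return copy;
    }

    // Sorts using the collation rules of the given locale.
    static List<String> linguistic(List<String> items, Locale locale) {
        List<String> copy = new ArrayList<>(items);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        // "\u00E9clair" is "eclair" spelled with an accented first letter
        List<String> words = Arrays.asList("zebra", "\u00E9clair", "Apple");
        System.out.println(naive(words));                     // Apple, zebra, then the accented word
        System.out.println(linguistic(words, Locale.FRENCH)); // Apple, the accented word, then zebra
    }
}
```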

Conclusion

Internationalization is an effort to create products that can be translated and localized for many languages and cultures. Creating an internationalized product requires that you consider and plan for a variety of common technical issues. A few of those issues are character set choice, user-interface design, data formats and sorting. You rarely have to solve those issues yourself; you can often find and use existing libraries for this purpose.


Internationalization is everyone’s responsibility


In my first jobs out of college, I was part of software internationalization teams that were completely independent from the “core” teams. The core teams created the products for the original target locale (usually the United States), and the internationalization teams created branches of that product and performed all the engineering work to put the product in other markets. The core teams rarely worried about localization or data formatting. They didn’t concern themselves with layouts or character sets. They rarely even allowed the internationalization teams to push the updated products back into their core code base. Weekly updates to the internationalization branch were always a merge-mess.

Those environments were frustrating and tedious. The worst thing is that those environments were common throughout the software industry. That was 25+ years ago.

After a few years out of college, I dreamed that internationalization would one day become everyone’s responsibility. I had hope that core product teams would take on internationalization work on their own and that my job as an internationalization engineer would eventually become obsolete. After more than two decades, the situation is better but not perfect.

After all this time, internationalization engineers are still required. I still find dedicated internationalization teams. The biggest improvement, however, is that core teams welcome code updates back into the core product repositories, and some teams even attempt to follow best practices. In the best cases, the internationalization teams provide globalization tools, best practice guidelines, and educational support. However, in most cases the internationalization teams still come in after all the core feature work is finished and retrofit existing code to be internationally-aware. This is almost always error-prone and expensive.

The ideal development environment is one in which internationalization is everyone’s responsibility. Retrofitting a product is simply not the best approach to add cultural awareness and localizability to products. It was time-consuming, expensive, and error-prone decades ago, and it still is.

Once every product and engineering team takes on the responsibility, internationalization actually becomes easier. Once the tasks become a regular, planned part of sprint deliverables, internationalization simply works better. That is, it’s easier to manage, integrate, and implement.

The truth is simple: when internationalization is everyone’s responsibility, you can create a better product.

Internationalization as a form of technical debt


The term technical debt is often used to label implementation choices that trade long-term goals for limited, short-term solutions. Technical debt has a negative connotation because it means that you have accrued a technical obligation that must be resolved before you can make future progress. Teams take on technical debt for many reasons: short schedules, insufficient knowledge, poor team collaboration, and a host of others. Generally, you want to avoid technical debt because it represents a technical hurdle that you have avoided or have resolved only partly. Technical debt limits the rate at which you can innovate and progress in the future.

I’ve often thought about how many product and software teams approach software internationalization. The typical team will begin development with a single geographical market. The team knows that they want to succeed internationally, but they don’t worry about that concern at first. They have schedules, product features, and short-term needs that demand attention. They sacrifice long-term goals for short-term wins. They ignore best practices in software internationalization because of insufficient knowledge or perhaps even laziness. Over time, internationalization work becomes technical debt that must be paid to make further progress into desired markets. 

The disheartening fact is that unattended technical debt in internationalization will eventually require code refactoring, new implementations, and even updated designs. Interest increases rapidly. At some point, you may not be able to do everything yourself, and you might need external help.

Fortunately, internationalization does not have to become technical debt. Basic internationalization usually does not have huge upfront costs for either schedules or resources. Internationalization can be integrated into every sprint or delivery schedule. The very basic concepts are simple, and a little up-front and regular consideration will pay huge dividends.

So, what can you do now to avoid technical debt in internationalization? I suggest you tackle this in a few steps:

  1. Make everyone responsible for knowing the basic issues and concerns in internationalization.
  2. Resolve to implement best practices for each of the concerns that will affect your product.
  3. Make internationalization a part of your ongoing development and review process.

In an upcoming blog, I’ll provide you with some resources for each of these steps. 

All the best,
John O’Conner

Enumerating Android Calendars

Android APIs allow you to query information about calendars in your system. Your application can perform typical read, write, update, and delete (CRUD) operations on calendars using a combination of several classes.

To retrieve calendar data, you’ll use the following classes:

  • Context
  • ContentResolver
  • Cursor

Android security requires that you announce your application’s intentions for calendar access. You indicate this in the application’s manifest file. The following manifest entry tells the Android platform that your application will read calendar information:

<manifest ...>
    <application ...>
        ...
    </application>
    <uses-permission android:name="android.permission.READ_CALENDAR"/>
</manifest>

Make sure that the <uses-permission> is immediately outside the <application> tag. If you do not put this permission indicator in your manifest file, your application will throw security exceptions. More importantly, it won’t be able to access calendar information.

Why do you need three classes (Context, ContentResolver, and Cursor) to retrieve calendar information? First, a cursor is used to iterate through calendar information. Second, the Cursor is provided by a ContentResolver. Finally, you need a Context to retrieve a content resolver.

Within an Activity class, which represents a user-interface view, you can get a content resolver easily with the getContentResolver method. An Activity is a subclass of Context. That’s simple enough. However, if you want to separate concerns, you may want to create a calendar service to isolate calendar details from the rest of your application. As a separate, non-Activity, class, your CalendarService (implementation left to the reader) may not have access to a context. So you may need to provide a resolver or context from your Activity when instantiating a CalendarService instance. 

Here’s how you retrieve the content resolver:

import android.content.ContentResolver;

// if you are calling from within an Activity
ContentResolver resolver = getContentResolver();

// or, if you are calling from elsewhere with access to a Context
ContentResolver resolver = context.getContentResolver();

Once you have the resolver, you can then query it for the exact data items needed. Calendars have a lot of information including name, time zone, and colors. Tell the resolver exactly what you want by declaring a projection. A projection is simply a String array that indicates the fields that you want to extract from a calendar row in Android’s databases.

The following code shows how to perform the query:

import android.database.Cursor;
import android.provider.CalendarContract;
...
String[] projection = {CalendarContract.Calendars._ID,
    CalendarContract.Calendars.NAME,
    CalendarContract.Calendars.CALENDAR_DISPLAY_NAME,
    CalendarContract.Calendars.CALENDAR_TIME_ZONE,
    CalendarContract.Calendars.CALENDAR_COLOR,
    CalendarContract.Calendars.IS_PRIMARY,
    CalendarContract.Calendars.VISIBLE};
// select only the calendars that are marked visible
String selection = String.format("%s = 1", CalendarContract.Calendars.VISIBLE);
Cursor cursor = resolver.query(CalendarContract.Calendars.CONTENT_URI,
    projection,
    selection,
    null, null);
while (cursor.moveToNext()) {
    // the cursor contains all the projection data items;
    // access its contents by column index, in the order declared in
    // your projection
    long id = cursor.getLong(0);
    String name = cursor.getString(1);
    ...
}
cursor.close();

This particular example simply iterates over the calendar meta-data, not actual events. You’ll need an additional query for that.

Comparison of the Instant and Date Classes


Java 8 has a new java.time package, and one of its new classes is Instant. The best counterpart to this in past platforms is the java.util.Date class.

There are a couple of notable differences between Date and Instant:

  • Date has very few useful methods, and Instant provides many.
  • Instant provides finer time granularity and a longer timeline.

Most of Date’s methods have been deprecated. Date manipulation and formatting have been delegated to the Calendar and DateFormat classes. In comparison, the Instant class allows you to perform some very basic functionality directly. You can add seconds and milliseconds, for example. You can parse and generate ISO 8601 date strings with Instant as well. ISO 8601 dates have a consistent form across all locales and look like this: 2014-08-12T14:51:53Z. Most of the Instant methods are purely for convenience. You can do similar things with Date using the Calendar and DateFormat classes.
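
As a quick sketch of those convenience methods (the class and method names here are hypothetical), the following parses an ISO 8601 string, adds seconds and milliseconds, and produces an ISO 8601 string again:

```java
import java.time.Instant;

class InstantBasics {
    // Parses an ISO 8601 instant, shifts it, and returns the ISO 8601 text of the result.
    static String shift(String isoInstant, long seconds, long millis) {
        return Instant.parse(isoInstant)
            .plusSeconds(seconds)
            .plusMillis(millis)
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(shift("2014-08-12T14:51:53Z", 7, 500)); // 2014-08-12T14:52:00.500Z
    }
}
```

Instant is immutable, so each plus method returns a new Instant instead of changing the original.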

Both Date and Instant have the same epoch (1970-01-01T00:00:00Z), but Instant can represent a much longer timeline. Date’s internal structure uses a long to represent milliseconds from the epoch. Instant, however, uses a long to represent seconds from epoch AND an int to represent nanoseconds of that second. That certainly means you don’t have to worry about date rollover problems in the near future.
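
The difference in granularity shows up when you convert between the two. In this small sketch (hypothetical class and method names), an Instant with nanosecond detail is passed through a Date and comes back with only its milliseconds intact:

```java
import java.time.Instant;
import java.util.Date;

class InstantVsDate {
    // Converts to java.util.Date and back; Date can only hold whole milliseconds.
    static Instant roundTripThroughDate(Instant instant) {
        return Date.from(instant).toInstant();
    }

    public static void main(String[] args) {
        // One second after the epoch, plus 123,456,789 nanoseconds
        Instant precise = Instant.ofEpochSecond(1, 123_456_789);
        Instant truncated = roundTripThroughDate(precise);

        System.out.println(precise.getNano());   // 123456789
        System.out.println(truncated.getNano()); // 123000000
    }
}
```

Date.from and Date.toInstant were added in Java 8 for exactly this kind of bridging with older code.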

The differences between Date and Instant are relatively minor, but these classes really are the starting point of a more thorough discussion of the java.time package. Expect more details in the near future.

The New Date and Time API in Java 8

It’s no secret that developers have been unsatisfied with the existing Date and Calendar classes of previous Java versions. I’ve heard complaints that the Calendar API is difficult to understand, lacks needed features, and even causes unexpected concurrency bugs. As a result, developers sometimes migrated to the popular Joda Time library, which apparently satisfied their needs.

I’ve always suspected that the standard Date and Calendar API would be updated (or replaced), but I can’t help being a little surprised to see the new java.time package in Java 8. I’m not so surprised that it exists but that it is so comprehensive…and that it seems so familiar. If you’re one of those who moved to Joda Time, you’ll feel a sense of déjà vu. The new Java 8 library looks a lot like Joda Time. After a little snooping, now I understand why. The new Date and Time API was created by Stephen Colebourne, the author of Joda Time. Of course, he worked with Oracle and others within the umbrella of the JSR 310 proposal, but this is Joda Time in many ways.

As I took a first look through the new API, I noted a couple of simple thoughts: the API is feature-rich and complete, and it’s still complex.

Time, dates, and date manipulations are not simple, and no API is going to change that. However, I think that this new API does a great job of making things less complicated than before. If you haven’t looked at it yet, please check it out. Let me know what you think. I’ll do the same and share how to use the APIs in upcoming blogs.

//John