To deliver on the promise "write once, run anywhere," the engineers at Sun designed the famous Java Virtual Machine. True, your program will run anywhere there is a JVM, but what about users in other countries? Will they have to know English to use your application? Java 1.1 answers that question with a resounding "no," backed up by various classes that are designed to make it easy for you to write a "global" application. In this section we'll talk about the concepts of internationalization and the classes that support them.
Internationalization programming revolves around the
Locale class. The class itself is very simple;
it encapsulates a country code, a language code, and a rarely used
variant code. Commonly used
languages and countries are defined as constants in the
Locale class. (It's ironic that these
names are all in English.) You can retrieve the
codes or readable names, as follows:
Locale l = Locale.ITALY;
System.out.println(l.getCountry()); // IT
System.out.println(l.getDisplayCountry()); // Italy
System.out.println(l.getLanguage()); // it
System.out.println(l.getDisplayLanguage()); // Italian
The language codes comply with ISO 639. A complete list of language codes is at http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt. The country codes comply with ISO 3166. A complete list of country codes is at http://www.chemie.fu-berlin.de/diverse/doc/ISO_3166.html. There is no official set of variant codes; they are designated as vendor-specific or platform-specific.
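As a small illustration of how the codes fit together, the sketch below builds a Locale from a language code and a country code and compares it with the predefined constants. The class name LocaleCodesDemo and the helper codes() are made up for this example; only the Locale methods come from the Java API.

```java
import java.util.Locale;

class LocaleCodesDemo {
    // Hypothetical helper: summarizes a Locale as "language/country",
    // printing "-" where a code is empty.
    static String codes(Locale locale) {
        String language = locale.getLanguage().isEmpty() ? "-" : locale.getLanguage();
        String country = locale.getCountry().isEmpty() ? "-" : locale.getCountry();
        return language + "/" + country;
    }

    public static void main(String[] args) {
        Locale italy = new Locale("it", "IT");          // language code, country code
        System.out.println(codes(italy));               // it/IT
        System.out.println(codes(Locale.ITALIAN));      // it/-  (a language with no country)
        System.out.println(italy.equals(Locale.ITALY)); // true
        // Display names can be requested in a particular language.
        System.out.println(italy.getDisplayCountry(Locale.ENGLISH)); // Italy
    }
}
```

Note that a language-only constant such as Locale.ITALIAN carries an empty country code, which is why the country-specific Locale.ITALY is the better choice when you need both.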
Various classes throughout the Java API use a
Locale to decide how to represent
themselves. We have already seen how the
DateFormat class uses
Locales to determine how to
format and parse strings.
If you're writing an internationalized program, you want all the text that
is displayed by your application to be in the correct language. Given
what you have just learned about Locale,
you could print out different messages by testing the
Locale. This gets cumbersome quickly,
however, because the messages for all Locales
are embedded in your source code. ResourceBundle
and its subclasses offer a cleaner, more flexible solution.
A ResourceBundle is a collection of objects that
your application can access by name, much like a
Hashtable
with String keys.
The same ResourceBundle may be defined for many different
Locales. To get a particular
ResourceBundle, call the
factory method ResourceBundle.getBundle(),
which accepts the name of a ResourceBundle
and a Locale. The following example
gets a ResourceBundle for two
Locales, retrieves a string message from
it, and prints the message. We'll define the ResourceBundles
later to make this example work.
import java.util.*;

public class Hello {
    public static void main(String[] args) {
        ResourceBundle bun;
        bun = ResourceBundle.getBundle("Message", Locale.ITALY);
        System.out.println(bun.getString("HelloMessage"));
        bun = ResourceBundle.getBundle("Message", Locale.US);
        System.out.println(bun.getString("HelloMessage"));
    }
}
The getBundle() method throws the run-time
exception MissingResourceException if an
appropriate ResourceBundle cannot
be located.
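One way to guard against a missing bundle is to catch the exception and fall back to a built-in string. The following is only a sketch: the class BundleLookupDemo, the helper lookup(), and the bundle name "NoSuchBundle" are invented for illustration and assume no resource of that name is on the class path.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

class BundleLookupDemo {
    // Hypothetical helper: returns the named string, or the fallback if
    // either the bundle or the key cannot be found.
    static String lookup(String bundleName, Locale locale, String key, String fallback) {
        try {
            return ResourceBundle.getBundle(bundleName, locale).getString(key);
        } catch (MissingResourceException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // "NoSuchBundle" is a made-up name that is assumed not to exist.
        System.out.println(
            lookup("NoSuchBundle", Locale.FRANCE, "HelloMessage", "Hello (fallback)"));
    }
}
```

Because getString() also throws MissingResourceException for an unknown key, a single catch clause covers both failure cases.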
ResourceBundles are defined in three ways. They can
be standalone classes, in which case they will be either subclasses of
ListResourceBundle or direct subclasses of
ResourceBundle. They can also be defined
by a property file, in which case they will be represented at run-time by
a PropertyResourceBundle object. When
you call ResourceBundle.getBundle(), either
a matching class is returned or an instance of PropertyResourceBundle
corresponding to a matching property file. The algorithm used by
getBundle() is based on appending the
country and language codes of the requested Locale
to the name of the resource. Specifically, it searches for resources using
this order:
name_language_country_variant
name_language_country
name_language
name
name_default-language_default-country_default-variant
name_default-language_default-country
name_default-language
In the example above, when we try to get the
ResourceBundle
named Message, specific to
Locale.ITALY, it searches for the following
names (no variant codes are in the
Locales we are using):
Message_it_IT
Message_it
Message
Message_en_US
Message_en
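To make the naming scheme concrete, here is a rough sketch that generates the candidate names in the order listed above. It is not the JDK's own lookup code, and the details of the real search may differ between releases; the class BundleNameDemo and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BundleNameDemo {
    // Appends name_language_country_variant, name_language_country and
    // name_language for one locale, skipping any form whose codes are empty.
    private static void addLocaleNames(List<String> out, String name, Locale locale) {
        String lang = locale.getLanguage();
        String country = locale.getCountry();
        String variant = locale.getVariant();
        if (!lang.isEmpty() && !country.isEmpty() && !variant.isEmpty()) {
            out.add(name + "_" + lang + "_" + country + "_" + variant);
        }
        if (!lang.isEmpty() && !country.isEmpty()) {
            out.add(name + "_" + lang + "_" + country);
        }
        if (!lang.isEmpty()) {
            out.add(name + "_" + lang);
        }
    }

    // Candidate names in the order described in the text: requested locale,
    // then the bare name, then the default locale.
    static List<String> candidates(String name, Locale requested, Locale defaultLocale) {
        List<String> out = new ArrayList<>();
        addLocaleNames(out, name, requested);
        out.add(name);
        addLocaleNames(out, name, defaultLocale);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(candidates("Message", Locale.ITALY, Locale.US));
        // [Message_it_IT, Message_it, Message, Message_en_US, Message_en]
    }
}
```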
Let's define the Message_it_IT
ResourceBundle now, using a subclass
of ListResourceBundle:
import java.util.*;

public class Message_it_IT extends ListResourceBundle {
    public Object[][] getContents() {
        return contents;
    }

    static final Object[][] contents = {
        {"HelloMessage", "Buon giorno, world!"},
        {"OtherMessage", "Ciao."},
    };
}
ListResourceBundle makes it easy to define a
ResourceBundle class; all we have to do is
override the getContents() method.
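Since a ListResourceBundle is an ordinary class, you can also instantiate one directly to check its contents without going through getBundle(). The sketch below does that with a nested class; the names ListBundleDemo and ItalianMessages are made up for the example.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

class ListBundleDemo {
    // Hypothetical in-memory bundle with the same contents as Message_it_IT.
    static class ItalianMessages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"HelloMessage", "Buon giorno, world!"},
                {"OtherMessage", "Ciao."},
            };
        }
    }

    public static void main(String[] args) {
        // Instantiated directly here; getBundle() would normally locate the class by name.
        ResourceBundle bundle = new ItalianMessages();
        System.out.println(bundle.getString("HelloMessage")); // Buon giorno, world!
        System.out.println(bundle.getString("OtherMessage")); // Ciao.
    }
}
```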
Now let's define a ResourceBundle for
Locale.US. This time, we'll make a
property file. Save the following data in a file called
Message_en_US.properties:
HelloMessage=Hello, world!
OtherMessage=Bye.
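To see how property data turns into a bundle, you can construct a PropertyResourceBundle yourself. This sketch reads the same two entries from an in-memory string rather than from a file, purely for convenience; the class PropertyBundleDemo and the helper load() are hypothetical, and the Reader-based constructor is assumed to be available (it was added in a later Java release than 1.1).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class PropertyBundleDemo {
    // Hypothetical helper: parses property-file text into a bundle.
    static ResourceBundle load(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = load("HelloMessage=Hello, world!\nOtherMessage=Bye.\n");
        System.out.println(bundle.getString("HelloMessage")); // Hello, world!
        System.out.println(bundle.getString("OtherMessage")); // Bye.
    }
}
```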
So what happens if somebody runs your program in
Locale.FRANCE, and no
ResourceBundle is defined for that
Locale? To avoid a run-time
MissingResourceException,
it's a good idea to define
a default ResourceBundle. So in
our example, you could change the name of the property file
to Message.properties. That way, if a
language-specific or country-specific ResourceBundle
cannot be found, your application can still run.
The java.text package includes,
among other things, a set of classes designed for generating and
parsing string representations of objects. We have already seen
one of these classes, DateFormat.
In this section we'll talk about the other format classes,
NumberFormat,
ChoiceFormat, and
MessageFormat.
The NumberFormat class can be
used to format and parse currency, percents, or plain old numbers. Like
DateFormat,
NumberFormat
is an abstract class. However, it has several useful factory methods.
For example, to generate currency strings, use
getCurrencyInstance():
double salary = 1234.56;
String here =     // $1,234.56
    NumberFormat.getCurrencyInstance().format(salary);
String italy =    // L 1.234,56
    NumberFormat.getCurrencyInstance(Locale.ITALY).format(salary);
The first statement generates an American salary, with a dollar sign,
a comma to separate thousands, and a period as a decimal point. The
second statement formats the same value for Italy, with a lira
sign, a period to separate thousands, and a comma as a decimal point.
Remember that the NumberFormat worries
about format only; it doesn't attempt to do currency conversion.
(Among other things, that would require access to a dynamically
updated table of exchange rates, a good opportunity for a Java Bean
but too much to ask of a simple formatter.)
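The sketch below passes the Locale explicitly so that the result does not depend on the machine's default. The class CurrencyDemo and the helper formatCurrency() are invented for the example. Only the U.S. output is shown, because the symbols and separators for other locales come from the locale data shipped with your Java runtime and may not match the listing above.

```java
import java.text.NumberFormat;
import java.util.Locale;

class CurrencyDemo {
    // Hypothetical helper: formats an amount using the given locale's conventions.
    static String formatCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        double salary = 1234.56;
        System.out.println(formatCurrency(salary, Locale.US)); // $1,234.56
        // Output for these locales depends on the runtime's locale data.
        System.out.println(formatCurrency(salary, Locale.ITALY));
        System.out.println(formatCurrency(salary, Locale.JAPAN));
    }
}
```

The amount is the same in every call; only its presentation changes.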
Likewise, getPercentInstance()
returns a formatter you can use for generating and parsing
percents. If you do not specify a Locale
when calling a getInstance() method,
the default Locale is used:
double progress = 0.44;
NumberFormat pf = NumberFormat.getPercentInstance();
System.out.println(pf.format(progress)); // 44%
try {
    System.out.println(pf.parse("77.2%")); // 0.772
}
catch (ParseException e) {}
And if you just want to generate and parse plain old numbers, use
a NumberFormat returned by
getInstance() or its
equivalent, getNumberInstance():
NumberFormat guiseppe = NumberFormat.getInstance(Locale.ITALY);
NumberFormat joe = NumberFormat.getInstance(); // default locale (U.S. here)
try {
    double theValue = guiseppe.parse("34.663,252").doubleValue();
    System.out.println(joe.format(theValue)); // 34,663.252
}
catch (ParseException e) {}
We use guiseppe to parse a number in Italian format
(periods separate thousands, comma is the decimal point). The return
type of parse() is Number, so we
use the doubleValue() method to retrieve the value
of the Number as a double. Then
we use joe to format the number correctly for the
default (U.S.) locale.
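If you would rather not depend on the default locale, you can name both locales explicitly. The following sketch wraps the parse-then-format round trip and the percent parsing in small helpers; the class NumberRoundTripDemo and its method names are hypothetical.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class NumberRoundTripDemo {
    // Parses text written in one locale's conventions and reformats it for another.
    static String convert(String text, Locale from, Locale to) {
        try {
            double value = NumberFormat.getNumberInstance(from).parse(text).doubleValue();
            return NumberFormat.getNumberInstance(to).format(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + from + ": " + text, e);
        }
    }

    // Parses a percent string such as "77.2%" into a fraction.
    static double parsePercent(String text, Locale locale) {
        try {
            return NumberFormat.getPercentInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a percent in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("34.663,252", Locale.ITALY, Locale.US)); // 34,663.252
        System.out.println(parsePercent("77.2%", Locale.US));
        System.out.println(NumberFormat.getPercentInstance(Locale.US).format(0.44)); // 44%
    }
}
```

Unlike the empty catch blocks in the short listings, these helpers report a parse failure instead of silently ignoring it.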
Here's a list of the factory methods for text formatters in the java.text
package:
DateFormat.getDateInstance()
DateFormat.getDateInstance(int style)
DateFormat.getDateInstance(int style, Locale aLocale)
DateFormat.getDateTimeInstance()
DateFormat.getDateTimeInstance(int dateStyle, int timeStyle)
DateFormat.getDateTimeInstance(int dateStyle, int timeStyle,
Locale aLocale)
DateFormat.getInstance()
DateFormat.getTimeInstance()
DateFormat.getTimeInstance(int style)
DateFormat.getTimeInstance(int style, Locale aLocale)
NumberFormat.getCurrencyInstance()
NumberFormat.getCurrencyInstance(Locale inLocale)
NumberFormat.getInstance()
NumberFormat.getInstance(Locale inLocale)
NumberFormat.getNumberInstance()
NumberFormat.getNumberInstance(Locale inLocale)
NumberFormat.getPercentInstance()
NumberFormat.getPercentInstance(Locale inLocale)
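As a quick example of the style-and-locale variants in this list, the sketch below formats one fixed date with several DateFormat factory calls. The class DateFactoryDemo and the helper formatDate() are made up; only the U.S. result is shown, since the other strings depend on the runtime's locale data.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DateFactoryDemo {
    // Hypothetical helper: formats a date with the given style and locale.
    static String formatDate(Date date, int style, Locale locale) {
        return DateFormat.getDateInstance(style, locale).format(date);
    }

    public static void main(String[] args) {
        // April 10, 1997, at midnight in the default time zone.
        Date date = new GregorianCalendar(1997, Calendar.APRIL, 10).getTime();
        System.out.println(formatDate(date, DateFormat.LONG, Locale.US)); // April 10, 1997
        System.out.println(formatDate(date, DateFormat.LONG, Locale.ITALY));
        System.out.println(formatDate(date, DateFormat.SHORT, Locale.FRANCE));
    }
}
```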
Thus far we've seen how to format dates and numbers as text. Now
we'll take a look at a class, ChoiceFormat,
that maps numerical ranges to text.
ChoiceFormat is constructed by
specifying the numerical ranges and the strings that correspond to them.
One constructor accepts an array of doubles
and an array of Strings, where each string corresponds to the
range running from the matching number up to, but not including, the next
number:
double[] limits = {0, 20, 40};
String[] labels = {"young", "less young", "old"};
ChoiceFormat cf = new ChoiceFormat(limits, labels);
System.out.println(cf.format(12)); // young
System.out.println(cf.format(26)); // less young
You can specify both the limits and the labels using a special
string in another ChoiceFormat
constructor:
ChoiceFormat cf = new ChoiceFormat("0#young|20#less young|40#old");
System.out.println(cf.format(40)); // old
System.out.println(cf.format(50)); // old
The limit and value pairs are separated by pipe characters (|, also known as "vertical bars"), while the number sign (#) separates each limit from its corresponding value.
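It can help to probe the edges of each range. In this sketch (the class ChoiceBoundaryDemo and the helper label() are hypothetical), a value equal to a limit takes that limit's label, and a value below the first limit still gets the first label.

```java
import java.text.ChoiceFormat;

class ChoiceBoundaryDemo {
    // Hypothetical helper: maps an age to a label using the pattern from the text.
    static String label(double age) {
        ChoiceFormat cf = new ChoiceFormat("0#young|20#less young|40#old");
        return cf.format(age);
    }

    public static void main(String[] args) {
        System.out.println(label(19.9)); // young
        System.out.println(label(20));   // less young
        System.out.println(label(39.9)); // less young
        System.out.println(label(40));   // old
        System.out.println(label(-5));   // young (below the first limit)
    }
}
```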
To complete our discussion of the formatting classes,
we'll take a look at another class,
MessageFormat,
that helps you construct human-readable messages. To construct
a MessageFormat, pass it a
pattern string. A pattern
string is a lot like the string you feed to
printf() in C, although the
syntax is different. Arguments are delineated by curly
brackets and may include information about how they
should be formatted. Each argument consists of a number,
an optional type, and an optional style. These are summarized in
Table 9.9.
| Type | Styles |
|---|---|
| choice | pattern |
| date | short, medium, long, full, pattern |
| number | integer, percent, currency, pattern |
| time | short, medium, long, full, pattern |
Let's use an example to clarify all of this:
MessageFormat mf = new MessageFormat("You have {0} messages.");
Object[] arguments = {"no"};
System.out.println(mf.format(arguments)); // You have no messages.
We start by constructing a MessageFormat
object; the argument to the constructor is the pattern, a template on which
messages will be based. The special incantation {0} means "in this
position, substitute element 0 from the array passed as an argument to
the format() method." When we generate a message
by calling format(), we pass in values to
fill the blank spaces in the template. In this case, we
pass the array arguments[] to
mf.format(); this substitutes
arguments[0], yielding the result "You
have no messages."
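For one-off messages, the same substitution can be written with the static convenience method MessageFormat.format(), which takes the pattern and the arguments in a single call. This is a minimal sketch that assumes a Java release with variable-argument methods; the class SimpleMessageDemo and the helper inbox() are made up.

```java
import java.text.MessageFormat;

class SimpleMessageDemo {
    // Hypothetical helper: fills the {0} slot of a fixed pattern.
    static String inbox(String howMany) {
        return MessageFormat.format("You have {0} messages.", howMany);
    }

    public static void main(String[] args) {
        System.out.println(inbox("no"));      // You have no messages.
        System.out.println(inbox("several")); // You have several messages.
    }
}
```

Constructing a MessageFormat object, as in the listing above, is still preferable when the same pattern is used repeatedly.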
Let's try this example again, except we'll show how to format a number and a date instead of a string argument:
MessageFormat mf = new MessageFormat(
    "You have {0, number, integer} messages on {1, date, long}.");
Object[] arguments = {new Integer(93), new Date()};
System.out.println(mf.format(arguments));
// You have 93 messages on April 10, 1997.
In this example, we need to fill in two spaces in the template, and
therefore we need two elements in the
arguments[] array. Element 0 must be a
number and is formatted as an integer. Element 1 must be a
Date and will be printed in the long
format. When we call format(), the
arguments[] array supplies these two
values.
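To make the output reproducible, the sketch below uses a fixed date and passes a Locale to the MessageFormat constructor, so the number and date are formatted for that locale rather than the default one. The class DatedMessageDemo and the helper summary() are hypothetical, and the two-argument constructor is assumed to be available in your Java release.

```java
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DatedMessageDemo {
    // Hypothetical helper: element 0 is formatted as an integer,
    // element 1 as a long-style date, both for the given locale.
    static String summary(int count, Date when, Locale locale) {
        MessageFormat mf = new MessageFormat(
            "You have {0,number,integer} messages on {1,date,long}.", locale);
        return mf.format(new Object[] {count, when});
    }

    public static void main(String[] args) {
        Date when = new GregorianCalendar(1997, Calendar.APRIL, 10).getTime();
        System.out.println(summary(93, when, Locale.US));
        // You have 93 messages on April 10, 1997.
    }
}
```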
This is still sloppy. What if there is only one message?
To make this grammatically correct, we can embed a
ChoiceFormat-style pattern string in
our MessageFormat pattern string:
MessageFormat mf = new MessageFormat(
    "You have {0, number, integer} message{0, choice, 0#s|1#|2#s}.");
Object[] arguments = {new Integer(1)};
System.out.println(mf.format(arguments)); // You have 1 message.
In this case, we use element 0 of
arguments[] twice: once to supply the
number of messages, and once to provide input to the
ChoiceFormat pattern. The pattern says to
add an s if argument 0 has the value zero or is two or more.
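Running the pattern over a few counts shows the plural rule at work. This is a sketch with made-up names (PluralMessageDemo, inbox()); the pattern is the one from the text, written without the optional spaces, and the locale is fixed to Locale.US.

```java
import java.text.MessageFormat;
import java.util.Locale;

class PluralMessageDemo {
    // Hypothetical helper: argument 0 supplies both the number and the
    // input to the embedded choice pattern.
    static String inbox(int count) {
        MessageFormat mf = new MessageFormat(
            "You have {0,number,integer} message{0,choice,0#s|1#|2#s}.", Locale.US);
        return mf.format(new Object[] {count});
    }

    public static void main(String[] args) {
        System.out.println(inbox(0)); // You have 0 messages.
        System.out.println(inbox(1)); // You have 1 message.
        System.out.println(inbox(2)); // You have 2 messages.
    }
}
```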
Finally, a few words on how to be clever. If you want to write
international programs, you can use resource bundles to supply the
strings for your MessageFormat objects.
This way, you can automatically format messages that are in the
appropriate language with dates and other language-dependent fields
handled appropriately.
In this context, it's helpful to realize that messages don't need to read elements from the array in order. In English, you would say "Disk C has 123 files"; in some other language, you might say "123 files are on Disk C." You could implement both messages with the same set of arguments:
MessageFormat m1 = new MessageFormat(
    "Disk {0} has {1, number, integer} files.");
MessageFormat m2 = new MessageFormat(
    "{1, number, integer} files are on disk {0}.");
Object[] arguments = {"C", new Integer(123)};
In real life, the code could be even more compact; you'd only use a single
MessageFormat object, initialized
with a string taken from a resource bundle.
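One possible shape for that arrangement is sketched below. The two nested bundles stand in for locale-specific resources and hold the two word orders; in a real program ResourceBundle.getBundle() would choose between them. All names here (BundledMessageDemo, SubjectFirst, CountFirst, the key "DiskReport", and the helper report()) are invented for illustration.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

class BundledMessageDemo {
    // Hypothetical bundle whose pattern puts the disk name first.
    static class SubjectFirst extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"DiskReport", "Disk {0} has {1,number,integer} files."},
            };
        }
    }

    // Hypothetical bundle whose pattern puts the file count first.
    static class CountFirst extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"DiskReport", "{1,number,integer} files are on disk {0}."},
            };
        }
    }

    // The calling code is identical for both bundles; only the pattern differs.
    static String report(ResourceBundle bundle, Locale locale, String disk, int files) {
        MessageFormat mf = new MessageFormat(bundle.getString("DiskReport"), locale);
        return mf.format(new Object[] {disk, files});
    }

    public static void main(String[] args) {
        System.out.println(report(new SubjectFirst(), Locale.US, "C", 123));
        // Disk C has 123 files.
        System.out.println(report(new CountFirst(), Locale.US, "C", 123));
        // 123 files are on disk C.
    }
}
```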