Vectors and hashtables are collection classes. Each stores a group of objects according to a particular retrieval scheme. Aside from that, they are not particularly closely related things. A hashtable is a dictionary; it stores and retrieves objects by a key value. A vector, on the other hand, holds an ordered collection of elements. It's essentially a dynamic array. Both of these, however, have more subtle characteristics in common. First, they are two of the most useful aspects of the core Java distribution. Second, they both take full advantage of Java's dynamic nature at the expense of some of its more static type safety.
If you work with dictionaries or associative arrays in other languages, you should understand how useful these classes are. If you are someone who has worked in C or another static language, you should find collections to be truly magical. They are part of what makes Java powerful and dynamic. Being able to work with lists of objects and make associations between them is an abstraction from the details of the types. It lets you think about the problems at a higher level and saves you from having to reproduce common structures every time you need them.
A Vector is a dynamic array; it can grow to
accommodate new items. You can also insert and remove elements at
arbitrary positions within it. As with other mutable objects in Java,
Vector is thread-safe. The Vector
class works directly with the type Object, so we
can use Vectors with instances of any class.[3]
We can even put different kinds of Objects in a
Vector together; the Vector
doesn't know the difference.
[3] In C++, where classes don't derive from a single
Objectclass that supplies a base type and common methods, the elements of a collection would usually be derived from some common collectable class. This forces the use of multiple inheritance and brings its associated problems.
As you might guess, this is where things get tricky. To do
anything useful with an Object after we take it
back out of a Vector, we have to cast it back
(narrow) it to its original type. This can be done with safety in Java
because the cast is checked at run-time. Java throws a
ClassCastException if we try to cast an object to
the wrong type. However, this need for casting means that your code
must remember types or methodically test them with
instanceof. That is the price we pay for having a
completely dynamic collection class that operates on all
types.
You might wonder if you can subclass Vector to
produce a class that looks like a Vector, but that
works on just one type of element in a type-safe way. Unfortunately,
the answer is no. We could override Vector's
methods to make a Vector that rejects the wrong
type of element at run-time, but this does not provide any new
compile-time, static type safety. In C++, templates provide a safe
mechanism for parameterizing types by restricting the types of objects
used at compile-time. The keyword generic is a
reserved word in Java. This means that it's possible that future
versions might support C++-style templates, using
generic to allow statically checked parameterized
types.
We can construct a Vector with default characteristics
and add elements to it using addElement() and
insertElement():
Vector things = new Vector(); String one = "one"; String two = "two"; String three = "three"; things.addElement( one ); things.addElement( three ); things.insertElementAt( two, 1 );
things now contains three String
objects in the order "one," "two," and "three." We can
retrieve objects by their position with
elementAt(), firstElement(), and
lastElement():
String s1 = (String)things.firstElement(); // "one" String s3 = (String)things.lastElement(); // "three" String s2 = (String)things.elementAt(1); // "two"
We have to cast each Object back to a
String in order to assign it a
String reference.
ClassCastException is a type of
RuntimeException, so we can neglect to guard for
the exception if we are feeling confident about the type we are
retrieving. Often, as in this example, you'll just have one type
of object in the Vector. If we were unsure about
the types of objects we were retrieving, we would want to be prepared
to catch the ClassCastException or test the type
explicitly with the instanceof operator.
We can search for an item in a Vector with the
indexOf() method:
int i = things.indexOf( three ); // i = 2
indexOf() returns a value of -1
if the object is not found. As a convenience, we can also use
contains() simply to test for the presence of the
object.
Finally, removeElement() removes a specified
Object from the Vector:
things.removeElement( two );
The element formerly at position three now becomes the second element.
The size() method reports the number of objects
currently in the Vector. You might think of using
this to loop through all elements of a Vector,
using elementAt() to get at each element. This
works just fine, but there is a more general way to operate on a complete
set of elements like those in a Vector.
The java.util.Enumeration interface can be used
by any sort of set to provide serial access to its elements. An object
that implements the Enumeration interface presents
two methods: nextElement() and hasMoreElements().
nextElement() returns an Object
type, so it can be used with any kind of collection. As with taking objects
from a Vector, you need to know or determine
what the objects are and cast them to the appropriate types before using
them.
Enumeration is useful because it is a general
interface for any kind of object that wants to provide "one shot" sequential
access to its elements. All you have to do is provide a
hasMoreElements() test
and a nextElement() iterator and declare that your
class implements java.util.Enumeration.
Another advantage of an Enumeration is that you
don't have to provide all values up front; you can provide each value
as it's requested with nextElement().
An Enumeration does not guarantee the order
in which elements are returned, however, so if order is important you don't want to use an
Enumeration. You can iterate through the
elements in an Enumeration only once; there is no
way to reset it to the beginning or move backwards through the
elements.
A Vector returns an
Enumeration of its contents when we call the
elements() method:
Enumeration e = myVector.elements();
while ( e.hasMoreElements() ) {
String s = (String)e.nextElement();
System.out.println( s ):
} The above code loops three times and calls
nextElement() to fetch our strings. The actual type
of object returned by elements() is a
VectorEnumeration, but we don't have to worry
about that. We can always refer to an Enumeration
simply by its interface.
Note that Vector does not implement the
Enumeration interface. If it did, that would put a
serious limitation on Vector because we could
cycle through the elements in it only once. That's clearly not the
purpose of a Vector, which is why
Vector instead provides a method that returns an
Enumeration.
As I said earlier, a hashtable is a dictionary, similar to an associative array. A hashtable stores and retrieves elements with key values; they are very useful for things like caches and minimalist databases. When you store a value in a hashtable, you associate a key with that value. When you need to look up the value, the hashtable retrieves it efficiently using the key. The name hashtable itself refers to how the indexing and storage of elements is performed, as we'll discuss shortly. First I want to talk about how to use a hashtable.
The java.util.Hashtable class implements a
hashtable that, like Vector, operates on the type
Object. A Hashtable stores an
element of type Object and associates it with a
key, also of type Object. In this way, we can
index arbitrary types of elements using arbitrary types as keys. As
with Vector, casting is generally required to
narrow objects back to their original type after pulling them out of a
hashtable.
A Hashtable is quite easy to use. We can use
the put() method to store items:
Hashtable dates = new Hashtable(); dates.put( "christmas", new GregorianCalendar( 1997, Calendar.DECEMBER, 25 ) ); dates.put( "independence", new GregorianCalendar( 1997, Calendar.JULY, 4 ) ); dates.put( "groundhog", new GregorianCalendar( 1997, Calendar.FEBRUARY, 2 ) );
First we create a new Hashtable. Then we add three
GregorianCalendar objects to the
hashtable, using String
objects as keys. The key is the first argument to
put(); the value is the second. Only one value can
be stored per key. If we try to store a second object under a key that
already exists in the Hashtable, the old element is
booted out and replaced by the new one. The return value of the
put() method is normally null,
but if the call to put() results in replacing an
element, the method instead returns the old stored
Object.
We can now use the get() method to retrieve each
of the above dates by name, using the String
key by which it was indexed:
GregorianCalendar g = (GregorianCalendar)dates.get( "christmas" );
The get() method returns a null
value if no element exists for the given key. The cast is required to
narrow the returned object back to type GregorianCalendar. I
hope you can see the advantage of using a Hashtable
over a regular array. Each value is indexed by a key instead of a
simple number, so unlike a simple array, we don't have to
remember where each GregorianCalendar is stored.
Once we've put a value in a Hashtable,
we can take it back out with the remove() method,
again using the key to access the value:
dates.remove("christmas");
We can test for the existence of a key with containsKey():
if ( dates.containsKey( "groundhog" ) ) { // yes
Just like with a Vector, we're dealing
with a set of items. Actually, we're dealing with two sets: keys
and values. The Hashtable class has two methods,
keys() and elements(),
for getting at these sets. The keys() method
returns an Enumeration of the keys for all of
the elements in the Hashtable. We can use this
Enumeration to loop through all of the keys:
for (Enumeration e = dates.keys(); e.hasMoreElements(); ) {
String key = (String)e.nextElement();
...
} Similarly, elements() provides an
Enumeration of the elements themselves.
If you've used a hashtable before, you've probably guessed that there's
more going on behind the scenes than we've let on.
An element in a hashtable is not associated with a key strictly by the
key's identity, but rather by the key's contents. This allows
keys that are equivalent to access the same object.
By "equivalent," we mean those objects that compare
true with
equals().
So, if you store an object in a Hashtable
using another object as a key, you
can use any object that equals() tells you
is equivalent to retrieve the stored object.
It's easy to see why equivalence is important if you remember our
discussion of
strings. It is easy to create two
String objects that have the
same text in them but which come from different sources in Java.
In this case, the == operator will tell you that the objects are
different, but the equals() method of the
String class will tell you that they are
equivalent. Because they are equivalent, if we store an object in
a Hashtable using one of the
String objects as a key, we can retrieve
it using the other.
Okay, since Hashtables have a notion of
equivalent keys, what does the
hashcode do? The hashcode is like a fingerprint
of the object's data content. The
Hashtable uses the hashcode to
store the objects so that they can be retrieved efficiently.
The hashcode is nothing more than a number (an integer) that is a function of
the data. The number always turns out the same for identical data,
but the hashing function is intentionally designed to generate as
random a number as possible for different combinations of data. That is,
a very small change in the data should produce a big difference in the
number. It is unlikely that two similar data sets
will produce the same hashcode.
A Hashtable really just keeps a number of
lists of objects,
but it puts objects into the lists based on their hashcode. So when it
wants to find the object again, it can look at the hashcode and know
immediately how to get to the appropriate list. The Hashtable still might
end up with a number of objects to examine, but the list should be short.
For each object it finds, it does the following comparison to see if the
key matches:
if ((keyHashcode == storedKeyHashcode) && key.equals(storedKey))
return object;
There is no prescribed way to generate hashcodes. The only requirement
is that they be somewhat randomly distributed and reproducible (based on
the data). This means that two objects that are
not the same could end up with the same hashcode by accident.
This is unlikely (there are 232 possible
integer values), but it doesn't cause a problem because
the Hashtable ultimately checks the actual
keys, as well as the hashcodes, to see if they
are equal. Therefore, even if
two objects have the same hashcode, they can still co-exist
in the hashtable.
Here's an interesting example. In Java 1.1, the
hashCode() method for the
String class does not take into account
all characters in the string. Instead, it samples a representative
few, in order to save time. Knowing how it works, we can construct
two Strings that are obviously not
equivalent but nonetheless have the same hashcode. We can show that
they can still serve as unique keys in a
Hashtable:
String s1 = "xxxxxxxxxxxxxxxx", s2 = "x1x2x3x4x5x6x7x8"; System.out.println( s1.hashCode()); System.out.println( s2.hashCode()); // Identical in 1.1 Hashtable h = new Hashtable(); h.put(s1, new Integer(1)); h.put(s2, new Integer(2)); System.out.println( h.get(s1) ); // gets integer 1 System.out.println( h.get(s2) ); // gets integer 2
Hashcodes are computed by an object's
hashCode() method, which is inherited from the
Object class if it isn't overridden.
The default hashCode()
method simply assigns each object instance a unique number to be used
as a hashcode. If a class does not override this method, each instance
of the class will have a unique hashcode. This goes along well with the
default implementation of equals() in
Object, which only compares objects for
identity using ==. You're probably used to overriding
equals() in any classes for which
equivalence is meaningful. Likewise, if you want equivalent objects
to serve as equivalent keys, you
need to override the hashCode() method as
well to return identical
hashcode values. To do this, you need to create some suitably complex
and arbitrary function of the contents of your object.
The only criterion for the function is that it should be almost
certain to return different values for objects with different data,
but the same value for objects with identical data.
java.util.Dictionary is the
abstract superclass of
Hashtable. It lays out the basic
get(), put(), and
remove() functionality for dictionary-style
collections. You could derive other types of dictionaries from this
class. For example, you could implement a dictionary with a different
storage format, such as a binary tree.