In Defense of Parameterized Types

Parameterized types, also known as generics, were added to the Java language with the 1.5 release (JLS 3rd edition), and have been a point of contention ever since. It doesn't take long to find a well-known pundit condemning parameterized types as “half implemented.” Usually, the condemnation is based on comparison with C++ templates, and how Java doesn't provide the same functionality.

In the interest of full disclosure, I have to say that I am a novice user of C++ templates. I was still working with C++ when they were introduced (in the second edition of Stroustrup), but did not make extensive use of them. As I recall, one reason was that they caused problems with shared library initialization in our cross-platform application. Since that time, the Standard Template Library has appeared, and I've read that the modern templating engine actually provides a Turing complete programming language. For all I know, it may be self-aware.

As a result, I focused on the benefits of Java's parameterized types rather than their limitations. However, before talking about those benefits, I need to face some of the limitations head-on.

The Ugly Bits

Not Actually Typesafe

One of the common uses for parameterized types is “typesafe collections”: collections that only accept values of a specific type. For example, the following code won't compile, because the list is parameterized with Integer, and the code is trying to add a String:

List<Integer> data = new ArrayList<Integer>();
data.add("a string");

However, the ArrayList instance doesn't know that it's only supposed to contain instances of Integer. That information is “erased” by the compiler; the bytecode deals with instances of Object. You can easily assign a parameterized list to a non-parameterized variable and add your string, with only a warning from the compiler. And then, when you run the code, thinking that you have a truly typesafe list, a ClassCastException tells you that you don't.

List data = new ArrayList<Integer>();
data.add("a string");
List<Integer> data2 = data;
Integer value = data2.get(0);
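
To see exactly where the failure surfaces, here is a minimal, self-contained sketch of the same situation. The class and method names (ErasureDemo, pollutedList, readFails) are made up for illustration. The add succeeds, because at runtime the list only sees Object; the exception appears later, at the cast that the compiler inserts when the element is read as an Integer.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Adds a String to a list declared to hold Integers by going through a raw reference.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<Integer> pollutedList() {
        List<Integer> typed = new ArrayList<Integer>();
        List raw = typed;        // raw reference to the same list; the compiler only warns
        raw.add("a string");     // succeeds at runtime: the list stores plain Objects
        return typed;
    }

    // Returns true if reading the first element as an Integer throws ClassCastException.
    static boolean readFails(List<Integer> list) {
        try {
            Integer value = list.get(0);   // the compiler-inserted cast to Integer happens here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> data = pollutedList();
        System.out.println("size after add: " + data.size());
        System.out.println("read fails: " + readFails(data));
    }
}
```

Note that the failure is reported at the point of use, which may be far from the code that added the bad value.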

While many people condemn this behavior, I consider it a minor flaw: if one part of your program is putting strings into a collection that another part thinks should contain numbers, you've got bigger problems. Problems that a truly typesafe collection wouldn't solve. And moreover, if you think that C++-style templates would prevent the problem, you might want to read this.

Collections Don't Follow Array Inheritance Model

When I first saw parameterized collections in a 1.5 preview presentation, I remember being astonished that they did not follow the inheritance model established by arrays. Consider the following:

Object[] foo = new String[] {};

ArrayList<Object> bar = new ArrayList<String>();

ArrayList<? extends Object> baz = new ArrayList<String>();

The first line compiles, because String is a subclass of Object, and therefore Java considers String[] to be a subtype of Object[]. To a programmer who thinks of lists as equivalent to arrays, the second line should compile as well, but the compiler rejects it with an “incompatible types” error.

To make the compiler accept this assignment, you need to use the third line, which uses a wildcard parameterization (“extends Object” is unnecessary in this case, but it makes the example clearer). What astonished me was that the compiler clearly recognizes that the two parameterizations are related, otherwise it wouldn't accept the wildcard. Why, then, is it unable to infer the relationship without the wildcard?

The answer is that the compiler doesn't know what the parameterized type does with its parameters. Unlike arrays, collection classes are not part of the language: to the compiler, a List is an arbitrary class that could do anything at all. To make this distinction more clear, consider the following:

Comparator<String> foo = // some implementation class
Comparator<Object> bar = foo;

Clearly, a class designed to compare two strings could not be applied to arbitrary objects. Therefore, the compiler rightly rejects this assignment statement — which is equivalent to the second line of the previous example.

The wildcard gives the compiler more information about the use of the parameterized class: it tells the compiler that there is a relationship between instances with two differing parameterizations, provided that the types used for parameterization are compatible.
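
The difference can be sketched in a few lines. This is an illustration under my own assumptions, not code from any library: VarianceDemo, arrayStoreFails, and sum are hypothetical names. The first method shows the price of the array model, where a bad store compiles and is only caught at runtime. The second shows what the wildcard buys: a method that accepts any list of numbers and can read its elements as Number, while the compiler refuses to let it add anything other than null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class VarianceDemo {
    // Arrays follow the element type's inheritance, so a bad store is caught only at runtime.
    static boolean arrayStoreFails() {
        Object[] objects = new String[1];        // allowed: String[] is assignable to Object[]
        try {
            objects[0] = Integer.valueOf(1);     // compiles, but the array knows it holds Strings
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Accepts List<Integer>, List<Double>, and so on. Elements can be read as Number;
    // a call such as values.add(Integer.valueOf(1)) would be rejected by the compiler.
    static double sum(List<? extends Number> values) {
        double total = 0;
        for (Number n : values) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("array store fails: " + arrayStoreFails());
        List<Integer> ints = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        System.out.println("sum: " + sum(ints));
    }
}
```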

Too Verbose

To quote Emperor Joseph II: there are “too many notes.” Consider the following, which might be used to cache database rows by their primary key:

Map<List<Object>,List<Object>> data = new HashMap<List<Object>,List<Object>>();

To me, the object is lost in all of the angle brackets. And it just gets worse when such objects are passed to or returned from a method. I truly wish that the typedef keyword had been added to the language at the same time as parameterization:

typedef List<Object> Key;
typedef List<Object> Row;
Map<Key,Row> data = new HashMap<Key,Row>();

Not the end of the world, to be sure. And you could always create concrete classes Key and Row — in fact, if you're writing a database cache, you should probably do just that.
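
As a rough sketch of what those concrete classes might look like: everything here beyond the names Key and Row is my assumption, including the choice to wrap a List<Object> and the getColumn() accessor. The important part is that Key implements equals() and hashCode(), since it is used as a map key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper for the primary-key values of a row.
final class Key {
    private final List<Object> values;

    Key(List<Object> values) {
        // defensive copy, so later changes to the caller's list can't alter the hash code
        this.values = Collections.unmodifiableList(new ArrayList<Object>(values));
    }

    @Override
    public boolean equals(Object obj) {
        return (obj instanceof Key) && values.equals(((Key) obj).values);
    }

    @Override
    public int hashCode() {
        return values.hashCode();
    }
}

// Hypothetical wrapper for the column values of a row.
final class Row {
    private final List<Object> columns;

    Row(List<Object> columns) {
        this.columns = Collections.unmodifiableList(new ArrayList<Object>(columns));
    }

    Object getColumn(int index) {
        return columns.get(index);
    }
}

final class RowCacheDemo {
    public static void main(String[] args) {
        Map<Key, Row> cache = new HashMap<Key, Row>();
        cache.put(new Key(Arrays.<Object>asList(42)),
                  new Row(Arrays.<Object>asList(42, "widget")));

        // a separately constructed but equal key finds the cached row
        Row row = cache.get(new Key(Arrays.<Object>asList(42)));
        System.out.println(row.getColumn(1));
    }
}
```

The declaration of the cache is now readable, and the classes give you a place to put behavior that a bare list can't hold.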

The Good Stuff

At this point you might think my goal has been to “praise with faint damns,” but that's not the case. The ugly bits can get in the way of effectively using parameterized types, and I haven't even mentioned all of them. To better understand the mechanics of parameterization I recommend Angelika Langer's Generics FAQ, which is over 500 pages long.

However, even with the ugly bits, parameterized types are extremely useful. And in my mind, their true value comes when you replace the generic parameters with concrete types. When you do this, you are adding information to your program. Information that the compiler can use to check your work, and information that you as a programmer can use to understand the program flow.

Would you write a program in which each method specified its parameters and return type as Object? If not, then why would you do the same with Collection<T>? I find that it's very rare that I need to work with generic collections — they're typically confined to library methods. Elsewhere, I work with collections of concrete types, and it's senseless to leave those collections generified.
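
To make the comparison concrete, here is the same trivial method written both ways. The names (TotalLength, rawTotal, typedTotal) are invented for this sketch. The two versions do the same work, but only the second tells the reader, and the compiler, what the list is supposed to hold.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class TotalLength {
    // Pre-1.5 style: the signature says nothing about the contents, and every element needs a cast.
    @SuppressWarnings("rawtypes")
    static int rawTotal(List names) {
        int total = 0;
        for (Iterator itx = names.iterator(); itx.hasNext(); ) {
            String name = (String) itx.next();
            total += name.length();
        }
        return total;
    }

    // Parameterized style: the element type is part of the signature, and the cast is gone.
    static int typedTotal(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("foo", "hello");
        System.out.println(rawTotal(names) + " " + typedTotal(names));
    }
}
```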

API with Typed Collections

A typical API — including the boundary layers within an application — deals with specific object types. If that API takes or returns a collection of objects, it's valuable to know the type of objects in the collection. As an example, consider the API of a workflow application: in particular, the object that represents a job, which has a method to retrieve the active tasks in that job.

public interface Job
{
    List<Task> getActiveTasks();
}

This is a case where you may feel that “not really typesafe” is an issue. However, in practice it isn't: as I said above, if you can't trust that your library won't slip some incompatible object into a collection, you have bigger problems. Even in the case of the parameters of library methods, it's not an issue: worst case is that the client passes invalid data, and gets a ClassCastException.

The only viable alternative is to use array types, and they come with their own limitations. Foremost of these is that arrays are fixed size, which is an inconvenience for both the client of our API (which can't easily remove tasks of no interest), and the server (which needs to know the number of tasks before building the array).

Associating Type Information with Untyped Collections

JDK 1.5 was released in 2004, yet there are still many libraries that use non-parameterized collections — including parts of the JDK itself. Often you will know the type of object held by the collection, so there's no reason that your code should be littered with casts. Instead, use a method like this to “attach” parameterization to the collection:

@SuppressWarnings("unchecked")
public static <T> List<T> cast(List<?> list, Class<T> klass)
{
    for (Object obj : list)
    {
        klass.cast(obj);
    }
    return (List<T>)list;
}

The point of this method is to limit the number of SuppressWarnings annotations in your code. The annotation is needed because the compiler isn't very happy about the cast in the return statement: it has no way to know that you validated the contents of the list before making that cast. While you could use this annotation in your mainline code, doing so would hide any real warnings.

The klass parameter serves two purposes. First, because it is parameterized, it establishes the parameterization returned by the method (note that this requires passing either an explicit X.class value or a variable that is similarly parameterized). Second, it allows us to verify that the list does indeed contain only instances of that class, ensuring that the unchecked cast is safe.

In almost all cases, this method is sufficient: it establishes the type of the list for the compiler, and the compiler won't let you subsequently add instances of any other type. If you are truly paranoid that something might add a different object, the Collections class provides the method checkedList(), which wraps the original list in a decorator that prevents addition of other objects. But again, unless your code intentionally bypasses the compiler's checks, this wrapper is unnecessary.
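
Here is a sketch of how the pieces fit together. The cast() method is repeated from above so that the example stands alone; legacyNames() is a hypothetical stand-in for an older library method that returns a raw list, and checkedListRejects() exists only to demonstrate the checkedList() behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CastDemo {
    // Same method as shown earlier, repeated so this example compiles on its own.
    @SuppressWarnings("unchecked")
    static <T> List<T> cast(List<?> list, Class<T> klass) {
        for (Object obj : list) {
            klass.cast(obj);    // throws ClassCastException on the first element of the wrong type
        }
        return (List<T>) list;
    }

    // Hypothetical stand-in for a pre-1.5 library method that returns a raw list.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List legacyNames() {
        List result = new ArrayList();
        result.add("alpha");
        result.add("beta");
        return result;
    }

    // Returns true if the checkedList() wrapper rejects a non-String at the time it is added.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean checkedListRejects(List<String> names) {
        List raw = Collections.checkedList(names, String.class);
        try {
            raw.add(Integer.valueOf(1));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<?> legacy = legacyNames();
        List<String> names = cast(legacy, String.class);   // no casts needed from here on
        for (String name : names) {
            System.out.println(name.toUpperCase());
        }
        System.out.println("checked list rejects Integer: " + checkedListRejects(names));
    }
}
```

The difference between the two approaches is timing: cast() validates the contents once, when you attach the parameterization, while the checkedList() wrapper validates every subsequent addition.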

Template Methods

One of the “Gang of Four” design patterns, Template Method extracts boilerplate code into a superclass method, which then calls subclass methods to perform specific operations. It is useful when the same general sequence of steps can be implemented in different ways, with different data.

As an example, a Swing application might use a subclass of my AsynchronousOperation to perform long-running operations outside of the event dispatch thread. All such operations have the same two steps: (1) load data on the background thread, and (2) display data or handle exception on the event dispatch thread. The templated method in this case is the run() method, shown here sans exception handling:

public abstract class AsynchronousOperation<T>
implements Runnable
{
    public final void run()
    {
        final T result = performOperation();
        SwingUtilities.invokeLater(new Runnable()
        {
            public void run()
            {
                onSuccess(result);
            }
        });
    }

    protected abstract T performOperation();

    protected abstract void onSuccess(T result);
}

A typical subclass (also heavily abridged) might read the contents of a file and use it to update a JTable in the UI:

public class FileLoader
extends AsynchronousOperation<TableModel>
{
    private final JTable _table;

    public FileLoader(JTable table)
    {
        _table = table;
    }

    @Override
    protected TableModel performOperation()
    {
        // load the file and populate the model
    }

    @Override
    protected void onSuccess(TableModel result)
    {
        _table.setModel(result);
    }
}

Note that the methods use concrete types; only the class declaration is parameterized. How, then, are these methods called from the superclass, which refers to them via (erased) type parameters? The answer is that the compiler creates bridge methods that call the actual methods defined by the subclass. Running javap on the subclass shows this:

public class com.kdgregory.example.generics.TemplateMethodExample extends net.sf.swinglib.AsynchronousOperation{
    private javax.swing.JTable _table;
    public com.kdgregory.example.generics.TemplateMethodExample(javax.swing.JTable);
    protected java.lang.Object performOperation()       throws java.lang.Exception;
    protected javax.swing.table.TableModel performOperation()       throws java.lang.Exception;
    protected void onSuccess(java.lang.Object);
    protected void onSuccess(javax.swing.table.TableModel);
}

The Java compiler creates bridge methods whenever you provide a concrete implementation of a parameterized method, including implementations of parameterized interfaces. The bridge method verifies the type of the actual argument (possibly throwing ClassCastException), then calls the concrete method. In the case of parameterized interfaces, such as Comparator<T>, this behavior is similar to a typical pre-1.5 implementation, where the parameter is defined by the interface as Object, and you cast it and assign to a typed variable for use.
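
You can also observe bridge methods without javap, using reflection. The following sketch uses a Comparator<String> implementation; LengthComparator and BridgeDemo are names I made up for the example, while Method.isBridge() is part of the standard reflection API. Calling compare() through a raw reference goes through the bridge, which is where the ClassCastException comes from when an argument has the wrong type.

```java
import java.lang.reflect.Method;
import java.util.Comparator;

// Hypothetical comparator that declares compare() with concrete String parameters.
final class LengthComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        return a.length() - b.length();
    }
}

final class BridgeDemo {
    // Counts the compiler-generated bridge methods named "compare" declared by the class.
    static int countBridgeMethods(Class<?> klass) {
        int count = 0;
        for (Method m : klass.getDeclaredMethods()) {
            if (m.getName().equals("compare") && m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    // Calls compare() through a raw reference, so the call is routed through the bridge method.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rawCallFails(Object a, Object b) {
        Comparator raw = new LengthComparator();
        try {
            raw.compare(a, b);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("bridge methods: " + countBridgeMethods(LengthComparator.class));
        System.out.println("String vs Integer fails: " + rawCallFails("a", Integer.valueOf(1)));
    }
}
```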

Closing Thoughts

You may have noticed that throughout this article I used the term “parameterized types.” Not “generics,” and certainly not “templates.” I did this as a reminder that the chief benefit to these types comes when you apply concrete parameters, rather than using the class in a generic fashion.

To the JVM, code that uses parameterization is no different from pre-1.5 code using explicit casts. The benefit accrues entirely to the programmer: first, by removing the clutter and finger fatigue of those explicit casts. And second, by giving the compiler enough information to double-check your assumptions.

Copyright © Keith D Gregory, all rights reserved