Lecture 15: Abstracting over types

Working with generic types, abstracting over types

15.1 The need for more abstraction

A few weeks ago, in Lecture 5: Methods for self-referential lists, we implemented methods on Books and ILoBooks to answer some simple questions:

Produce the list of all Books published before a given year
Sort this list of Books by their authors’ names

These methods were neatly structurally recursive, and were the translation to Java of functions we had written in 1114 before.

But they weren’t particularly reusable: if we wanted to sort a list of Books by their prices, or produce a list of Books written by a particular author, we’d have to implement these methods a second time, with slightly different names and slightly different signatures and slightly different code, that nevertheless did nearly exactly the same thing as the original methods.

In the past two lectures, we’ve seen how to use function objects to generalize these two questions:

Produce the list of all Books that satisfy the given IBookPredicate predicate
Sort this list of Books by the given IBookComparator comparison function

Now we only have to implement these methods once, and we can create as many classes as we’d like that implement IBookPredicate or IBookComparator—one class for each particular predicate or sorting order we’d like—to use with these methods.

But again these methods and interfaces aren’t reusable enough: if we’d like to sort a list of Runners, or a list of Circles, or a list of ancestry trees, we’d have to re-implement these methods in each ILoWhatever interface, and define new predicate and comparator interfaces (IRunnerPredicate, ICircleComparator, etc.) that have slightly different names and slightly different signatures, but nearly identical purposes.

Do Now!
Write out explicitly the predicate and comparator interfaces for Runners and Authors.

What to do? Whenever we encounter such duplication, we know that the design recipe for abstraction suggests we find the parts that stay the same and find the parts that vary, and abstract away the parts that vary.

Do Now!
What varies between the interfaces you just defined?

15.2 Introducing generics

The only differences between an IRunnerPredicate and an IBookPredicate are their names, and the type of the argument supplied to the apply methods:

interface IBookPredicate {
boolean apply(Book b);
}
interface IRunnerPredicate {
boolean apply(Runner r);
}

We need a mechanism to express abstraction over this type, to define a general-purpose IPred interface that describes predicates over any particular type. In Java, such abstractions are known as generics, because they generalize away from specific types. We define a simple generic interface as follows:

interface IPred<T> {
boolean apply(T t);
}

Typical Java convention is to use T to be the name of an arbitrary type, and if additional type parameters are needed, they are often named U, V, or S (simply because those letters are near T in the alphabet).

We read this declaration in words as “the interface IPred of T”. The syntax <T> states that this interface is parameterized by a type, which we will name T within the definition of this interface. Said another way, T is bound as a type parameter within the scope of this interface (much like regular parameters to methods are bound within the scope of the method).

Note that the particular name T is meaningless: we could replace it with any other name we wanted, as long as we replaced it consistently throughout the interface definition. In other words, the definition above is exactly the same as

interface IPred<WhateverNameIWant> {
boolean apply(WhateverNameIWant t);
}

Do Now!
Suppose we forgot to write the <T> syntax. What would Java report as the error if we defined
interface IPred {
boolean apply(T t);
}
instead? (Read it carefully!)

The definition above is not parameterized, so Java thinks that T must be the actual name of some actual class or interface. If no such class or interface is defined, Java will complain that it can’t find any such definition.

15.3 Implementing generic interfaces: specialization

When we want to use this interface in our code, we must specialize the generic interface to some specific type. For example, we could revise our BookByAuthor as follows:

class BookByAuthor implements IPred<Book> {
public boolean apply(Book b) { ... }
...
}

Notice that we do not say implements IPred — such a statement is simply meaningless.

Do Now!
Try defining this in IntelliJ, and see what error message is generated.

What is an IPred — what kind of data is it a predicate about? Merely writing “IPred” doesn’t provide enough information; we must specialize the interface when using it.

Notice also that the argument to apply is a Book, and not a T — when we specialized the interface, we consistently substituted Book for T everywhere that it was mentioned in the interface.

15.4 Instantiating generic interfaces

In our Examples class, we now need to update our definitions: we can no longer write

IBookPredicate byAuthor = new BookByAuthor(...);

We must instead write

IPred<Book> byAuthor = new BookByAuthor(...);

since we have upgraded our code from the original interface to this new generic one.

Do Now!
Revise our definition of the RunnerIsInFirst50 predicate over Runners, and revise the examples that use it.

IPred<Runner> inFirst50 = new RunnerIsInFirst50(...);

The parameters to a generic interface are part of the type, and getting them wrong will result in a type error. For example, writing

IPred<Runner> oops = new BookByAuthor(...);

will result in a type error, because BookByAuthor implements IPred<Book>, not IPred<Runner>.
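To see specialization and instantiation side by side, here is a minimal self-contained sketch. The pared-down Book and Runner classes, their fields, and the constructor arguments are assumptions made for illustration only; the classes from earlier lectures carry more data.

```java
// The generic predicate interface from this lecture.
interface IPred<T> {
  boolean apply(T t);
}

// Hypothetical, pared-down data classes (assumed for this sketch only).
class Book {
  String title;
  String author;

  Book(String title, String author) {
    this.title = title;
    this.author = author;
  }
}

class Runner {
  String name;
  int position;

  Runner(String name, int position) {
    this.name = name;
    this.position = position;
  }
}

// Specializes IPred to Book: T is replaced by Book throughout.
class BookByAuthor implements IPred<Book> {
  String author;

  BookByAuthor(String author) {
    this.author = author;
  }

  public boolean apply(Book b) {
    return b.author.equals(this.author);
  }
}

// Specializes IPred to Runner: T is replaced by Runner throughout.
class RunnerIsInFirst50 implements IPred<Runner> {
  public boolean apply(Runner r) {
    return r.position <= 50;
  }
}

class ExamplesPredicates {
  IPred<Book> byAuthor = new BookByAuthor("Austen");
  IPred<Runner> inFirst50 = new RunnerIsInFirst50();
  // Uncommenting the next line produces a compile-time type error,
  // because BookByAuthor is an IPred<Book>, not an IPred<Runner>:
  // IPred<Runner> oops = new BookByAuthor("Austen");
}
```

Both predicate classes share the single IPred interface; only the type argument differs.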

But there’s still more duplication we can eliminate...

15.5 Generic classes: implementing lists

We still have ILoString and ILoRunner (and ILoBook and ILoShape and many others) lying around. We’ve had to implement methods like sort and filter and length on all of them. Now with generics, we can finally resolve all that duplication.

The common features of all of these interfaces are pretty clear: they all represent lists of something. So we’ll define a generic interface IList<T> as follows:

interface IList<T> {
IList<T> filter(IPred<T> pred);
IList<T> sort(IComparator<T> comp);
int length();
...
}

How can we implement the classes? If we just write

class MtList implements IList<T> { ... }

Java will complain that it does not know what T is. (Unless, of course, you’ve defined a class or interface named T. But why would you give a class such a meaningless name?) The solution is that we need to make the class itself be parameterized:

class MtList<T> implements IList<T> {
public IList<T> filter(IPred<T> pred) { return this; }
public IList<T> sort(IComparator<T> comp) { return this; }
public int length() { return 0; }
...
}

This definition says “An MtList of T is a list of T.” And now we can implement our methods once and for all on this class, and not worry about implementing them ever again. Except for ConsList, of course.

Do Now!
Define a generic ConsList class.

class ConsList<T> implements IList<T> {
T first;
IList<T> rest;
ConsList(T first, IList<T> rest) {
this.first = first;
this.rest = rest;
}
...
}

What type should the fields be? A ConsList of T items ought to have a T value as its first field, and another list of T items as its rest.

When we construct a new list now, we need to specify the type of the items in it:

// In Examples
IList<String> abc = new ConsList<String>("a",
new ConsList<String>("b",
new ConsList<String>("c", new MtList<String>())));

Defining methods such as filter for ConsList<T> is straightforward, as long as we remember to specify the type of the new ConsList being constructed:

// In ConsList<T>
public IList<T> filter(IPred<T> pred) {
if (pred.apply(this.first)) {
return new ConsList<T>(this.first, this.rest.filter(pred));
}
else {
return this.rest.filter(pred);
}
}

Do Now!
Implement sort for ConsList<T>. Implement whatever helper methods you need, as well.
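One possible approach, sketched here as an insertion sort, is to sort the rest of the list and then insert the first item into that sorted result. The shape of IComparator<T> (a single compare method returning a negative number when the first argument comes first) is assumed from the previous lecture, and the insert helper and the StringsAlphabetically class are hypothetical names introduced for this sketch.

```java
// Assumed shape of the comparator interface from the previous lecture:
// negative if t1 comes before t2, zero if tied, positive otherwise.
interface IComparator<T> {
  int compare(T t1, T t2);
}

interface IList<T> {
  IList<T> sort(IComparator<T> comp);

  // Hypothetical helper: inserts item into this already-sorted list.
  IList<T> insert(T item, IComparator<T> comp);

  int length();
}

class MtList<T> implements IList<T> {
  public IList<T> sort(IComparator<T> comp) {
    return this;
  }

  public IList<T> insert(T item, IComparator<T> comp) {
    return new ConsList<T>(item, this);
  }

  public int length() {
    return 0;
  }
}

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;

  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }

  // Sort the rest, then put this.first into its proper place.
  public IList<T> sort(IComparator<T> comp) {
    return this.rest.sort(comp).insert(this.first, comp);
  }

  public IList<T> insert(T item, IComparator<T> comp) {
    if (comp.compare(item, this.first) <= 0) {
      return new ConsList<T>(item, this);
    }
    else {
      return new ConsList<T>(this.first, this.rest.insert(item, comp));
    }
  }

  public int length() {
    return 1 + this.rest.length();
  }
}

// Hypothetical comparator used to try the sort out.
class StringsAlphabetically implements IComparator<String> {
  public int compare(String s1, String s2) {
    return s1.compareTo(s2);
  }
}
```

Nothing in sort or insert mentions a specific element type: everything type-specific lives in the comparator object.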

15.6 Generic interfaces with more than one parameter

Suppose we wanted to produce the list of all Runners’ names. Without generics, we might come up with an interface definition like this:

interface IRunner2String {
String apply(Runner r);
}

The name is reasonably suggestive: it takes a Runner and produces a String. But we can see that this way lies the same code duplication problems that we had before: soon we’ll want an IBook2String to get titles, or an IBook2Author to get Authors, and again we have nearly identical definitions that differ only in their types.

Instead, let’s define the following interface, that’s generic in two type parameters:

interface IFunc<A, R> {
R apply(A arg);
}

This interface describes function objects that take an argument of type A and return a value of type R. In 1114 notation, this describes functions with the signature A -> R. Now we can define a class that implements this interface:

class RunnerName implements IFunc<Runner, String> {
public String apply(Runner r) { return r.name; }
}

And we could use it to get the list of Strings of the names from a list of Runners: this is just one specialization of a generic map method on IList<T>.

Writing the signature for map is a bit tricky:

// In IList<T>:
??? map(IFunc<T, ???> f);

We’d like to put some other type parameter in for the ???, but the only type parameter we have so far is T. We need some additional syntax to define a type parameter just for this method:

// In IList<T>:
<U> IList<U> map(IFunc<T, U> f);

All the angle brackets can be a bit hard to read. Read this line as “In IList<T>, map is a method parameterized by U, that takes a function from T values to U values, and produces an IList<U> as a result.”

Do Now!
I claim that we need this parameter “just for this method”. But another possibility seems to be to add U as a type parameter to IList itself, like this:
interface IList<T, U> {
...
IList<U> map(IFunc<T, U> f);
...
}
What goes wrong with that approach? (There are at least two big problems.)

First, having two type parameters for IList doesn’t make sense: we want a “list of Strings”, or a “list of Runners”, not a “list of Runners/Strings” (whatever that even means)!

Second, it’s too restrictive. We want to be able to map an IList<Book> into an IList<String> to get all the titles, but also to map it to an IList<Author> to get all the authors. There’s just one list of books involved; we shouldn’t need two separate types (i.e., IList<Book, String> and IList<Book, Author>) to get the two different mapping behaviors.

Now we can implement this for ConsList<T>:

// In ConsList<T>
public <U> IList<U> map(IFunc<T, U> f) {
return new ConsList<U>(f.apply(this.first), this.rest.map(f));
}

What should we do for the empty case? Can’t we just use our usual implementation?

// In MtList<T>
public <U> IList<U> map(IFunc<T, U> f) { return this; }

No!

Do Now!
Why not?

Just because we have an empty list of T does not mean that we have an empty list of U — the types don’t match, and we get a compile error. Instead we need to write:

// In MtList<T>
public <U> IList<U> map(IFunc<T, U> f) { return new MtList<U>(); }

Much better.
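Putting the pieces together, the sketch below writes the method-level type parameter out explicitly as <U> and maps a single list of Runners to two different result types. The pared-down Runner class (with just a name and an age) and the RunnerAge function object are assumptions introduced for illustration.

```java
interface IFunc<A, R> {
  R apply(A arg);
}

interface IList<T> {
  // <U> declares a type parameter that belongs to this method only.
  <U> IList<U> map(IFunc<T, U> f);

  int length();
}

class MtList<T> implements IList<T> {
  public <U> IList<U> map(IFunc<T, U> f) {
    // An empty list of T is not an empty list of U, so build a new one.
    return new MtList<U>();
  }

  public int length() {
    return 0;
  }
}

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;

  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }

  public <U> IList<U> map(IFunc<T, U> f) {
    return new ConsList<U>(f.apply(this.first), this.rest.map(f));
  }

  public int length() {
    return 1 + this.rest.length();
  }
}

// Hypothetical, pared-down Runner (assumed for this sketch only).
class Runner {
  String name;
  int age;

  Runner(String name, int age) {
    this.name = name;
    this.age = age;
  }
}

class RunnerName implements IFunc<Runner, String> {
  public String apply(Runner r) {
    return r.name;
  }
}

// Hypothetical second function object, with a different result type.
class RunnerAge implements IFunc<Runner, Integer> {
  public Integer apply(Runner r) {
    return r.age;
  }
}
```

Given an IList<Runner> called runners, the call runners.map(new RunnerName()) has type IList<String>, while runners.map(new RunnerAge()) has type IList<Integer>. Java picks U separately at each call, which is exactly what a class-level second parameter could not do.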

15.7 Digression: lists of numbers and booleans

Java will not let us write the following definitions:

// Will not work
IList<int> ints = new ConsList<int>(1,
new ConsList<int>(4, new MtList<int>()));
IList<double> dbls = new ConsList<double>(1.5,
new ConsList<double>(4.3, new MtList<double>()));
IList<boolean> bools = new ConsList<boolean>(true, new MtList<boolean>());

This is because int and double and boolean are all primitive types, and Java only permits class or interface types as the type parameters to generics. (The technical reasons for this limitation are beyond the scope of this course, or even of object-oriented design. But if you are interested, take a compilers course or a programming languages course, which will explain the subtle details of what’s actually happening here.)

Instead, Java defines classes Integer, Boolean and Double, that essentially are just “wrappers” for the primitive values. So instead we can write

IList<Integer> ints = new ConsList<Integer>(1,
new ConsList<Integer>(4, new MtList<Integer>()));
IList<Double> dbls = new ConsList<Double>(1.5,
new ConsList<Double>(4.3, new MtList<Double>()));
IList<Boolean> bools = new ConsList<Boolean>(true, new MtList<Boolean>());
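The same substitution works for function objects. In the small sketch below, Java converts between int and Integer automatically (autoboxing and unboxing), which is why the literals above can be passed where an Integer is expected. The IsEven and ExamplesBoxing classes are hypothetical names introduced for illustration.

```java
interface IPred<T> {
  boolean apply(T t);
}

// Hypothetical predicate over Integer; IPred<int> would not compile.
class IsEven implements IPred<Integer> {
  public boolean apply(Integer n) {
    // n is unboxed to an int for the arithmetic
    return n % 2 == 0;
  }
}

class ExamplesBoxing {
  IPred<Integer> isEven = new IsEven();
  // The int literals 4 and 7 are boxed to Integer at each call
  boolean fourIsEven = this.isEven.apply(4);
  boolean sevenIsEven = this.isEven.apply(7);
}
```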

Do Now!
Define a function object to compute the perimeter of a Circle, and use it to compute a list of the perimeters of a list of Circles. (Ignore IShape for now.)

class CirclePerimeter implements IFunc<Circle, Double> {
public Double apply(Circle c) { return 2.0 * Math.PI * c.radius; }
}
// In Examples
IList<Circle> circs = new ConsList<Circle>(new Circle(3, 4, 5),
new MtList<Circle>());
IList<Double> circPerims = circs.map(new CirclePerimeter());

15.8 Subtleties and challenges with generic types

Generic data types like IList<T> are very useful: they let us implement once-and-for-all a whole suite of functionality for “lists of anything”, and we’ll never have to implement such functionality again. But they come with some down-sides: the only methods we can implement are the ones that make sense for all possible types T. Let’s see what happens in more detail.

15.8.1 Flawed attempt 1

For example, in Lecture 5 we implemented a method totalPrice() for ILoBook. How could we do that for IList<Book>? If we naively write:

interface IList<T> {
...
int totalPrice(); // Oh by-the-way, this method only makes sense when T is Book!
...
}
class MtList<T> {
...
public int totalPrice() {
return 0; // because an empty list always has zero price
}
...
}
class ConsList<T> {
...
public int totalPrice() {
... ??? ...
}
...
}

We get stuck at the implementation of totalPrice in the ConsList<T> class.

Do Now!
Why? What precisely is in our template? (Specifically, what are we allowed to do with this.first?)

Because ConsList<T> must be defined for all types T, we don’t have anything we can do with values of type T — since we don’t know what T is!

15.8.2 Flawed attempt 2

If the problem is that we don’t know what T is, then maybe we can specify it? Perhaps we can design the following class:

class ConsLoBook extends ConsList<Book> {
...
public int totalPrice() {
// Now we know that this.first is a Book, so we can call its price() method
return this.first.price() + this.rest.totalPrice();
}
}

In other words, we inherit all the general-purpose functionality from the generic ConsList class — and we specify that we’re only attempting to extend it when T is Book.

Do Now!
What goes wrong now?

We left out the constructor for this class, and that gives us a hint as to what goes wrong:

class ConsLoBook extends ConsList<Book> {
ConsLoBook(Book first, IList<Book> rest) {
super(first, rest);
}
}

The rest of this object is inherited from ConsList<Book>, and so is specified to be of type IList<Book>. As a result, while we can successfully invoke this.first.price(), we can’t invoke this.rest.totalPrice(), because there is no such method in the generic IList<T> interface!

15.8.3 Successful attempt

The problem we face is two-fold: we can’t define something type-specific like totalPrice in the generic interface, and if we try to use inheritance to specialize to a specific type, we still have an overly-general type for our fields.

Do Now!
Pretend for a moment that we had a [List-of Book] in Racket. Can you express total-price as a function in Racket, using foldr?

Do Now!
Translate your answer above from Racket to Java.

In Racket, we’d define total-price as

(define (total-price lob)
(foldr (λ(book total) (+ (book-price book) total)) 0 lob))

Perhaps we can define foldr as a method on IList<T>, and supply an appropriate function object to compute the total price?

// Interface for two-argument function-objects with signature [A1, A2 -> R]
interface IFunc2<A1, A2, R> {
R apply(A1 arg1, A2 arg2);
}

// In IList<T>
<U> U foldr(IFunc2<T, U, U> func, U base);

// In MtList<T>
public <U> U foldr(IFunc2<T, U, U> func, U base) {
return base;
}

// In ConsList<T>
public <U> U foldr(IFunc2<T, U, U> func, U base) {
return func.apply(this.first,
this.rest.foldr(func, base));
}

class SumPricesOfBooks implements IFunc2<Book, Integer, Integer> {
public Integer apply(Book b, Integer sum) {
return b.price() + sum;
}
}

// Example of using foldr and the function object to obtain the total price
class Utils {
Integer totalPrice(IList<Book> books) {
return books.foldr(new SumPricesOfBooks(), 0);
}
}

The IFunc2<A1, A2, R> interface is much like the IFunc<A, R> interface, except it represents functions with two arguments, of potentially (but not necessarily) different types. The signature of the foldr method is analogous to its signature in Racket: it takes a function(-object) from the element type (T) and the result type (U), to the result type; and it takes an initial value of the result type; and it produces something of the result type.

The SumPricesOfBooks class dodges the two problems we had earlier: it is what specializes to work with Books, and it avoids any problems of trying to access this.first or this.rest (because on its own, it has nothing to do with ILists at all). In a real sense, the problem with our original IList<T> interface wasn’t that it was too general: it wasn’t general enough, and didn’t have the useful foldr method!
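For reference, here is the foldr design assembled into one self-contained sketch, with the method-level type parameter written explicitly as <U>. The pared-down Book class is an assumption, and CountItems together with the length method in Utils are hypothetical additions that show the same foldr being reused with a different function object.

```java
// Two-argument function objects with signature [A1, A2 -> R]
interface IFunc2<A1, A2, R> {
  R apply(A1 arg1, A2 arg2);
}

interface IList<T> {
  <U> U foldr(IFunc2<T, U, U> func, U base);
}

class MtList<T> implements IList<T> {
  public <U> U foldr(IFunc2<T, U, U> func, U base) {
    return base;
  }
}

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;

  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }

  public <U> U foldr(IFunc2<T, U, U> func, U base) {
    return func.apply(this.first, this.rest.foldr(func, base));
  }
}

// Hypothetical, pared-down Book (assumed for this sketch only).
class Book {
  String title;
  int price;

  Book(String title, int price) {
    this.title = title;
    this.price = price;
  }

  int price() {
    return this.price;
  }
}

class SumPricesOfBooks implements IFunc2<Book, Integer, Integer> {
  public Integer apply(Book b, Integer sum) {
    return b.price() + sum;
  }
}

// Hypothetical second function object: counts items of any type T.
class CountItems<T> implements IFunc2<T, Integer, Integer> {
  public Integer apply(T item, Integer count) {
    return 1 + count;
  }
}

class Utils {
  Integer totalPrice(IList<Book> books) {
    return books.foldr(new SumPricesOfBooks(), 0);
  }

  // Works for lists of any element type, because CountItems never
  // looks inside the item it is given.
  <T> Integer length(IList<T> items) {
    return items.foldr(new CountItems<T>(), 0);
  }
}
```

Only SumPricesOfBooks knows anything about Books; the list classes stay fully generic.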

15.9 Summary

Generic types let us describe families of related types that define nearly the same thing, but differ slightly in the types inside them. We can use generic types to define data, like IList<T>, and to define function object interfaces like IFunc<T, U>. This lets us remove much of the “boilerplate” repetitive code we have had to deal with up until now.

Next time, we’ll combine the material of these past four lectures to answer a seemingly simple question: How can I take an IList<IShape> and get a list of the perimeters of the shapes? (Remember, we only implemented area() as a method!)

Exercise
Try to design a solution to this problem. Where does the pattern above get stuck? How might the techniques from the last few lectures help?
