Lecture 18: Mutation inside structures

Designing methods to modify objects, testing mutation methods, and indirect cycles

In the last lecture, we successfully constructed a Book and an Author that referred to each other directly, creating a cyclic data structure. But we only managed to construct that cycle in the examples class, which isn’t particularly useful if we want to use these cyclic data structures anywhere else in our program. In this lecture, we’ll see a better way to construct these cycles that is more reusable, more testable, and less error-prone.

18.1 Keeping mutation contained in the class where it matters

Let’s revisit the test we wrote in the last lecture, that created our first cycle between a book and an author:

// In ExamplesClass
boolean testBookAuthorCycle(Tester t) {
  Author knuth = new Author("Donald", "Knuth", 1938, null);
  Book taocp =
      new Book("The Art of Computer Programming (volume 1)", 100, 2, knuth);
  knuth.book = taocp;
  return t.checkExpect(knuth.book.author, knuth);
}

But Knuth has written many books, not just one. So we’ll need to reuse our example objects: it makes sense to change them to be defined as fields of the examples class, rather than local variables. (Implicitly, this is why we’ve been defining our example data as fields of the examples classes all along, so that they would be available and in scope for all of our tester methods.) So our examples class might look something like this:

class ExampleBooks {
  Author knuth = new Author("Donald", "Knuth", 1938, null); // no books yet
  Book taocp1 = new Book("The Art of Computer Programming (volume 1)", 100, 2, knuth);
  Book taocp2 = new Book("The Art of Computer Programming (volume 2)", 100, 2, knuth);
  boolean testTaocp1(Tester t) {
    ...
  }
  boolean testTaocp2(Tester t) {
    ...
  }
}

Since we’ve lifted our definitions out of the test methods, there’s very little code that’s needed in the test methods themselves:

// In ExampleBooks
boolean testTaocp1(Tester t) {
  this.knuth.book = this.taocp1;
  return t.checkExpect(this.knuth.book.author, this.knuth);
}
boolean testTaocp2(Tester t) {
  this.knuth.book = this.taocp2;
  return t.checkExpect(this.knuth.book.author, this.knuth);
}

Even this little bit of code is clearly repetitive, though. Worse, it’s not particularly clear why we have these assignment statements: because we’ve lifted the definitions of the data out of the method, the assignment statements are no longer adjacent to the declarations of the data they are modifying, so it’s no longer “obvious” that we’re trying to create a cycle. We should abstract these assignments into a helper method whose purpose is more explicit.

The first question we have to address is, what should this helper method do? We want it to modify the Author to update its book field. Second, where should this method be defined? It isn’t really a method that belongs in the examples class; it applies to Authors and we want to be able to use it throughout our program, so the Author class seems the most appropriate.

Third, what value should this method return? We have the Author object we want to modify. We know what Book we want to use. Unlike every single method we have defined so far, we really don’t need this method to construct a new value for us. All we want it to do is to have the side effect of modifying the Author — we don’t really want a return value at all! To accommodate this situation, Java defines a new keyword, void, which can be used to indicate that a method does not return any value. This type tells the reader of the code that the method’s purpose must be solely to have side effects — after all, if a method had no side effects and returned no value, of what use would it be?

When we design methods with side effects, we need to change how we phrase our purpose statement. Instead of saying what the method computes as a result, we’ll specify what the method’s side effects are:

// In Author
// EFFECT: modifies this Author's book field to refer to the given Book
void updateBook(Book b) {
  this.book = b;
}

Now we can rewrite our two test methods to use this helper instead of directly writing assignment statements:

// In ExampleBooks
boolean testTaocp1(Tester t) {
  this.knuth.updateBook(this.taocp1);
  return t.checkExpect(this.knuth.book.author, this.knuth);
}
boolean testTaocp2(Tester t) {
  this.knuth.updateBook(this.taocp2);
  return t.checkExpect(this.knuth.book.author, this.knuth);
}

Notice that the method invocation this.knuth.updateBook(this.taocp1) is just standing by itself on a line of its own. We are treating this method call as a statement, something that runs and computes no value — which makes sense, since we just defined this method not to return a value.
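
As a minimal sketch of what this looks like outside the tester library, the following self-contained program uses stand-in Author and Book classes with the fields seen so far. The VoidCallDemo class and its main method are assumptions added only to make the example runnable; they are not part of the lecture’s code.

```java
// Stand-ins for the lecture's classes, with the fields used so far.
class Author {
  String first;
  String last;
  int yob;
  Book book;

  Author(String first, String last, int yob, Book book) {
    this.first = first;
    this.last = last;
    this.yob = yob;
    this.book = book;
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    this.book = b;
  }
}

class Book {
  String title;
  int price;
  int quantity;
  Author author;

  Book(String title, int price, int quantity, Author author) {
    this.title = title;
    this.price = price;
    this.quantity = quantity;
    this.author = author;
  }
}

// Hypothetical driver class, used here in place of the tester library.
class VoidCallDemo {
  public static void main(String[] args) {
    Author knuth = new Author("Donald", "Knuth", 1938, null);
    Book taocp1 =
        new Book("The Art of Computer Programming (volume 1)", 100, 2, knuth);

    // A call to a void method is a statement: it runs only for its effect.
    knuth.updateBook(taocp1);
    // Author a = knuth.updateBook(taocp1); // would not compile: void has no value

    System.out.println(knuth.book.author == knuth); // prints "true"
  }
}
```

Trying to use the call as an expression, as in the commented-out line, is rejected by the compiler, which is one way Java enforces that void methods exist for their effects alone.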

These tests are now much better-written code, in that we know (by looking at the effect statement for updateBook) what we expect the tests to accomplish. But are these tests good enough?

18.2 Testing methods with side effects: test fixtures

The tests above only check that after calling updateBook, there is a cycle from the author to its book back to the author. But how do we know that the cycle was actually created by updateBook? Perhaps the cycle was already there (created at some earlier point in our program’s execution), and updateBook actually did nothing? How can we be certain that the side effect we wanted to happen actually did happen?

The only way to be sure that a particular piece of code is responsible for causing a particular side effect to happen is to write tests both before and after running the code being tested. Each test must be treated like a carefully controlled experiment: we must

First, ensure the initial conditions of the test are in a known state,
Second, run the code being tested and let it modify those initial conditions, and
Last, test that the expected changes to the state of the program have occurred.

In our case, we need to be sure that before invoking updateBook, knuth’s book was null, and after running it, that knuth’s book was the intended one and that there is now a cycle:

// In ExampleBooks
boolean testTaocp1(Tester t) {
  // 1. Check that the initial conditions are as expected
  boolean initialConditions =
      t.checkExpect(this.knuth.book, null);
  // 2. Modify them
  this.knuth.updateBook(this.taocp1);
  // 3. Check that the expected changes have occurred
  boolean finalConditions =
      t.checkExpect(this.knuth.book, this.taocp1) &&
      t.checkExpect(this.knuth.book.author, this.knuth);
  return initialConditions && finalConditions;
}

This test is far more thorough than our earlier one. It’s still not complete, though: to be truly thorough, we ought to test that nothing else in the program has changed. That’s impractical, so we use our judgment and the purpose and effect statements of the code being tested to guide us in writing effective tests.

Let’s write the second test for taocp2:

// In ExampleBooks
boolean testTaocp2(Tester t) {
  // 1. Check that the initial conditions are as expected
  boolean initialConditions =
      t.checkExpect(this.knuth.book, null);
  // 2. Modify them
  this.knuth.updateBook(this.taocp2);
  // 3. Check that the expected changes have occurred
  boolean finalConditions =
      t.checkExpect(this.knuth.book, this.taocp2) &&
      t.checkExpect(this.knuth.book.author, this.knuth);
  return initialConditions && finalConditions;
}

Bizarrely, if we run these tests in IntelliJ it reports test failures. Even weirder, if we run the tests repeatedly, it might report different failures each time! What’s happening?

The tester library creates an instance of our examples class, and then runs our test methods one after another. When the examples instance is created, our code creates the Author object and the two Book objects. Suppose testTaocp1 runs first. It checks that knuth.book is null (which it is), then updates it and checks that the update happened correctly (which it does). Then testTaocp2 runs. When it checks the initial conditions, we see that knuth.book is not null: it’s taocp1, just as we set it in the previous test method! Moreover, because the tester library (deliberately) does not guarantee what order it runs our test methods in, we might see that testTaocp2 runs first, and so the test failure happens in testTaocp1.
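
The order dependence can be reproduced without the tester library. The sketch below is a simplification under stated assumptions: Author and Book are cut down to the fields needed here, the checks use plain reference comparison (==) rather than checkExpect, and SharedExamples and OrderDemo are hypothetical names introduced for illustration.

```java
// Simplified stand-ins: only the fields this demonstration needs.
class Author {
  String last;
  Book book; // null until updateBook is called

  Author(String last) {
    this.last = last;
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    this.book = b;
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author author) {
    this.title = title;
    this.author = author;
  }
}

// One shared set of examples, like the single examples instance
// that runs all of the test methods.
class SharedExamples {
  Author knuth = new Author("Knuth");
  Book taocp1 = new Book("volume 1", this.knuth);
  Book taocp2 = new Book("volume 2", this.knuth);

  boolean testTaocp1() {
    boolean initialConditions = this.knuth.book == null;
    this.knuth.updateBook(this.taocp1);
    return initialConditions && this.knuth.book == this.taocp1;
  }

  boolean testTaocp2() {
    boolean initialConditions = this.knuth.book == null;
    this.knuth.updateBook(this.taocp2);
    return initialConditions && this.knuth.book == this.taocp2;
  }
}

class OrderDemo {
  public static void main(String[] args) {
    SharedExamples a = new SharedExamples();
    System.out.println(a.testTaocp1()); // prints "true"
    System.out.println(a.testTaocp2()); // prints "false": knuth.book is already taocp1

    SharedExamples b = new SharedExamples();
    System.out.println(b.testTaocp2()); // prints "true"
    System.out.println(b.testTaocp1()); // prints "false": knuth.book is already taocp2
  }
}
```

Whichever method runs second fails its initial-conditions check, because the first method’s side effect is still visible in the shared knuth object.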

To fix this, we cannot merely check that the initial conditions of our test are good: we have to actually ensure that they are. In particular, we need to “reset” our examples before each test of a method with side effects, to make sure that we’re testing the side effects of just that method and not the cumulative side effects of preceding tests. But writing the code to re-initialize every example at the start of every test is tedious and repetitive. Instead, we should abstract that initialization into a helper method, which we invoke at the start of every test method:

class ExampleBooks {
  Author knuth;
  Book taocp1, taocp2;

  // EFFECT: Sets up the initial conditions for our tests, by re-initializing
  // knuth, taocp1 and taocp2
  void initTestConditions() {
    this.knuth = new Author("Donald", "Knuth", 1938, null);
    this.taocp1 =
        new Book("The Art of Computer Programming (volume 1)", 100, 2, this.knuth);
    this.taocp2 =
        new Book("The Art of Computer Programming (volume 2)", 120, 3, this.knuth);
  }
  boolean testTaocp1(Tester t) {
    // 1. Set up the initial conditions
    this.initTestConditions();
    // 2. Modify them
    this.knuth.updateBook(this.taocp1);
    // 3. Check that the expected changes have occurred
    return t.checkExpect(this.knuth.book, this.taocp1) &&
        t.checkExpect(this.knuth.book.author, this.knuth);
  }
  boolean testTaocp2(Tester t) {
    // 1. Set up the initial conditions
    this.initTestConditions();
    // 2. Modify them
    this.knuth.updateBook(this.taocp2);
    // 3. Check that the expected changes have occurred
    return t.checkExpect(this.knuth.book, this.taocp2) &&
        t.checkExpect(this.knuth.book.author, this.knuth);
  }
}

Now all tests pass, regardless of the order in which they run. Setting up a fixed reference point against which to test methods with side effects is called creating a test fixture, and it is an essential step in writing consistent, reproducible test cases.
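
To see that a fixture really does make the order irrelevant, here is a small runnable sketch. It assumes cut-down Author and Book classes and compares references with == instead of calling checkExpect; FixtureExamples and FixtureDemo are hypothetical names used only for this illustration.

```java
// Simplified stand-ins: only the fields this demonstration needs.
class Author {
  String last;
  Book book;

  Author(String last) {
    this.last = last;
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    this.book = b;
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author author) {
    this.title = title;
    this.author = author;
  }
}

class FixtureExamples {
  Author knuth;
  Book taocp1;
  Book taocp2;

  // EFFECT: re-initializes knuth, taocp1 and taocp2 to fresh objects
  void initTestConditions() {
    this.knuth = new Author("Knuth");
    this.taocp1 = new Book("volume 1", this.knuth);
    this.taocp2 = new Book("volume 2", this.knuth);
  }

  boolean testTaocp1() {
    this.initTestConditions();
    this.knuth.updateBook(this.taocp1);
    return this.knuth.book == this.taocp1 && this.knuth.book.author == this.knuth;
  }

  boolean testTaocp2() {
    this.initTestConditions();
    this.knuth.updateBook(this.taocp2);
    return this.knuth.book == this.taocp2 && this.knuth.book.author == this.knuth;
  }
}

class FixtureDemo {
  public static void main(String[] args) {
    FixtureExamples ex = new FixtureExamples();
    // Run the checks in one order, then repeat the first one.
    System.out.println(ex.testTaocp1()); // prints "true"
    System.out.println(ex.testTaocp2()); // prints "true"
    System.out.println(ex.testTaocp1()); // prints "true"
  }
}
```

Because each check begins by replacing the examples with fresh objects, nothing left behind by an earlier check can influence a later one.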

18.3 Subtleties of mutable data

In our current model of Books and Authors, each Author can only write one book — there isn’t any place in our data type definitions to include additional books.

Do Now!
How could we revise our data types to accommodate more prolific authors?

As a result, we might set ourselves up for a particularly subtle form of bug. Consider the following test, where we try to have one author write two books:

// In ExampleBooks
boolean testTwoBooks(Tester t) {
  this.initTestConditions();
  // Test 1: check that knuth hasn't written any books yet
  boolean test1 = t.checkExpect(this.knuth.book, null);
  // Modify knuth to know about volume 1
  this.knuth.updateBook(this.taocp1);
  // Test 2: check that knuth's book was written by knuth
  boolean test2 = t.checkExpect(this.knuth.book.author, this.knuth);
  // Modify knuth to know about volume 2
  this.knuth.updateBook(this.taocp2);
  // Test 3: check that knuth's new book was written by knuth
  boolean test3 = t.checkExpect(this.knuth.book.author, this.knuth);
  // Test 4: check that both books' authors wrote those books
  boolean test4 =
      t.checkExpect(this.taocp1.author.book, this.taocp1) &&
      t.checkExpect(this.taocp2.author.book, this.taocp2);
  return test1 && test2 && test3 && test4;
}

Do Now!
Which of these tests pass, and which of these tests fail?

When we initialize the test conditions, the book field of the knuth object is indeed initialized to null, so test 1 definitely passes. Both Books are created with knuth as their author. And after the first call to updateBook, it is definitely the case that knuth’s book is taocp1, so test 2 passes. After the second call, test 3 ought to pass by the same reasoning. Unfortunately, test 4 fails. We know that taocp2’s author is knuth, and knuth’s book is currently taocp2, so the second part of test 4 passes. But taocp1’s author is also knuth, and we just determined that knuth’s book is currently taocp2, so the first part of test 4 fails.
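
The reasoning above can be checked with a short standalone program. This is a sketch under the assumption of cut-down Author and Book classes and plain == comparisons in place of checkExpect; CorruptionDemo is a hypothetical driver class.

```java
// Simplified stand-ins: only the fields this demonstration needs.
class Author {
  String last;
  Book book;

  Author(String last) {
    this.last = last;
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    this.book = b;
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author author) {
    this.title = title;
    this.author = author;
  }
}

class CorruptionDemo {
  public static void main(String[] args) {
    Author knuth = new Author("Knuth");
    Book taocp1 = new Book("volume 1", knuth);
    Book taocp2 = new Book("volume 2", knuth);

    knuth.updateBook(taocp1);
    knuth.updateBook(taocp2); // silently overwrites the reference to taocp1

    // Second half of test 4: taocp2 -> knuth -> taocp2
    System.out.println(taocp2.author.book == taocp2); // prints "true"
    // First half of test 4: taocp1 -> knuth -> taocp2, not taocp1
    System.out.println(taocp1.author.book == taocp1); // prints "false"
  }
}
```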

It would be nice if we could prevent this sort of data corruption bug from happening, perhaps by detecting whether an Author’s book field has already been initialized and, if so, throwing an error. Since we have a helper method that we use every time we want to modify an Author’s book field, we can simply change our implementation of that method:

// In Author
// EFFECT: modifies this Author's book field to refer to the given Book
void updateBook(Book b) {
  if (this.book != null) {
    throw new RuntimeException("trying to add second book to an author");
  }
  else {
    this.book = b;
  }
}

Now we can rewrite our test above to exercise this revised method:

// In ExampleBooks
boolean testBookAuthors(Tester t) {
  this.initTestConditions();
  // Test 1: check that knuth hasn't written any books yet
  boolean test1 = t.checkExpect(this.knuth.book, null);
  // Modify knuth to refer to volume 1
  this.knuth.updateBook(this.taocp1);
  // Test 2: check that knuth's book was written by knuth
  boolean test2 = t.checkExpect(this.knuth.book.author, this.knuth);
  // Try to modify knuth to refer to volume 2
  this.knuth.updateBook(this.taocp2); // Crashes with an exception
  return test1 && test2; // never reached, because the call above throws
}

That’s better: now instead of corrupting our data, our program signals an error. We just have to check for that exception, using the checkException method (which is much like the checkConstructorException method we used in Lecture 10):

// In ExampleBooks
boolean testBookAuthors(Tester t) {
  this.initTestConditions();
  // Test 1: check that knuth hasn't written any books yet
  boolean test1 = t.checkExpect(this.knuth.book, null);
  // Modify knuth to refer to volume 1
  this.knuth.updateBook(this.taocp1);
  // Test 2: check that knuth's book was written by knuth
  boolean test2 = t.checkExpect(this.knuth.book.author, this.knuth);
  // Test 3: check that modifying knuth to refer to volume 2 fails with exception
  boolean test3 = t.checkException(
      new RuntimeException("trying to add second book to an author"),
      this.knuth, "updateBook", this.taocp2);
  // Test 4: check that knuth has not been modified
  boolean test4 = t.checkExpect(this.knuth.book, this.taocp1);
  return test1 && test2 && test3 && test4;
}
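
What tests 3 and 4 are checking can also be seen with an ordinary try/catch. The sketch below is not how the tester library works internally; it assumes cut-down Author and Book classes, and GuardDemo is a hypothetical driver class.

```java
// Simplified stand-ins: only the fields this demonstration needs.
class Author {
  String last;
  Book book;

  Author(String last) {
    this.last = last;
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    if (this.book != null) {
      throw new RuntimeException("trying to add second book to an author");
    }
    else {
      this.book = b;
    }
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author author) {
    this.title = title;
    this.author = author;
  }
}

class GuardDemo {
  public static void main(String[] args) {
    Author knuth = new Author("Knuth");
    Book taocp1 = new Book("volume 1", knuth);
    Book taocp2 = new Book("volume 2", knuth);

    knuth.updateBook(taocp1);
    try {
      knuth.updateBook(taocp2);
      System.out.println("no exception");
    }
    catch (RuntimeException e) {
      // prints "trying to add second book to an author"
      System.out.println(e.getMessage());
    }
    // The failed update left knuth unchanged.
    System.out.println(knuth.book == taocp1); // prints "true"
  }
}
```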

18.4 Interlude: Using void for test methods

All along, our test methods have been returning booleans, but we’ve never actually been particularly interested in the actual return values — they just get passed back to the tester library. These last few examples have somewhat highlighted the awkwardness of it: since we have to return a boolean, we have to define all these rather contrived test# variables to hold the results of invoking checkExpect, so that we can return their result at the end... But the tester framework clearly is doing a lot more for us than looking at a simple boolean value: it records the line number of the tests that failed and the values of the arguments, it counts how many tests passed or failed, etc. (In other words, the tester library must be using side effects internally to keep track of all this extra information, since the one thing it doesn’t bother to keep track of is that pesky boolean result!)

Now that we have introduced void, we can clean this up substantially. The tester library will happily run any test methods we define that return void instead of boolean. So we can eliminate all those test# variables, and just invoke the checkExpect methods as statements, which run the tests for the sole purpose of recording their pass/failure as a side effect:

// In ExampleBooks
void testBookAuthors(Tester t) {
  this.initTestConditions();
  // Test 1: check that knuth hasn't written any books yet
  t.checkExpect(this.knuth.book, null);
  // Modify knuth to refer to volume 1
  this.knuth.updateBook(this.taocp1);
  // Test 2: check that knuth's book was written by knuth
  t.checkExpect(this.knuth.book.author, this.knuth);
  // Test 3: check that modifying knuth to refer to volume 2 fails with exception
  t.checkException(
      new RuntimeException("trying to add second book to an author"),
      this.knuth, "updateBook", this.taocp2);
  // Test 4: check that knuth has not been modified
  t.checkExpect(this.knuth.book, this.taocp1);
}
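
To make the idea of “recording pass/failure as a side effect” concrete, here is a toy sketch of how a tester could keep its own tallies. MiniTester and its checkSame method are hypothetical and are not the tester library’s API; in particular, checkSame compares references with ==, which is much cruder than what checkExpect does.

```java
// A hypothetical, drastically simplified tester: it remembers results
// in its own fields instead of relying on return values.
class MiniTester {
  int passed = 0;
  int failed = 0;

  // EFFECT: records one more passed or failed check
  void checkSame(Object actual, Object expected) {
    if (actual == expected) {
      this.passed = this.passed + 1;
    }
    else {
      this.failed = this.failed + 1;
    }
  }
}

class MiniTesterDemo {
  // A void "test method": it is run only for its effect on the tester.
  static void testObjects(MiniTester t) {
    Object x = new Object();
    t.checkSame(x, x);            // recorded as a pass
    t.checkSame(x, new Object()); // recorded as a failure
  }

  public static void main(String[] args) {
    MiniTester t = new MiniTester();
    testObjects(t);
    System.out.println(t.passed); // prints "1"
    System.out.println(t.failed); // prints "1"
  }
}
```

Since the tallies live inside the tester object, the test method has nothing useful to return, which is why a void return type fits it well.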

18.5 More error handling: Books written by the wrong Authors

Suppose in addition to Knuth’s books we also created an Author and a Book representing Shakespeare’s play The Comedy of Errors. We would like to prevent setting shakespeare’s book to taocp1 — it’s not his book! In other words, we’d like this test to succeed:

// In examples class
void testBookAuthors(Tester t) {
  this.initTestConditions();
  Author shakespeare = new Author("William", "Shakespeare", 1564, null);
  Book tcoe = new Book("The Comedy of Errors", 42, 1, shakespeare);
  // Test 1: check that neither knuth nor shakespeare have written any books yet
  t.checkExpect(this.knuth.book, null);
  t.checkExpect(shakespeare.book, null);
  // Test 2: check that setting shakespeare's book to taocp1 fails
  t.checkException(
      new RuntimeException("book was not written by this author"),
      shakespeare, "updateBook", this.taocp1);
}

Fortunately, we’ve already designed the updateBook helper method, so we know precisely where in our code to add the handling of this new error case:

Do Now!
Try to modify updateBook yourself to detect this mistake and throw the appropriate error.

// In Author
// EFFECT: modifies this Author's book field to refer to the given Book
void updateBook(Book b) {
  if (this.book != null) {
    throw new RuntimeException("trying to add second book to an author");
  }
  else if (!b.author.sameAuthor(this)) {
    throw new RuntimeException("book was not written by this author");
  }
  else {
    this.book = b;
  }
}

Once again, defining helper methods has unexpected benefits: because updateBook is the one and only place in our code where we actually do the mutation, we have only one place to fix if we decide to change how that mutation should be done.
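
Both error cases can be exercised in a small standalone program. This is a sketch under assumptions: the sameAuthor method shown here (comparing names and year of birth) is a hypothetical definition, since its body does not appear in this lecture, and WrongAuthorDemo is a hypothetical driver that uses try/catch in place of checkException.

```java
class Author {
  String first;
  String last;
  int yob;
  Book book;

  Author(String first, String last, int yob, Book book) {
    this.first = first;
    this.last = last;
    this.yob = yob;
    this.book = book;
  }

  // Hypothetical sameness check, assumed for this sketch only
  boolean sameAuthor(Author other) {
    return this.first.equals(other.first)
        && this.last.equals(other.last)
        && this.yob == other.yob;
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    if (this.book != null) {
      throw new RuntimeException("trying to add second book to an author");
    }
    else if (!b.author.sameAuthor(this)) {
      throw new RuntimeException("book was not written by this author");
    }
    else {
      this.book = b;
    }
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author author) {
    this.title = title;
    this.author = author;
  }
}

class WrongAuthorDemo {
  public static void main(String[] args) {
    Author knuth = new Author("Donald", "Knuth", 1938, null);
    Author shakespeare = new Author("William", "Shakespeare", 1564, null);
    Book taocp1 = new Book("The Art of Computer Programming (volume 1)", knuth);

    try {
      shakespeare.updateBook(taocp1);
    }
    catch (RuntimeException e) {
      System.out.println(e.getMessage()); // prints "book was not written by this author"
    }
    System.out.println(shakespeare.book == null); // prints "true"

    knuth.updateBook(taocp1); // the right author: succeeds
    System.out.println(knuth.book == taocp1); // prints "true"
  }
}
```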

18.6 Automatically creating the cycles

It’s somewhat silly for us to have to keep specifying null as an argument to the Author constructor, since it’s always going to be null. And it’s annoying and error-prone for us to construct a Book and then have to remember to immediately call updateBook on the relevant author. Both of these issues seem like design flaws in how we’ve written our constructors, and we can fix them easily enough.

Do Now!
Revise the constructor for Author so that it does not take a Book parameter, but still initializes the book field to null.

The constructor for Books does not need to change its signature: it should still take an Author as an argument. But it should fix up the author’s book field automatically for us: once we’ve created a Book, we’ll therefore know that the relevant Author is in a consistent state. Conveniently for us, we have a helper method to update the book of an Author...but what Book should we use?

class Book {
  String title;
  int price;
  int quantity;
  Author author;
  Book(String title, int price, int quantity, Author ath) {
    this.title = title;
    this.price = price;
    this.quantity = quantity;
    this.author = ath;
    // NEW! Fix up the author for us, using *this* newly-constructed Book
    this.author.updateBook(this);
  }
}

Do Now!
If we change our constructor to include this improvement, what of our earlier example code breaks?

Now we can’t even construct taocp2, because when the constructor tries to update knuth’s book via updateBook, we’ll get an exception.
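
Here is a minimal sketch of that failure. It assumes cut-down classes (an Author constructor that takes no Book, and an updateBook that keeps only the second-book check); AutoCycleDemo is a hypothetical driver class.

```java
class Author {
  String last;
  Book book;

  Author(String last) {
    this.last = last;
    this.book = null; // no book yet
  }

  // EFFECT: modifies this Author's book field to refer to the given Book
  void updateBook(Book b) {
    if (this.book != null) {
      throw new RuntimeException("trying to add second book to an author");
    }
    else {
      this.book = b;
    }
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author ath) {
    this.title = title;
    this.author = ath;
    // Fix up the author, using *this* newly-constructed Book
    this.author.updateBook(this);
  }
}

class AutoCycleDemo {
  public static void main(String[] args) {
    Author knuth = new Author("Knuth");
    Book taocp1 = new Book("volume 1", knuth);
    // The constructor created the cycle for us.
    System.out.println(knuth.book == taocp1); // prints "true"

    try {
      Book taocp2 = new Book("volume 2", knuth);
      System.out.println("constructed " + taocp2.title);
    }
    catch (RuntimeException e) {
      // prints "trying to add second book to an author"
      System.out.println(e.getMessage());
    }
  }
}
```

The exception escapes from the Book constructor itself, so the second Book is never handed back to the code that tried to create it.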

18.7 Indirect cycles

The solution, it seems, is to allow Authors to have multiple Books to their name. (We can also modify Books to be written by multiple Authors.)

Do Now!
Revise the Author class to have an IList<Book> field instead of merely a single Book field. In your constructor, what value should you use instead of null to initialize this new books field?

At birth, an Author has not yet written any Books. But we have a perfectly good value to represent a list of no Books: new MtList<Book>().

class Author {
  String first;
  String last;
  int yob;
  IList<Book> books;
  Author(String fst, String lst, int yob) {
    this.first = fst;
    this.last = lst;
    this.yob = yob;
    this.books = new MtList<Book>();
  }
}

Do Now!
Revise updateBook to be a new method, addBook, that adds the new book to the list of books.

The reason we changed the Book book field into an IList<Book> books field was to permit multiple books per author, so “adding a second book” should no longer throw an error, but we still have to check for books written by the wrong author. The assignment statement now becomes slightly more subtle:

// In Author
// EFFECT: adds the given Book to the front of this Author's list of books
void addBook(Book b) {
  if (!b.author.sameAuthor(this)) {
    throw new RuntimeException("book was not written by this author");
  }
  else {
    this.books = new ConsList<Book>(b, this.books);
  }
}

Do Now!
At first glance, it might look like we are creating a cycle in the book list itself: it seems like we’re creating a new Cons whose first is the given book, and whose rest is this.books, which is a Cons whose first is the given book, and whose rest is this.books, ... What’s wrong with this reasoning?

This assignment statement only makes sense if we remember the rules from last lecture about how assignment statements work: they evaluate the right hand sides completely, and only then do they modify their left hand sides. So this assignment statement creates a Cons object whose first is the given book, and whose rest is the current list of books by this author, and then sets this.books to this new Cons object. In other words, we’ve modified this.books to include the given book at the front of the list.
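
A small experiment makes this evaluation order visible. The sketch below assumes minimal IList, MtList and ConsList definitions with a length method, and uses a list of Strings held by a hypothetical Shelf class in place of an Author, so that the list behavior can be seen in isolation.

```java
interface IList<T> {
  // Computes the number of items in this list
  int length();
}

class MtList<T> implements IList<T> {
  public int length() {
    return 0;
  }
}

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;

  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }

  public int length() {
    return 1 + this.rest.length();
  }
}

// Hypothetical stand-in for Author, holding titles only
class Shelf {
  IList<String> books = new MtList<String>();

  // EFFECT: adds the given title to the front of this Shelf's books
  void addBook(String b) {
    // The right-hand side is built from the OLD value of this.books,
    // and only then is this.books changed to refer to the new list.
    this.books = new ConsList<String>(b, this.books);
  }
}

class PrependDemo {
  public static void main(String[] args) {
    Shelf shelf = new Shelf();
    shelf.addBook("volume 1");
    IList<String> before = shelf.books; // remember the one-item list
    shelf.addBook("volume 2");

    System.out.println(before.length());      // prints "1"
    System.out.println(shelf.books.length()); // prints "2"
    ConsList<String> front = (ConsList<String>) shelf.books;
    System.out.println(front.rest == before); // prints "true"
  }
}
```

If the list really contained itself, length would never terminate; instead the new list simply has the old list as its rest.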

We can now use addBook in the constructor of Book (instead of our now-defunct updateBook method), and now our initTestConditions method can successfully create taocp1 and taocp2.

Notice that this new data definition still creates cycles in our data, but it’s now indirect: after adding both taocp1 and taocp2 to knuth’s books field, we’ve actually created two cycles: taocp2’s author refers to knuth, whose books field refers to a ConsList<Book> object, whose first field refers to taocp2 again. Also, taocp1’s author field refers to knuth, whose books field refers to a ConsList<Book> object, whose rest field refers to a second ConsList<Book> object, whose first field refers back to taocp1.

Do Now!
Draw the object diagram that’s described above.
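
The two indirect cycles can also be followed reference by reference in code. This sketch makes several simplifying assumptions: the classes carry only the fields needed here, IList has no methods, and addBook checks the author with == rather than sameAuthor. IndirectCycleDemo is a hypothetical driver class.

```java
interface IList<T> { }

class MtList<T> implements IList<T> { }

class ConsList<T> implements IList<T> {
  T first;
  IList<T> rest;

  ConsList(T first, IList<T> rest) {
    this.first = first;
    this.rest = rest;
  }
}

class Author {
  String last;
  IList<Book> books;

  Author(String last) {
    this.last = last;
    this.books = new MtList<Book>();
  }

  // EFFECT: adds the given Book to the front of this Author's list of books
  void addBook(Book b) {
    if (b.author != this) { // simplified check, assumed for this sketch
      throw new RuntimeException("book was not written by this author");
    }
    else {
      this.books = new ConsList<Book>(b, this.books);
    }
  }
}

class Book {
  String title;
  Author author;

  Book(String title, Author ath) {
    this.title = title;
    this.author = ath;
    this.author.addBook(this);
  }
}

class IndirectCycleDemo {
  public static void main(String[] args) {
    Author knuth = new Author("Knuth");
    Book taocp1 = new Book("volume 1", knuth);
    Book taocp2 = new Book("volume 2", knuth);

    ConsList<Book> front = (ConsList<Book>) knuth.books;
    ConsList<Book> second = (ConsList<Book>) front.rest;

    // Cycle 1: taocp2 -> knuth -> first cons -> taocp2
    System.out.println(taocp2.author == knuth && front.first == taocp2); // prints "true"
    // Cycle 2: taocp1 -> knuth -> first cons -> second cons -> taocp1
    System.out.println(taocp1.author == knuth && second.first == taocp1); // prints "true"
  }
}
```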

18.8 Discussion

Using mutation well can be a subtle and error-prone process. We’ve seen how to define helper methods whose reason for existence is their side effects, i.e., they return nothing (a void return type) and mutate some object. Defining these methods then requires testing them, and testing methods with side effects requires test fixtures to provide consistent initial conditions for the test to be reliable. Finally, having our mutations abstracted away into these helper methods makes it far easier to revise our data definitions as our design requirements change.

Exercise
Design a data representation for your registrar’s office. Information about a course includes a department name (a String), a course number, an instructor, and an enrollment, which you should represent with a list of students. For a student, the registrar keeps track of the first and last name and the list of courses for which the student has enrolled. For an instructor, the registrar also keeps track of the first and last name as well as a list of currently assigned courses. Construct examples of at least three courses, at least two professors (one of whom teaches more than one course) and at least four students (at least one of whom is enrolled in more than one class).

In the next lecture, we’ll see some additional consequences of having mutation present as an operation in our language, and see that one concept we had thought we’d covered is in fact more subtle than we’d thought.
