
Well, it’s not actually a JDK 21 feature; it was added in JDK 19. With the recent release of JDK 21, there’s been a lot of emphasis on the changes between JDK 17 and JDK 21, because both are long-term maintained releases. I figure it’s ok to call this a JDK 21 feature.

Anyway, my favorite feature is that Javadoc search now has a URL.

So what? Javadoc has had searching since JDK 9. Here’s what searching looks like in JDK 17:

Screenshot of a JDK 17 javadoc search results in a drop-down menu showing a few matches for "LinkedList"

(Why am I searching for LinkedList? Shut up.)

Anyway, searching javadoc is great: you click in the Search box, type some text, a menu of hits pops up, you click on the item that you’re interested in, and it takes you there. What could be better than that?

Here’s what searching is like in JDK 21:

Screenshot of a JDK 21 javadoc search results in a drop-down menu showing a few matches for "LinkedList" with an arrow pointing at a new "Go to search page" link at the bottom.

It’s pretty much the same as before, but with this “Go to search page” link at the bottom. What does that do? It takes you to this page:

Screenshot of a JDK 21 javadoc search results page as a regular web page instead of a popup menu. Notably, this page has its own URL. This image is a link to the actual page shown in the screenshot.

All right, it’s the same search results in a regular browser window instead of a menu-like popup. What’s the big deal?

The big deal is that this page has a URL. And URLs are the API for the web. Once something has a URL, you can program it. This URL has a query string with a simple syntax: after the ?q= you can put whatever string you want javadoc to search for. Opening this URL searches for the string and immediately displays the results.

This is a very big deal.

This means you can write a program that generates the URL for whatever you want to search for and then you can ask a browser to open this URL. For example, on a Mac, you can provide a URL to the open command and it will ask the system default browser to open that URL. I have a little shell script named jd that does this:

open "https://docs.oracle.com/en/java/javase/21/docs/api/search.html?q=$*"

This opens javadoc search on whatever I supply as command-line arguments. Why $*? Sometimes it’s useful to search for multiple words. Try this for example:

jd value based
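The shell script passes the arguments through unencoded, which works for simple queries. For something more robust, here's a small Java sketch of the same idea (the class name `JdUrl` is made up) that URL-encodes the query before appending it to the base search URL:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class JdUrl {
    static final String BASE =
        "https://docs.oracle.com/en/java/javase/21/docs/api/search.html?q=";

    // Build the javadoc search URL, encoding spaces and special characters.
    static String searchUrl(String query) {
        return BASE + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Join the command-line arguments into one query, like $* in the shell.
        System.out.println(searchUrl(String.join(" ", args)));
    }
}
```

You could pipe this to `open` (or `xdg-open` on Linux) to get the same effect as the shell script, with proper encoding.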

In Firefox (and maybe other browsers too, I’m not sure) it’s possible to set up a bookmark so that you can type a keyword into the address bar and have it expand into the URL from the bookmark. If the URL has a %s in it, it will be replaced with whatever text you typed after the keyword in the address bar.

I created a bookmark whose URL is

https://docs.oracle.com/en/java/javase/21/docs/api/search.html?q=%s

It’s probably easiest to bookmark some javadoc page and then edit the bookmark and change the URL. While you’re doing that, go to the “Keyword” field of the Edit Bookmark dialog and add the keyword of your choice:

Firefox "Edit Bookmark" dialog box with "javadoc" filled into the keyword box.

Now you can go to the address bar in Firefox and type “javadoc LinkedList” and it takes you directly to the javadoc search results page. Neat! (Again, these are the instructions for Firefox. Other browsers may have similar features, but I don’t know how to use them.)

Since the URL is just a piece of text, you can easily write a script that constructs it from a variety of sources and sends it along. A lot of this is pretty system-specific. For example, on most systems, there’s a way to get the text from the clipboard. Once you’ve gotten the clipboard text, you can append it to the base search URL and open this URL in a browser. On a Mac, you can create a simple Automator workflow to add “Javadoc Search” to your Services menu to do the same thing.

Thanks to Hannes Wallnoefer and the javadoc team for developing this feature.

Have fun with JDK 21!

I’m watching the latest JEP Café video from my colleague José Paumard, where he talks about the Comparator interface. One of the things you can do with a Comparator is to use it to sort a list:

list.sort(comparator);

If a class implements the Comparable interface, that means instances of that class know how to compare themselves. (The “comparison of themselves” is referred to as the natural order.) If the list contains elements that are Comparable, you don’t need to pass a Comparator argument to the List.sort() method. Instead, you pass null:

list.sort(null);

and the list will be sorted according to the elements’ natural order. José quite reasonably observes that it’s somewhat unpleasant to have to pass “null” there, and he suggests that it would be cleaner to have a no-arg overload that sorts the list in natural order.

Yeah, we should add that!

After all, there are two overloads of the Stream.sorted() method: a no-arg overload that sorts by natural order, and an overload that takes a Comparator argument. We should clearly do the same for List.sort(). Maybe its omission was just an oversight. On the other hand, it’s such an obvious thing; maybe the omission was deliberate.

What about compatibility?

The default methods feature was added in Java 8. This allowed the addition of new methods to interfaces. Prior to Java 8, adding a method to an interface was an incompatible change. With default methods, it’s possible to add a new method in a compatible way. However, incompatibility is still possible when adding a default method. For example, consider the List.sort(Comparator) method again. Its return type is void. Suppose there is a Java 7 application that has this List implementation:

class MyList<E> implements List<E> {
    public MyList<E> sort(Comparator<? super E> comparator) {
        // sort the list
        return this;
    }
    // ... other List methods ...
}

If this class were recompiled on Java 8, an error would occur, because Java doesn’t allow overrides with a differing return type.

(Java does allow covariant return types in an overriding method, where the return type of the override is a subtype of the return type of the overridden method. That doesn’t apply here, because MyList<E> isn’t a subtype of void.)
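For contrast, here's a minimal sketch of a legal covariant override (the classes are hypothetical):

```java
class Animal {
    Animal self() { return this; }
}

class Dog extends Animal {
    // Legal covariant override: the return type Dog is a subtype of
    // Animal, the return type of the overridden method.
    @Override
    Dog self() { return this; }
}

public class CovariantDemo {
    public static void main(String[] args) {
        Animal a = new Dog();
        System.out.println(a.self().getClass().getSimpleName()); // Dog
    }
}
```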

Adding a default method to an interface can be compatible, but it might be incompatible if there is a conflict with a method in an existing class. It’s therefore necessary to be quite careful when adding default methods to interfaces. Extra care is needed if the interface is widely implemented (like List), if the method name is short and common (like sort), and if it has few or no arguments. Given all three of those factors, it seems likely that adding a no-arg List.sort() default method would conflict with a method in some existing implementation class. Maybe adding this default method isn’t such a good idea after all.

This line of reasoning seems valuable. Maybe I should write it down somewhere!

This issue is fairly subtle, and it’s worth writing down so that somebody in the future doesn’t make a mistake. A blog post (like this one) is one way to preserve this information. However, this blog isn’t connected to the OpenJDK project, so somebody working on the JDK wouldn’t know to search here. Someplace closer to the JDK would be preferable.

Another place to store this information is the JDK Bug System (JBS), which is the bug database for the JDK. It contains a lot of history, including bug reports converted from the old Sun bug database dating back to the pre-JDK-1.0 era in the 1990s. It seems likely that information in JBS will persist longer than this blog. Since JBS is associated with the JDK project, it’s also more likely that somebody working on the JDK will find it. Plus, JBS is a database, with nice categories and querying capabilities, making it easy to find information.

How should this kind of information be recorded in a bug database? I could file a request to add this API, close it out, and put the rationale for not implementing it into the comments. Filing a request and closing it immediately might seem excessively fussy. Once it’s in the database, though, it would be easy for future maintainers to rediscover the request if a similar issue were to arise.

Before filing a new issue, it’s always good practice to search the database to see if something similar exists already. Indeed, upon searching, I found this:

JDK-8078076 Create Overload List#sort() for Natural Ordering Case

Huh. That seems like it covers exactly the same issue. It was submitted in April 2015 by “Webbug Group”, which is the JBS username that’s used when a bug is received from an anonymous person on the internet. The bug’s status is Closed, and its resolution value is Won’t Fix. Who did that and why? Looking through the bug report, the last comment (also from April 2015) is this one:

This was considered and rejected during JDK 8 development. We were fairly minimal with the addition of default methods. There have already been incompatibilities with the List.sort(Comparator) method; adding a no-arg List.sort() method would likely cause additional incompatibilities while adding very little value. Closing as Won’t Fix.

This comment was written by … me! Wow, I had completely forgotten about this. Not surprisingly, it turns out that the bug database has a better memory than I do. I just went through this line of reasoning and reached a conclusion. Then I found the same line of reasoning and the same conclusion that I had written down nearly eight years earlier. Fortunately, present me agrees with past me.

Suppose that I had watched José’s video and immediately decided to implement the new default method. Every change to the JDK requires a JBS entry, so I would have started by searching for an existing issue or filing a new one. It seems likely I would have run across the 2015 issue at that time. (There are only 14 collections bugs in the database that have both “list” and “sort” in the title.) Even if I had missed it, one of the reviewers of the change probably would have noticed the 2015 bug and called my attention to it. Either way, it’s clear that writing down the reasoning in 2015 is valuable to a future maintainer in 2023, whether that maintainer is me or somebody else. And it seems likely that having this issue in the database, along with other similar issues, will be of value to future maintainers.

Anyway, sorry about that José, that’s why we won’t be adding a no-arg List.sort() overload.

Now that JEP 421 (Deprecate Finalization for Removal) has been delivered in JDK 18, it seems like more people are talking about finalization and how to migrate to alternatives such as Cleaner. I had an interesting Twitter conversation about this with Heinz Kabutz the other day:

The code from SunGraphics2D that Heinz pointed out is this:

@SuppressWarnings("removal")
public void finalize() {
    // DO NOT REMOVE THIS METHOD
}

Why did somebody bother to write an empty finalize() method, and why is it so important that there is a comment warning not to remove it?

The answer is that an empty finalizer disables finalization for all instances of that class and for all instances of subclasses (unless overridden by a subclass). Depending on the usage of that class, this can be a significant optimization.

To understand this, let’s recap the Java object life cycle.

A Java object without a finalizer is created, is used for a while, and eventually becomes unreachable. Some time later, the garbage collector notices that the object is unreachable and garbage collects it.

An object with a finalizer is created, is used for a while, and eventually becomes unreachable. Some time later, the object’s finalize() method is run. This is regular Java code, so the object is now actually reachable. Some additional time later, the object becomes unreachable again, and this time, the garbage collector collects the object. Thus, objects with finalizers live longer than objects without finalizers, and the garbage collector needs to do more work to garbage collect them. Using a lot of objects with finalizers increases memory pressure and potentially increases the memory requirements of the system.

Why would you need to disable finalization for some objects?

Let’s look at the case that Heinz pointed out. Instances of java.awt.Graphics (actually, its subclasses) keep a pointer to native resources used by that object. The dispose() method frees those native resources. It also has a finalizer that calls dispose() as a “safety net” in case the program didn’t call dispose(). Note that when a Graphics object becomes unreachable, it’s kept around in order for it to be finalized, even if the program had already called dispose().

The SunGraphics2D subclass is a “lightweight” object that never has any associated native resources. If it were to inherit the finalizer from Graphics, instances would need to be kept around longer in order to run the finalizers, which would call dispose(), which would do nothing. To prevent this, SunGraphics2D provides an empty finalize() method. An empty method has no visible side effects; therefore, it’s pointless for the JVM to extend the lifetime of an object in order to run an empty finalize() method. Instead, the JVM garbage collects such objects as soon as it can determine they are unreachable, skipping their finalization step.

Let’s see this in action. It’s pretty easy to tell when an object is finalized by putting a print statement into its finalizer. But how can we tell whether an object with an empty finalizer was actually finalized or whether it was garbage collected immediately? This is fairly simple to do, using a new JFR event added in JDK 18.

Here’s a program with a small class hierarchy. Class A has a finalizer; B inherits it; C overrides with an empty finalizer; D inherits the empty finalizer; and E overrides with a non-empty finalizer. (I’ve made them static classes nested inside a top-level class EmptyFinalizer so they’re all in one file, but otherwise this doesn’t affect finalization. See the full program.)

    static class A {
        protected void finalize() {
            System.out.println(this + " was finalized");
        }
    }

    static class B extends A {
    }

    static class C extends B {
        protected void finalize() { }
    }

    static class D extends C {
    }

    static class E extends D {
        protected void finalize() {
            System.out.println(this + " was finalized");
        }
    }

The main program creates a bunch of instances but doesn’t keep references to them. It calls System.gc() a few times and sleeps to let the garbage collector run. The output is something like the following:

$ java EmptyFinalizer
EmptyFinalizer$E@cd4e940 was finalized
EmptyFinalizer$B@8eb6c02 was finalized
EmptyFinalizer$A@4de9e37b was finalized
EmptyFinalizer$E@57db5523 was finalized
EmptyFinalizer$B@7cee2871 was finalized
EmptyFinalizer$A@2f36c092 was finalized
EmptyFinalizer$E@2dc61c34 was finalized
EmptyFinalizer$B@203936e2 was finalized
EmptyFinalizer$A@2d193f34 was finalized
EmptyFinalizer$E@34324855 was finalized
EmptyFinalizer$B@2988c55b was finalized
EmptyFinalizer$A@40ef68ae was finalized
EmptyFinalizer$E@246b0f18 was finalized
EmptyFinalizer$B@23d8b20 was finalized
EmptyFinalizer$A@6df02421 was finalized

We can see that instances of A, B, and E were finalized, but C and D were not. Well, we can’t really tell, can we? Their empty finalizers might have been called. Starting in JDK 18, we can use JFR to determine whether these objects were finalized. First, enable JFR during the run:

$ java -XX:StartFlightRecording:filename=recording.jfr EmptyFinalizer
[0.365s][info][jfr,startup] Started recording 1. No limit specified, using maxsize=250MB as default.
[0.365s][info][jfr,startup] 
[0.365s][info][jfr,startup] Use jcmd 56793 JFR.dump name=1 to copy recording data to file.
EmptyFinalizer$A@cd4e940 was finalized
EmptyFinalizer$E@8eb6c02 was finalized
EmptyFinalizer$B@4de9e37b was finalized
EmptyFinalizer$A@57db5523 was finalized
EmptyFinalizer$E@7cee2871 was finalized
EmptyFinalizer$B@2f36c092 was finalized
EmptyFinalizer$A@2dc61c34 was finalized
EmptyFinalizer$E@203936e2 was finalized
EmptyFinalizer$B@2d193f34 was finalized
EmptyFinalizer$E@34324855 was finalized
EmptyFinalizer$B@2988c55b was finalized
EmptyFinalizer$A@40ef68ae was finalized
EmptyFinalizer$E@246b0f18 was finalized
EmptyFinalizer$B@23d8b20 was finalized
EmptyFinalizer$A@6df02421 was finalized

Now we have a file recording.jfr with a bunch of events. Next, we print this file in a readable form with the following command:

$ jfr print --events FinalizerStatistics recording.jfr
jdk.FinalizerStatistics {
  startTime = 16:43:37.379 (2022-04-27)
  finalizableClass = EmptyFinalizer$A (classLoader = app)
  codeSource = "file:///private/tmp/"
  objects = 0
  totalFinalizersRun = 5
}

jdk.FinalizerStatistics {
  startTime = 16:43:37.379 (2022-04-27)
  finalizableClass = EmptyFinalizer$B (classLoader = app)
  codeSource = "file:///private/tmp/"
  objects = 0
  totalFinalizersRun = 5
}

jdk.FinalizerStatistics {
  startTime = 16:43:37.379 (2022-04-27)
  finalizableClass = jdk.jfr.internal.RepositoryChunk (classLoader = bootstrap)
  codeSource = N/A
  objects = 1
  totalFinalizersRun = 0
}

jdk.FinalizerStatistics {
  startTime = 16:43:37.379 (2022-04-27)
  finalizableClass = EmptyFinalizer$E (classLoader = app)
  codeSource = "file:///private/tmp/"
  objects = 0
  totalFinalizersRun = 5
}

We can easily see that classes A, B, and E each had five instances finalized, with zero instances remaining on the heap. Classes C and D aren’t listed, so no finalization was performed for them. Also, it looks like the JFR internal class RepositoryChunk uses a finalizer, and there was one live instance, and none were finalized. (We’ll have to get the JFR team to convert this class to use Cleaner instead!)

JEP 421 has deprecated finalization for removal. Eventually it will be disabled and removed from the JDK. If your system uses finalizers — or, perhaps more crucially, if you don’t know whether your system uses finalizers — use JFR to help find out. See the JDK Flight Recorder documentation for more information about JFR.
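As a rough sketch of the Cleaner alternative (the class names here are made up), the usual pattern registers a cleanup action whose state must not reference the owning object:

```java
import java.lang.ref.Cleaner;

public class NativeResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup state is a static nested class so it cannot capture a
    // reference back to the NativeResource, which would prevent collection.
    static class State implements Runnable {
        volatile boolean freed = false;
        @Override
        public void run() {
            freed = true; // free the native resource here
        }
    }

    final State state = new State();
    private final Cleaner.Cleanable cleanable;

    public NativeResource() {
        this.cleanable = CLEANER.register(this, state);
    }

    @Override
    public void close() {
        // Explicit disposal. The Cleaner acts only as a safety net,
        // running the action if the object becomes unreachable without
        // close() having been called.
        cleanable.clean();
    }

    public static void main(String[] args) {
        try (NativeResource r = new NativeResource()) {
            System.out.println("using resource");
        }
    }
}
```

Unlike a finalizer, the cleanup action never resurrects the object, so the garbage collector can reclaim it without the extra finalization cycle described above.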

(Updated with suggestions from Kim Barrett. Thanks, Kim!)

A new default method CharSequence.isEmpty() was added in the just-released JDK 15. This broke the Eclipse Collections project. Fortunately, the EC developers were testing the JDK 15 early access builds. They noticed the incompatibility, and they were able to ship a patch release (Eclipse Collections 10.4.0) before JDK 15 shipped. They also reported this to the OpenJDK Quality Outreach program. As a result, we were able to document this change in a release note for JDK 15.

Kudos to Nikhil Nanivadekar and Don Raab and the Eclipse Collections team for getting on top of this issue!

What’s the story here? Aren’t new JDK releases supposed to be compatible? In general, yes, we try really hard to keep everything compatible. But sometimes incompatibilities are unavoidable, and sometimes we just miss stuff. To understand what happened, we need to discuss two distinct concepts: source incompatibility and binary incompatibility.

A source incompatible change is one where a source file compiles just fine on an earlier JDK release but fails to compile on a more recent JDK release. A binary incompatible change is one where a compiled class file runs fine on an earlier JDK release but fails at runtime on a more recent JDK release.

In development of the JDK, we put in quite a bit of effort to avoid binary incompatible changes, since it’s unreasonable to force people to recompile everything, and potentially maintain different artifacts, for different JDK releases. Ideally, we’d like to enable people to provide a single binary artifact (e.g., a jar file) that runs on all of the JDK releases that their project supports.

We are somewhat more tolerant of source incompatible changes. If you’re recompiling something, then presumably you have access to the source code in order to make a few minor adjustments. We’re willing to make minor source incompatible changes to the JDK if the change provides enough value to justify the incompatibility.

It turns out that adding a default method to an interface is potentially both a source and binary incompatible change. I was a bit surprised by this. What’s going on?

Let’s first set aside default methods on interfaces and look just at adding methods to classes. Making changes to a class potentially affects subclasses. In most cases, adding a method to a class is a binary compatible change, even if the subclass has methods that are apparently in conflict with the new method in the superclass. For example, consider this class compiled on JDK 8:

class MyInputStream extends InputStream {
    public String readAllBytes() { ... }
    ...
}

This works fine. However, a method was added to InputStream on JDK 9:

public byte[] readAllBytes()

Now there is a conflict between InputStream and MyInputStream, since they have methods with the same name, the same parameters (none), but different return types. Despite this conflict, this is a binary compatible change. Any already-compiled classes that invoke the readAllBytes() method on an instance of MyInputStream will do so using this bytecode:

invokevirtual #6 // Method MyInputStream.readAllBytes:()Ljava/lang/String;

(I determined this by compiling a program that uses MyInputStream on JDK 8, and then running the javap -c command on the resulting class file.) Roughly, this says “invoke the method named «readAllBytes» that takes no arguments and returns a String.” That method exists on MyInputStream and not on InputStream, so the method invocation works even on JDK 9.

However, this is a source incompatible change. When I try to recompile MyInputStream.java on JDK 9, the result is this:

MyInputStream.java:13: error: readAllBytes() in MyInputStream cannot override readAllBytes() in InputStream
    public String readAllBytes() {
    ^
  return type String is not compatible with byte[]

The compatibility analysis of adding methods to classes is fairly straightforward. There is only one path from the current class up the superclass chain to the root class, java.lang.Object. Any conflicts among methods can only occur on this path.

Analysis of adding default methods to interfaces is more complicated, because a class or interface can inherit from multiple interfaces. This means that, looking upward from the current class, instead of there being a linear chain of superclasses up to Object, there is a branching tree (actually a DAG) of interface inheritance. This gives rise to several inheritance possibilities that cannot occur with class-only inheritance.

Also, since default methods are a relatively recent feature, the Java community has relatively less experience evolving APIs using default methods. Default methods were added in Java 8, which was released in 2014, so we have “only” six years of experience with them.

It was possible to have conflicts among interfaces, even before Java 8, for example, if two unrelated interfaces declared the same method but with different return types. Prior to Java 8, though, interfaces were essentially impossible to evolve, and so conflicts arising from interface evolution hardly ever occurred. Finally, in the pre-Java 8 world, interface methods were all abstract. If a class inherited the “same” method (same name, parameters, and return type) from different interfaces, that was OK, as both could be satisfied by a single implementation provided by the class or one of its superclasses.
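A minimal example of this pre-Java-8 situation (the interface and class names are made up):

```java
// Two unrelated interfaces that happen to declare the "same" method.
interface Named {
    String name();
}

interface Labeled {
    String name();
}

// Both abstract methods are satisfied by the single implementation
// in the class, so there is no conflict.
public class Thing implements Named, Labeled {
    public String name() { return "thing"; }

    public static void main(String[] args) {
        System.out.println(new Thing().name());
    }
}
```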

With the addition of default methods in Java 8, a new problem arose: what if a default method were added to an interface somewhere, such that conflicts between method implementations might arise somewhere in the superclass and superinterface graph? More specifically, what if the superinterface graph contains two default implementations for the same method? The full rules are described in the Java Language Specification, sections 8.4.8 and 8.4.8.4, and there are lots of edge cases, but briefly, the rules are as follows:

  • Methods inherited from the class hierarchy take precedence over default methods inherited from interfaces.
  • Default methods in interfaces are allowed to override each other; the most specific override takes precedence.
  • If multiple default methods are inherited from unrelated interfaces (that is, one doesn’t override the others), that’s a compile-time error.

Here are some examples of these rules in action:

class S {
    public void foo() { ... }
}

interface I {
    default void foo() { ... }
}

interface J extends I {
    default void foo() { ... }
}

interface K {
    default void foo() { ... }
}

Given this class and these interfaces, how do the inheritance rules work?

class C extends S implements I { }
// ok: class wins, S::foo inherited

class D implements I, J { }
// ok: overriding default method wins, J::foo inherited

class E implements I, K { }
// ERROR: types I and K are incompatible;
// class E inherits unrelated defaults for foo() from types I and K

So now we have to think harder about the compatibility impact of adding a default method. If a class already has the method, we’re OK. If there’s another interface that has a default method that overrides or is overridden by the default method we’re adding, that’s OK too. A problem can only occur if there is another default method somewhere in the interface graph inherited by some class.

That’s what’s going on with source compatibility. If you run through the examples above, you can see the kind of compilation error that might arise. What about binary compatibility? It turns out that the rules for binary compatibility with default methods are actually quite similar to those for source compatibility.

Here’s what the Java Virtual Machine Specification says about how invokevirtual finds the method to call. It first talks about method selection:

A method is selected with respect to [the class] and the resolved method (§5.4.6).

Section 5.4.6 says:

The maximally-specific superinterface methods of [the receiver class] are determined (§5.4.3.3). If exactly one matches [the method]’s name and descriptor and is not abstract, then it is the selected method.

OK, what if there isn’t exactly one match? In particular, what if there are multiple matches? Back in the specification of invokevirtual, it says:

If no method is selected, and there are multiple maximally-specific superinterface methods of [the class] that match the resolved method’s name and descriptor and are not abstract, invokevirtual throws an IncompatibleClassChangeError.

Thus, the JVM has to do quite a bit of analysis at runtime. When a method is invoked on some class, it must not only search up the class hierarchy for that method; it must also search the graph of interface inheritance to see whether a default method might have been inherited, and verify that there is exactly one such method. As a result, adding a default method to an interface can easily cause problems for existing, compiled classes — a binary incompatibility.

We always examine the JDK for incompatibilities and avoid them if possible. In addition, we look at popular non-JDK libraries to see if problems might occur with them. This kind of incompatibility can occur only if a non-JDK library has a signature-compatible default method in an interface that is unrelated to the JDK interface being modified. It also requires that there be some class that inherits both that interface and the JDK interface. That seems pretty rare, but it can happen.

In fact, this is exactly the case that came up in Eclipse Collections! The Eclipse Collections library has an interface PrimitiveIterable that declares a default method isEmpty(), and it also has a class CharAdapter that implements PrimitiveIterable and CharSequence:

interface PrimitiveIterable {
    default boolean isEmpty() { ... }
}

class CharAdapter implements PrimitiveIterable, CharSequence {
    ...
}

This works perfectly fine in JDK 14 and earlier releases. Consider some code that calls CharAdapter.isEmpty(). The bytecode generated would be as follows:

invokevirtual #13 // Method org/eclipse/collections/impl/string/immutable/CharAdapter.isEmpty:()Z

This works on JDK 14, because invokevirtual searches all the superclasses and superinterfaces of CharAdapter, and it finds exactly one default method: the one in PrimitiveIterable.

On JDK 15, the situation is different. A new default method isEmpty() was added to CharSequence. Thus, when the same invokevirtual bytecode is executed, it searches the superclasses and superinterfaces of CharAdapter, but this time it finds two matching default methods: the one in PrimitiveIterable and the one in CharSequence. That’s an error according to the JVM Specification, and that’s exactly what happens:

java.lang.IncompatibleClassChangeError: Conflicting default methods: org/eclipse/collections/api/PrimitiveIterable.isEmpty java/lang/CharSequence.isEmpty

What’s to be done about this? Fortunately, the fix is pretty simple: just add an implementation of isEmpty() to the CharAdapter class. (A couple other classes, CodePointAdapter and CodePointList, are in a similar situation and were also fixed.) In this case the implementations of isEmpty() are so simple that the code this.length == 0 was just inlined. If for some reason it were necessary to have CharAdapter inherit the implementation from PrimitiveIterable, then the implementation in CharAdapter could have been written like this:

@Override
public boolean isEmpty()
{
    return PrimitiveIterable.super.isEmpty();
}

As mentioned above, this fix was delivered in Eclipse Collections 10.4.0, which was delivered in time for JDK 15. Again, thanks to the EC team for their quick work on this.

OK, that’s how the JVM behaves. Why does the JVM behave this way? That is, why does it throw an exception (really, an Error) if it detects multiple default methods among the superinterfaces? Couldn’t it, for example, remember what method was called on JDK 14 (the one on PrimitiveIterable), and then continue to call that method even on JDK 15?

The explanation requires understanding of some background about virtual methods. Consider a simple class hierarchy in a library:

class A {
}

class B extends A {
    void m() { }
}

class C extends B {
}

Suppose further that an application has this code:

void exampleCode(B b) {
    b.m();
}

What method is called? Clearly, this will invoke B::m. Now suppose that the library is modified as follows:

class A {
    void m() { } // method "promoted" from B
}

class B extends A {
}

class C extends B {
    void m() { } // a new overriding method
}

and the application is run again. Even though the code is invoking method m on B, we don’t know which method will actually be invoked. If the variable b is an instance of B, then A::m will be invoked. But if variable b is an instance of C, then C::m will be invoked.

The method that actually gets invoked depends on the class of the receiver object and the class hierarchy that has been loaded into this JVM. There is nothing written down anywhere that says that the application used to call B::m. In fact it would be a mistake for something to be written down that causes B::m to continue to be invoked. When an overriding method is added to class C, calls that used to end up at B::m should now be calling C::m. That’s what we want virtual method calls to do.

It’s similar with superinterfaces (though more complicated of course). The JVM needs to do a search at runtime to determine what method to call. If it finds two default methods, such as PrimitiveIterable::isEmpty and CharSequence::isEmpty, there is no information to tell the JVM that the code used to call PrimitiveIterable::isEmpty and that the CharSequence::isEmpty method was added in the most recent release. All the JVM knows is that it’s been asked to invoke a method, it found two, and it has no further information about which to call. Therefore, the only thing it can do is throw an error.

Finally, could this problem have been avoided in the first place? The JDK team had done some analysis to determine whether adding CharSequence.isEmpty() would cause any incompatibilities. The analysis probably looked for no-arg methods with the same name but with a different return type. It might have looked for a method named isEmpty() with a non-public access level, another cause of incompatibilities. But these are both source incompatibilities. Or maybe the analysis missed Eclipse Collections entirely.

One thing that future analyses ought to look for is interfaces with a matching default method, since that runs the risk of binary incompatibility. Such an analysis would have turned up PrimitiveIterable. By itself this isn’t a problem, but it would cause a problem for any class that implements both interfaces. It turns out that CharAdapter (and related classes) do implement both, so that’s clearly a binary incompatibility.

Even if CharAdapter and friends didn’t exist (and even now after they’ve been fixed) there is still a possibility that further incompatibilities exist. Consider some application class that happens to implement both PrimitiveIterable and CharSequence. That class might work perfectly fine with Eclipse Collections 10.3.0 and JDK 14. But it will fail with JDK 15. The problem will persist even if the application upgrades to Eclipse Collections 10.4.0, since the incompatibility is with the application class, not with CharAdapter and friends. So, that application will have to be fixed, too.

Now that we’ve described the problem and the possibility of incompatibilities, does it mean that it was a mistake to have added CharSequence.isEmpty()? Not necessarily. Even if we had noticed the incompatibility in Eclipse Collections prior to the addition of the isEmpty() default method, we might have gone ahead with it anyway. The criterion isn’t to avoid incompatibility at all costs. Instead, it’s whether the value of adding the new default method outweighs the cost and risk of incompatibility. That said, it would have been better to have noticed the incompatibility earlier and discussed it before proceeding, instead of putting an external project like Eclipse Collections into the position of having to fix something in response to a change in the JDK.

In summary, adding a default method to an interface can result in source and binary incompatibilities. The possibility of the source incompatibility is perhaps obvious, but the binary incompatibility is quite subtle. Both of these have been possibilities since Java 8 was delivered in 2014. But to my knowledge this is the first time that the addition of a default method has resulted in a binary incompatibility with a real project (as opposed to a theoretical exercise or a toy program). It behooves us to do a more rigorous search for potentially conflicting methods the next time we decide to add a default method to an interface in the JDK.

I was saddened to hear news of Bill Shannon’s recent passing. He joined Sun very early, as employee number 11, soon after Sun’s founding in 1982. As far as I know, he was the earliest Sun employee remaining at Oracle. He was an engineering leader already by the time I joined Sun in 1986. I had the privilege of working with him — and sometimes against him — on several occasions.

Back in the day, people at Sun would refer to each other by their Unix logins. (I was “smarks”, and to some extent, I still am.) To this day I think of Bill simply as “shannon”. The other day I tweeted a few memorable quotes from shannon. Each of them is backed by a funny story, which I’ll relate here. If you ever heard Bill speak, please imagine these spoken in his imperious baritone.

Sometime in the 1990s, Sun’s internal network was organized into domains that corresponded to the overall functional area in which one worked. The engineering groups were under “eng.sun.com”, the corporate management was under “corp.sun.com”, and so forth. We had email addresses that were tied to the domain name, so I was smarks@eng.sun.com. At some point it was decided that everything would be reorganized into geographic domains. I worked in the San Francisco Bay Area region, so the old domains would be replaced with the sfbay.sun.com domain. An announcement went out that described this change, and it said something like,

Please inform all of your contacts that your new email address will be login@sfbay.sun.com instead of the old login@eng.sun.com. The eng.sun.com email addresses will stop working in 90 days.

I thought, this is ridiculous. I’ve handed out countless business cards that have my eng.sun.com email address on them, and I can’t track down everybody I’ve ever given a business card to. I’ve written that email address on papers that have been published in conference proceedings, and those can’t be changed. I can’t be the only one with this problem, either. But, I thought naïvely, it should be pretty simple to set up an MX record (a DNS mail exchanger record) to handle email sent to addresses in the old eng.sun.com domain. I filed a ticket to request that, but it was summarily closed by the network administrators with some explanation like, “Mail forwarding is not possible.” Oh well, I guess I don’t know anything about running a corporate network with thousands of nodes, and I let it drop.

A couple days later, shannon sent mail to all of engineering, describing exactly the same problem I was concerned about. I replied to him, saying that I had requested an MX record be published, but the ticket had been closed. He said, “Yes, that’s what should be done. I’ll talk to the network administrators about it.”

A couple days later, he followed up with this:

    You're right, these people are idiots.

A project that shannon and I worked on together was a large joint development project with another company (which I won’t name, but whose initials are H.P.). Well, OtherCompany had a penchant for coming up with incredibly complex, fragile designs that tried to solve problems that didn’t really need solving.

In desktop systems, it’s pretty common to have a portion of a window that lets users edit text. This is usually implemented by a “text editor” widget provided by the window toolkit library, but created and managed by the application. Apparently this was unsatisfactory for OtherCompany, so they wanted to have a single, “daemon” process that managed all of the text editor widgets for every application on the desktop. At Sun we all thought this was a terrible idea, but OtherCompany wouldn’t let go of it.

At one point there was a conference call where shannon and others at Sun held a review of this design with OtherCompany. It went something like this.


shannon: Now let me get this straight. Instead of each
application owning its own text widgets, all the text
editing functions are centralized into a single process?

OtherCompany: Yes.

shannon: And instead of each application process handling
keyboard events for its text widgets, those events will be
handled by this centralized daemon process?

OtherCompany: Yes.

shannon: So all the text data that the user has entered will
be in this daemon process, not in the application?

OtherCompany: Yes.

shannon: And if this other process crashes, what happens
to that data?

OtherCompany: (discussion) All the text data is lost.

shannon: And if this daemon process hangs, then what will
happen to the applications on the desktop?

OtherCompany: (discussion) They will all hang.

shannon: ...

OtherCompany: ...

shannon: Do you see anything wrong with this architecture?

Bill made a big impression on me early on, well before I actually met him. I joined Sun in 1986, as an impressionable young engineer fresh out of school. Fairly early on I heard about some guy “shannon” who was a bigwig in the Systems group. I was in a separate group, the Windows group, so we didn’t interact.

Some time soon after I joined, shannon sent an email to all of engineering, with a policy statement. (This was before I started to save email compulsively, otherwise I would have dug up the original.) As I recall, it went something like this:


This is a statement on the Systems Group's policy for code
that is checked into SunOS. The policy is:

    * All code must conform to the Sun C Style Guide

Non-conforming code that is posted for review will be
rejected until it does conform.

Non-conforming code that is checked into the source base
will be backed out and will not be permitted to be checked
in until it does conform.

If you do not understand this policy, I will come to your
office and explain it until you do.

This only applied to SunOS code, not Windows code, so it didn’t affect my day-to-day work. But as a young engineer I found it to be hair-raising! The lesson I took from this was, you do not want to cross shannon.

It’s a lesson that served me well over the years. 🙂

Like Bill, I stayed on at Sun all the way up until the 2010 acquisition by Oracle, and we stayed at Oracle until the present day. We didn’t work together too closely in recent years, though we both worked on Java – he worked on Java EE, and I worked on Java ME and Java SE. We were even in the same building on Sun’s (later Oracle’s) Santa Clara campus for several years. It’s amazing that he’s been around nearby for literally my entire career. It’s a huge loss that he’s gone. Bye shannon, we’ll miss you.

Here are some links to other pages about Bill.

The other day on Twitter I said, “Scanner is a weird beast. I wouldn’t necessarily use it as a good example for anything.” The context was a discussion about classes that are both an Iterator and AutoCloseable. As it happens, Scanner is such an example. It’s an Iterator, because it allows iteration over a sequence of tokens, and it’s also AutoCloseable, because it might have an external resource (like a file) contained within it. I wouldn’t hold it up as an example of good object design, though. This article explains why.

Scanner has a pretty complicated API, but once you figure out how to use it, it’s incredibly useful. Its main issue is that it’s trying to do too many things at once. The good news is that you can use parts of the API for stylized uses and mostly ignore other parts of the API.

At its core, Scanner is about regex pattern matching. Unlike the Pattern and Matcher classes, which can only match on a fixed input such as a String, Scanner allows you to match over arbitrary input that might not even exist in memory. There are several Scanner constructors that allow input to be read from various sources such as files, InputStreams, or channels. Scanner handles buffering: it reads additional input as necessary and discards any input that was skipped over during matching. This is really cool. It means you can do matching over arbitrarily sized input data using just a few KB of memory.

(Naturally this depends on the patterns used for matching as well as the well-formedness of input. For example, you can attempt to read a file line by line, and this will work for an arbitrarily sized file if it’s broken up into reasonably sized lines. If the file doesn’t have any line separators, Scanner will bring the whole file into memory, as the file conceptually contains one long line.)
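A minimal sketch of line-by-line processing, using a String source for illustration (a File or InputStream source works the same way, and Scanner buffers only as much input as the current match requires):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineDemo {
    // Reads all remaining lines from the scanner. With a file-backed
    // Scanner, memory use stays bounded as long as individual lines
    // are reasonably sized.
    static List<String> readLines(Scanner sc) {
        List<String> lines = new ArrayList<>();
        while (sc.hasNextLine()) {
            lines.add(sc.nextLine());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines(new Scanner("first\nsecond\nthird")));
        // [first, second, third]
    }
}
```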

Scanner has two fundamental modes of matching. The first mode is to break the input into tokens that are separated by delimiters. The delimiters are defined by the regex pattern you provide. (This is rather like the String.split method.) The second mode is to find chunks of text that result from matching the regex pattern you provide. In other words, the token mode provides the text between matches, and the find mode provides the text of the matches themselves. What’s odd about the Scanner API is that there are groups of methods that apply in one mode but not the other.

The methods that apply to the tokens mode are:

  • delimiter
  • locale
  • hasNext* (excluding hasNextLine)
  • next* (excluding nextLine)
  • radix
  • tokens
  • useDelimiter
  • useLocale
  • useRadix

The methods that apply to the find mode are:

  • findAll
  • findInLine
  • findWithinHorizon
  • hasNextLine
  • nextLine
  • skip

(Additional Scanner methods apply to both modes.)

Here’s an example of using Scanner for matching tokens:

    String story = """
        "When I use a word," Humpty Dumpty said,
        in rather a scornful tone, "it means just what I
        choose it to mean - neither more nor less."
        "The question is," said Alice, "whether you
        can make words mean so many different things."
        "The question is," said Humpty Dumpty,
        "which is to be master - that's all."
        """;

    List<String> words = new Scanner(story)
        .useDelimiter("[- \\.\n\",]+")
        .tokens()
        .collect(toList());

(Note, this example uses the new Text Blocks feature, which was previewed in JDK 13 and 14 and which is scheduled to be final in JDK 15.)

Here, we set the delimiter pattern to match whitespace and various punctuation marks, so the tokens consist of text between the delimiters. The results are:

    [When, I, use, a, word, Humpty, Dumpty, said, in, rather, a, scornful,
    tone, it, means, just, what, I, choose, it, to, mean, neither, more,
    nor, less, The, question, is, said, Alice, whether, you, can, make,
    words, mean, so, many, different, things, The, question, is, said,
    Humpty, Dumpty, which, is, to, be, master, that's, all]

In this example I used the tokens() method to provide a stream of tokens. Scanner implements Iterator<String>, which allows you to iterate over the tokens that were found, using the typical hasNext/next methods. Unfortunately, Scanner does not implement Iterable, which would allow you to use it within a for-loop.
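If you do want for-each syntax, a one-line lambda adapter works, since Iterable’s single abstract method returns an Iterator. (This sketch is valid for a single pass only; the adapter always hands back the same Iterator rather than a fresh one.)

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class IterableScanner {
    static List<String> collect(Scanner sc) {
        // Scanner is an Iterator<String>, so this lambda satisfies Iterable.
        Iterable<String> tokens = () -> sc;
        List<String> result = new ArrayList<>();
        for (String t : tokens) {
            result.add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collect(new Scanner("one two three")));
        // [one, two, three]
    }
}
```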

Scanner also provides pairs of hasNext/next methods for converting tokens to data. For example, it provides hasNextInt and nextInt methods that search for the next token and convert it to an int (if available). Corresponding pairs of methods are also available for BigInteger, boolean, byte, double, float, long, and short. These pairs of methods are “iterator-like” in that the hasNextX/nextX method pairs are just like the hasNext/next method pair of an Iterator, with the addition of data conversion. But there’s no way to wrap them in an Iterator, like Iterator<BigInteger> or Iterator<Double>, without writing your own adapter code. This is unfortunate, since Scanner is an Iterator<String> but its Iterator is only over tokens, not the value-added iterator-like constructs that include data conversions.
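The adapter code you’d have to write yourself might look like this sketch, which wraps the hasNextInt/nextInt pair into a standard Iterator<Integer> (the class name and structure are my own, not part of any library):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Adapts Scanner's iterator-like hasNextInt/nextInt method pair
// into an actual Iterator<Integer>.
public class IntTokenIterator implements Iterator<Integer> {
    private final Scanner scanner;

    public IntTokenIterator(Scanner scanner) {
        this.scanner = scanner;
    }

    @Override
    public boolean hasNext() {
        return scanner.hasNextInt();
    }

    @Override
    public Integer next() {
        if (!scanner.hasNextInt()) {
            throw new NoSuchElementException();
        }
        return scanner.nextInt();
    }

    public static void main(String[] args) {
        Iterator<Integer> it = new IntTokenIterator(new Scanner("10 20 30"));
        while (it.hasNext()) {
            System.out.println(it.next());  // 10, then 20, then 30
        }
    }
}
```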

The other main mode of Scanner is the find mode, which provides a succession of matches from a pattern you provide. Here’s an example of that:

    List<String> words = new Scanner(story)
        .findAll("[A-Za-z']+")
        .map(MatchResult::group)
        .collect(toList());

Here, instead of matching delimiters between tokens, I’ve provided a pattern that matches the results I want to get. Note that findAll() returns Stream<MatchResult>, which must be converted to strings; that’s what the MatchResult::group method reference does. The resulting list is exactly the same list of words as in the previous example. Personally, I find this mode more useful than the tokens mode. You’re providing the pattern for the text you’re interested in, as opposed to a pattern for the delimiters between the text you’re interested in. Also, you get back MatchResult objects, which are useful for extracting substrings of what you matched. This isn’t available in tokens mode.
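For example, MatchResult gives you access to capture groups, which tokens mode can’t provide at all. A small sketch (the pattern and input are illustrative):

```java
import java.util.List;
import java.util.Scanner;
import static java.util.stream.Collectors.toList;

public class FindAllDemo {
    // Extracts key/value pairs using the two capture groups of each match.
    static List<String> pairs(String input) {
        return new Scanner(input)
            .findAll("([a-z]+)=([0-9]+)")
            .map(mr -> mr.group(1) + "->" + mr.group(2))
            .collect(toList());
    }

    public static void main(String[] args) {
        System.out.println(pairs("alpha=1, beta=22, gamma=333"));
        // [alpha->1, beta->22, gamma->333]
    }
}
```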

I started off this article saying that Scanner is weird but useful. It’s weird because it has these two distinct modes. It has groups of methods that apply to one mode but not the other. If you look at the API carefully (or at the implementation) you’ll see that there is also a bunch of internal state that applies to one mode but not the other. It seems like Scanner should have been split into two classes. Another weird thing about Scanner is that it’s an Iterator<String>, which elevates one part of one of the modes to the top level of the API and relegates the other parts to second-class status.

That said, Scanner provides some very useful services. It does I/O and buffering for you, and if regex matching needs more input, it handles that automatically. I’m also partial to the streams-returning methods like findAll() and tokens() — I have to admit, I added them — but they make bulk processing of arbitrary input quite easy. I hope you find these aspects of Scanner useful as well.

And what we might do about it

Brian Goetz and I gave this presentation in Antwerp, Belgium on November 7, 2019. The original title of this talk (as posted on the conference program) was “Why We Hate Java Serialization And What We’re Doing About It.” We made a slight adjustment to the title just before the presentation.

I initially proposed this talk to Brian because I felt we needed to correct the record about Java serialization. It’s very easy to criticize Java serialization in retrospect. We hear a lot of comments like “Just get rid of it!” but in fact serialization was introduced because it solves — and continues to solve — a very important problem. Like many complex systems, it has flaws, not because its designers were stupid, but because of typical software project difficulties: disagreements over the fundamental goals, being designed and implemented in a hurry, and a healthy dose of corporate politics.

We wanted to document very precisely where we think Java serialization’s flaws are: at the binding to the object model. In addition, Brian and the Java team had been thinking a lot about what the future of serialization would be, and we wanted to present that as well. Those ideas are described in more detail on Towards Better Serialization (June, 2019).

(This is a backdated article, posted on April 2, 2021. I’ve represented to the best of my ability my perspective at the time the presentation was given in November, 2019.)

Oracle Code One 2019

Here’s a quick summary of Oracle Code One 2019, which was last week.

It essentially started the previous week at the “Chinascaria”, Steve Chin‘s Community BBQ for JUG leaders and friends. Although Steve is now at JFrog, he’s continuing the BBQ tradition. Of course Bruno Souza, Edson Yanaga, and some other cohorts from Brazil were manning the BBQ, and there was plenty of meat to be had. I didn’t get many photos, but Ruslan from JUG.RU was there and he insisted that we take a selfie:

Hi Ruslan! Oh, here’s a tweet with the chefs from the BBQ:

Java Keynote

The conference kicked off with the Java keynote, The Future of Java is Now, led by Georges Saab. The pace was pretty brisk, with several walk-on guests. We heard Jessica Pointing talk about quantum computing, and Aimee Lucido talk about her new book, Emmy in the Key of Code. This sounds really cool, a book written in Java-code-like verse. This should be interesting to my ten-year-old daughter, since she’s reading the Girls Who Code series right now. I have to say this is the first time I’ve shown a segment of a conference keynote to my family!

Naturally a good section of the keynote covered technical issues. Mikael Vidstedt and Brian Goetz ably covered the evolution of the JVM and the Java programming language. Notably, Mark Reinhold did not appear; he’s taking a break from conferences to refocus on hard technical problems.

My Sessions

This year, I had two technical sessions and a lab. This was a pretty good workload, compared with previous years where I had half a dozen sessions. I felt like I made a good contribution to the audience, but it left time for me to have conversations with colleagues (the “hallway track”) and to attend other sessions I was interested in.

My sessions were:

Collections Corner Cases (slides, video, ccc.jshell)

This session covered Map’s view collections (keySet, values, entrySet) and topics regarding comparators being “inconsistent with equals.”

Local Variable Type Inference: Friend or Foe? (slides, video)

(with Simon Ritter)

When Simon and I did an earlier version of this talk at another conference, we called it “Threat or Menace.” This probably doesn’t translate too well; to me, it has a 1950s red scare connotation, which is distinctly American. I think that’s why Simon changed it to Friend or Foe. It turns out that Venkat Subramaniam also had a talk on the same subject, entitled “Type Inference: Friend or Foe”!

Lambda, Streams, and Collectors Programming Laboratory (lab repository)

(with Maurice Naftalin and José Paumard)

This lab continues to evolve; there are now over 100 exercises. Thanks to Maurice and José for continuing to maintain and develop the lab materials. I recalled that we first did a Lambda Lab at Devoxx UK in 2013, which was before Java 8 was released. Maurice and Richard Warburton and I got together an hour beforehand and came up with about half a dozen exercises. It was a bit ad hoc, but we managed to keep a dozen or so people busy for an hour and a half.

More recently we (mostly José) have added and reorganized the exercises, converted the project to maven, and converted the test assertions to AssertJ. I’ve finally come around to the idea that maven is the way to go. However, the lab attendees still had their fair share of configuration problems. I think the main problem is the mismatch between maven and the IDE. It’s possible to build the project on the command line using maven, but hitting the “Test” button in the IDE does some magic that doesn’t necessarily invoke maven, so it might or might not work.

Meet the Experts

One thing that was new this year was the “Meet the Experts” sessions. In the past we’d be asked to sign up for “booth duty” which consisted of standing around for a couple hours waiting for people to ask questions. This was mostly a waste of time, since we didn’t have flashy demos. Instead, we scheduled informal, half-hour time slots at a station in the Groundbreakers Hub, and these were put onto the conference program. The result was that people showed up! I signed up for two of these. I didn’t have a formal presentation; I just answered people’s questions. This seemed considerably more useful than past “booth duty.” People had good questions, and I had some good conversations.

Everything You Ever Wanted To Know About Java And Didn’t Know Whom To Ask (video)

I hadn’t signed up for this session, but the day before the session, Bruno Souza corralled me (and several others) into participating in this. Essentially it’s an impromptu “ask me anything” panel. He convinced about 15 people be on the panel. This included various JUG leaders, conference speakers, and experts in various areas. During the first part of the session, Bruno gathered questions from the audience and a colleague typed them into a document that was projected on the screen. Then he called the panelists up on stage. The rest of the session was the panel picking questions and answering them. I thought this turned out quite well. People got their questions answered, we covered quite a variety of topics, and it provoked some interesting discussions.

Other Sessions of Interest

I attended a few other sessions that were quite useful. I also watched on video some of the sessions that I had missed. Here they are, in no particular order:

Robert Seacord, Serialization Vulnerabilities (video)

Mike Duigou, Exceptions 2020 (slide download available)

Sergey Kuksenko, Does Java Need Value Types? Performance Perspective (video)

Brian Goetz, Java Language Futures, 2019 Edition (video)

Venkat Subramaniam, Type Inference: Friend or Foe? (video)

Robert Scholte, Broken Build Tools and Bad Behaviors (slide download available)

Nikhil Nanivadekar, Do It Yourself: Collections

Here’s the playlist of Code One sessions that were recorded.

Unfortunately, not all of the sessions were recorded. Some of the speakers’ slide decks are available for download via the conference catalog.

It was recently announced that Jakarta EE will not be allowed to evolve APIs in the javax.* namespace. (See Mike Milinkovich’s announcement and his followup Twitter thread.) Shortly thereafter, David Blevins posted a proposal and call for discussion about how Jakarta EE should transition its APIs into the new jakarta.* namespace. There seem to be two general approaches to the transition: a “big bang” (do it all at once) approach and an incremental approach. I don’t have much to add to the discussion about how this transition should take place, except to say that I’m pleasantly surprised at the amount of energy and focus that has emerged in the Jakarta EE community around this effort.

I’m a Java SE guy, so the details of Java EE and Jakarta EE specifications are pretty much outside my bailiwick. However, as Dr Deprecator, I should point out that there is one area of overlap: the dependence of Java EE / Jakarta EE APIs on deprecated Java SE APIs. One example in particular that I’m aware of was brought to my attention by my colleague Sean Mullan, who is tech lead of the Java SE Security Libraries group.

The Java SE API in question is java.security.Identity, which was deprecated in JDK 1.2 (released 1998) and deprecated for removal in Java 9. This API has been deprecated for a very long time, and we’d like to remove it from Java SE. For most purposes, it can be replaced by java.security.Principal, which was added in JDK 1.1 (released 1997).

The EJB specification uses the Identity type in a couple methods of the EJBContext class. If we were to remove Identity from some release of Java SE, it would mean that EJB — and any Java EE, Jakarta EE, or any other framework that includes EJB — would no longer be compatible with that release of Java SE. We’ve thus held off removing this type for the time being, in order to avoid pulling the rug out from underneath the EE specs.

Identity is used in only two methods of the EJBContext class. It appears that these methods were deprecated in EJB 1.2, and replacements that use Principal were introduced at that time. Since J2EE 1.2 was introduced in 1999, things have been this way for about 20 years. I think it’s time to do some cleanup! (See EJB-spec issue #130.)

For better or for worse, these methods still appear in Java EE 8. As I understand things, the next specification release will be Jakarta EE 9, which will be the earliest opportunity to change the EE specification to remove the dependency on the deprecated SE APIs.

The usual argument against removing stuff is that it’s both source and binary incompatible. If something falls over because of a missing API, it’s pretty hard to work around. This is the reason that deprecated stuff has stayed around for so many years. On the other hand, if these deprecated APIs aren’t removed now, when will they be removed?

I’d argue that the upcoming package renaming (whether incremental or big bang) is an opportunity to remove obsolete APIs, because such renaming is inherently both source and binary incompatible. People will have to run migration tools and change their code when they transition it from Java EE 8 to Jakarta EE 9. There can be no expectation that old jar files will run unchanged in the new Jakarta world. Thus, the package renaming is an opportunity to shed these obsolete APIs.

I’m not aware of any EE APIs other than EJBContext that depend on Java SE APIs that are deprecated for removal. I did a quick check of GlassFish 5 using the jdeprscan tool, and this one was the only API-to-API dependency that I found. However, I’m not an expert in EE and GlassFish, so I’m not sure I checked the right set of jars. (I did find a bunch of other stuff, though. Contact me if you’re interested in details.)

I had a brief Twitter exchange with David Blevins on this topic the other day. He pointed me at the parts of the TomEE implementation that implements EJBContext, and it turns out that the two methods in question simply throw UnsupportedOperationException. This is good news, in that it means TomEE applications aren’t using these methods, which means that those applications won’t break if these methods are removed.

However, that doesn’t mean these methods can simply be removed from EE implementations! The TCKs have what is called a “signature test,” which scans the libraries for the public classes, fields, and methods, to make sure that all the APIs required by the specifications are present and that there are no extra APIs. I’m fairly sure that the EE TCK signature test contains entries for those methods. Thus, what needs to happen is that the Jakarta EE specification needs to remove these methods, the EE TCK needs to be updated to match, and then implementations can remove — in fact, will be required to remove — these methods when they’re brought into conformance with the new specification.

Note that all of this is separate from the question of what to do with other deprecated Jakarta EE APIs that don’t depend on deprecated Java SE APIs. Deprecated Jakarta EE APIs might have been deprecated for their own reasons, not because of their dependency on SE APIs. These should be considered on their own merits and an appropriate removal plan developed. Naturally, as Dr Deprecator, I like removing old, obsolete APIs. But the deprecation and potential removal plan for deprecated Jakarta EE APIs needs to be developed with the particular evolution path of those APIs in mind.

This is a very belated post that covers a session that took place at the JavaOne conference in San Francisco, October 2017.

Here’s a recap of the BOF (“birds-of-a-feather”) session I led on software maintenance. The title was Maintenance – The Silent Killer. This was my feeble attempt at clickbait. This was an evening session that was held during the dinner hour, and maintenance isn’t the most scintillating topic, so I figured attendance needed all the help I could give it.

When the start time arrived, I was standing on the podium in an empty room. I thought, well, if nobody shows up then I can go home early. Then about fifty people flooded in! It turns out they had lined up outside waiting for their badges to be scanned, but then a conference staffer came by and told them that badges weren’t scanned for the evening sessions and that they should just go in.

Overall I thought it went quite well. I gave a brief presentation, and then set up some discussion questions for the audience. The people who showed up really were interested in maintenance, they offered a variety of interesting insights and views, and they were quite serious about the topic. There was enough discussion to fill the allotted time, and there was plenty of interaction between me and the audience and among audience members themselves. I’ll declare the session to have been successful, though it’s difficult for me to draw any grand conclusions from it. I was heartened by the amount of participation. I was really concerned that nobody would show up, or perhaps that three people would show up, since most tech conferences are about the latest and greatest new shiny thing.

The session wasn’t recorded. What follows is some notes on my slide presentation, followed by some additional notes from the discussion that followed. These are unfortunately rather sparse, as I was participating at the same time. However, I did capture a few ideas that I hadn’t considered previously, which I found quite beneficial.

Slide Presentation (PDF)

Slide 2: Golden Gate Bridge. I grew up in Marin County, which is connected to San Francisco by the Golden Gate Bridge. We crossed the bridge frequently. Back in 1974 or so the toll was raised from 50¢ to 75¢, and my parents complained incessantly about this. At one point I had the following conversation with my Dad about the toll:

Me: Why do they collect tolls?
Dad: To pay off the bridge.
Me: When will the bridge be paid off?
Dad: Never!

As a kid I was kind of perplexed by this. If you take out a loan, and make regular payments on it, won’t it eventually be paid off? (Sub-prime mortgages weren’t invented until much later.) Of course, the original construction loans have long since been paid off. What the tolls are used for, and which indeed will never be paid off, is the continuous maintenance that the bridge requires.

Slide 3: This is me driving my car through Tunnel Log in Sequoia National Park. The point isn’t about a tunnel through a tree, but the cost of owning and operating a car. The first time I used my car for business expenses, I was surprised by the per-mile reimbursement amount. If you consider the 2017 numbers, this car’s gasoline costs about 14¢-20¢ per mile, and the IRS standard reimbursement rate is 53.5¢ per mile. Hey, I’m making money on this deal!

No. This is a 1998 BMW, and you will not be surprised to learn that the cost of maintenance on this car is quite significant. Indeed, I’ve added up the maintenance costs over the lifetime of the car, and they outweigh the cost of gasoline. Counting maintenance and depreciation, I’m decidedly not making money on mileage reimbursement.

Slide 4 has some points on maintenance as a general phenomenon. One point that bears further explanation is my claim that “deferred maintenance costs can grow superlinearly.” Continuing with the car example, consider oil changes. It might cost a couple hundred dollars a year for regular oil changes. You could save money for a couple years by not changing the oil. This might eventually result in a several thousand dollar engine rebuild. “Superlinear” isn’t very precise, but the point is that the cost of remediating problems caused by deferred maintenance is often much greater than the sum of incremental maintenance costs.
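The oil-change example can be put into rough numbers. Here's a minimal sketch of the comparison; the specific figures (a couple hundred dollars a year for oil changes, a several-thousand-dollar rebuild, three years of deferral) are assumptions for illustration, not real maintenance data:

```java
public class DeferredMaintenance {
    public static void main(String[] args) {
        double annualOilChanges = 200.0;  // assumed cost of regular oil changes per year
        double engineRebuild = 4000.0;    // assumed cost of remediating the deferred maintenance
        int yearsDeferred = 3;            // assumed number of years of skipped maintenance

        // Cost if you simply pay the incremental maintenance as you go
        double incremental = annualOilChanges * yearsDeferred;

        System.out.printf("Paying as you go for %d years: $%.0f%n", yearsDeferred, incremental);
        System.out.printf("Deferring, then rebuilding the engine: $%.0f%n", engineRebuild);

        // The remediation cost far exceeds the sum of the skipped incremental costs.
        System.out.printf("Deferral cost multiplier: %.1fx%n", engineRebuild / incremental);
    }
}
```

Under these assumed numbers, deferring saves $600 up front but costs $4,000 later, well beyond the savings; that gap is the point behind “superlinear.”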

Slide 5, quotation from Kurt Vonnegut. Perhaps profound if you’ve never heard it before, but a cliché if you pay attention to maintenance. It does seem to be true that in general creative activities get all the attention at the expense of maintenance activities.

Slides 6-7. Physical systems exhibit wear and friction, and this contributes to the need for regular maintenance. Software doesn’t “wear out.” But there are a bunch of phenomena that cause software systems to require maintenance. Primarily these seem to be related to the environment in which the software exists, not the software itself.

Slides 8-9. Most planning and costing efforts around software projects are concerned with software construction. Maintenance is a significant cost, accounting for perhaps 50% to 75% (Boehm) or 40% to 80% (Glass) of the total life cycle costs. However, comparatively little planning and budgeting effort goes toward maintenance.

Glass points out that software maintenance and construction are essentially the same activity, except that maintenance requires additional effort to “understand the existing product.” As a programmer, when you’re developing software, you know what you’re trying to do and you’re familiar with the code you’re developing at the moment. When maintaining software, you often have to deal with code that you might never have seen before and figure out what it does before you can modify it successfully. The cost incurred in re-acquiring knowledge and understanding of an existing software system is significant.

Slide 10. OpenJDK is an open source implementation of Java. It’s an old code base; Java 1.0 was released in 1996, and it was in development for a couple years prior to that. It’s been continually evolved and maintained since then. Evolution consists of usual software activities such as adding features, improving performance, fixing bugs, mitigating security vulnerabilities, and maintaining old releases. Maintenance activities are a large portion of the team’s activities. I’m not sure how to measure it, but the estimates from Boehm and Glass above are quite plausible.

In addition to the above development activities the team also puts effort into deprecation and removal of obsolete features. This is important because, among other things, it helps to reduce the long-term maintenance burden. See some of my prior materials on the topic of deprecation:

The cost of knowledge re-acquisition mentioned previously is somewhat mitigated by systems maintained by the JDK group that preserve history.

The open version of the JDK source code is kept in the Mercurial version control system, and it includes changesets dating back to December 2007. The earlier source code history is in a closed, Oracle-internal system and dates back to August 1994.

The JDK Bug System (a JIRA instance) contains over 265,000 bugs and feature requests dating back to late 1994. Many of these bugs were converted from a Sun Microsystems internal bug database.

Personally, I’ve found that the ability to search over 20 years of source code history and bug history to be of immense value in understanding existing code and diagnosing problems.

Slide 11. A big driver of software maintenance is security vulnerabilities. This has gotten worse in recent years, as “everything” is connected to the internet. Another significant contributor to maintenance issues is the large number of dependencies among software components, many of which are open source. By reusing external software components, you can reduce development time. However, doing so means taking on the maintenance burden of those components: either you have to keep all the external components up to date, or you have to maintain them yourself.

Slide 12. Questions and Audience Discussion

The slide has several questions to spark discussion with the audience. We didn’t address them directly, but there was a relatively free-flowing conversation. Here are some notes from that conversation.

One audience member compared maintenance to a fence. Suppose you have a pasture, and wolves keep coming to it and attacking your sheep. So you put up a fence. The fence just sits there. The sheep graze peacefully. Wolves stay away because they realize they can’t get past the fence. Nothing happens. The fact that nothing is happening is a huge benefit! Like a fence, a well-maintained system just does its thing without calling attention to itself. This may lead people to forget about it. A poorly maintained system is constantly breaking, attracting lots of attention.

An attendee suggested thinking about maintenance planning the same way a project manager thinks about risk management. With less maintenance there is a greater risk of failure, and vice-versa.

Another attendee suggested insurance as a model for maintenance. Maintenance costs are like insurance premiums: you pay them regularly, and you’re protected. Not paying them saves money temporarily, until some disaster strikes. (Rather like my car oil change example above.) Of course, insurance is closely related to risk management, and as a social institution it seems poorly understood by most laypeople.

An audience member suggested just biting the bullet and declaring that maintenance is just a cost of doing business. There’s no use complaining about it; you just have to accept it. Another audience member said that his department allocated 10% of its budget to maintenance costs.

Regarding keeping up with software updates, one attendee pointed out that it’s not necessarily important to be on the latest software release; instead, it’s important to be on the latest patch or update level, even if you’re on an old release. Many commercial software products have support contracts under which the vendor will maintain old releases for many years. Those releases don’t have the most features or the highest performance, but they are maintained with fixes for current security vulnerabilities and other high-priority problems.

(This is a big component of the business of my company, Oracle. This is also true of products from many other software companies.)