Friday, July 15, 2005

Application Versioning

Tom Ball explains how class loading can be used to achieve simple application versioning:

Versioning, which I'm defining for this entry as how a Java application manages its external library dependencies, has been a tough issue ever since Java was first released. Back when Java was born, the vision was that each machine would have a single Java runtime and standard libraries which would always be fully backwards-compatible. The reality has been that for most apps, the only reasonable alternative to testing a full matrix of released JREs and libraries is to instead package everything the app needs, install the whole hairball on each customer's system and use a custom classpath to access it. The problem with a custom classpath is that it is easy for your customers to break in subtle (and not so subtle) ways, which makes them cranky and can drive your tech support engineers crazy. Some work has been done in the JDK via its Package Versioning Specification and API, but there are still times when your app really needs to keep specific libraries under tight control.

NetBeans has this problem with its javac bridge, which allows its editor and refactoring modules access to javac's error checking and parsing support. The problem is that javac doesn't have a public API, so while a tool can leverage a specific version of javac, it cannot rely on whatever is on the customer's machine since its internal API may be radically different. We have a recent version of javac that works with our bridge, but just adding it to the NetBeans classpath won't work for two reasons:
  • NetBeans supports many different JDKs, each of which has its own version of javac; and
  • Mac OS X includes the javac classes in its bootclasspath (and supportable products shouldn't whack the bootclasspath if possible).
The solution proved fairly trivial to implement while being quite robust: define a ClassLoader to isolate the bridge's use of javac classes from the rest of the IDE. Here is the ClassLoader implementation we use; NetBeans has several similar loaders for specific modules, but Tomas Hurka wrote this Factory class for the javac bridge:

    private static class GJASTClassLoader extends URLClassLoader {
        private final PermissionCollection permissions = new Permissions();

        public GJASTClassLoader(URL gjastJar) {
            super(new URL[] {gjastJar}, Factory.class.getClassLoader());
            permissions.add(new AllPermission());
        }

        protected Class loadClass(String n, boolean r) throws ClassNotFoundException {
            if (n.startsWith("com.sun.tools.javac") || n.startsWith("org.netbeans.lib.gjast")) {
                // Do not proxy to parent!
                Class c = findLoadedClass(n);
                if (c != null) return c;
                c = findClass(n);
                if (r) resolveClass(c);
                return c;
            } else {
                return super.loadClass(n, r);
            }
        }

        protected PermissionCollection getPermissions(CodeSource codesource) {
            return permissions;
        }
    }
As you can see, we rely on URLClassLoader to do all the heavy lifting. Our version isolation support is in loadClass(), where a test is made of the requested class name to see if it is in one of the packages to be isolated (here, we test whether the class is a javac or bridge class). If it is an isolated class, URLClassLoader.findLoadedClass() and findClass() look it up in the jar file we supplied in the constructor; otherwise we let URLClassLoader.loadClass() delegate to the parent classloader.

Now, we need to interact with classes loaded by this classloader. What works best for us is to define a simple interface which the versioned classes and their client code share, and a factory class that uses reflection to load the class which implements that interface (in 1.0, you needed a default constructor and used Class.newInstance()). Here's a simplified example (from the same Factory class):

    public interface ErrorChecker {
        int parse() throws CompilerException;
    }

    public final class Factory {
        private static Factory instance = null;
        private static Constructor newErrorChecker;

        public static synchronized Factory getDefault() {
            if (instance == null) {
                Class[] newCheckerTypes = new Class[] {
                    ECRequestDesc.class
                };

                File gjastJar = InstalledFileLocator.getDefault().locate("modules/ext/gjast.jar", "org.netbeans.modules.javacore", false);
                try {
                    // Load the bridge implementation through the isolating loader.
                    ClassLoader loader = new GJASTClassLoader(gjastJar.toURI().toURL());
                    Class c = Class.forName("org.netbeans.lib.gjast.ASErrorChecker", true, loader);
                    newErrorChecker = c.getConstructor(newCheckerTypes);
                } catch (Exception e) {
                    // Fail here rather than swallow the error and hit a null constructor later.
                    throw new IllegalStateException("Cannot load javac bridge: " + e);
                }
                instance = new Factory();
            }
            return instance;
        }

        public ErrorChecker getErrorChecker(ECRequestDesc desc) {
            try {
                return (ErrorChecker) newErrorChecker.newInstance(new Object[] { desc });
            } catch (Exception e) {
                Throwable t = e.getCause();
                // Parenthesized: + binds tighter than ?:, so the message was being dropped.
                throw new RuntimeException("Cannot create errorChecker: " +
                    (t != null ? t : e));
            }
        }
    }
In the above, we fetch the ASErrorChecker constructor via reflection, then use it whenever the client requests a new ErrorChecker implementation. Because the interface doesn't directly or indirectly reference any class types in our private javac copy (CompilerException is also shared), objects created using its classes can interact with the client without conflict.
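
The same round trip can be tried without NetBeans. What follows is only a minimal sketch under some loud assumptions: every name in it (SharedInterfaceDemo, Greeter, GreeterImpl, PrefixIsolatingLoader) is hypothetical, the loader re-reads class bytes from the classpath instead of from a separate jar so that the example stays self-contained, and it needs Java 9 or later for InputStream.readAllBytes(). It shows the implementation class being defined by an isolating loader while the interface stays shared with the parent, so the cast to the interface works.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Constructor;

class SharedInterfaceDemo {
    /** Shared contract: loaded once, by the parent loader, and visible to both sides. */
    public interface Greeter {
        String greet();
    }

    /** Hypothetical stand-in for the versioned implementation class. */
    public static class GreeterImpl implements Greeter {
        private final String name;
        public GreeterImpl(String name) { this.name = name; }
        public String greet() { return "hello, " + name; }
    }

    /** Defines classes whose names start with a prefix itself; delegates everything else. */
    static class PrefixIsolatingLoader extends ClassLoader {
        private final String prefix;

        PrefixIsolatingLoader(ClassLoader parent, String prefix) {
            super(parent);
            this.prefix = prefix;
        }

        @Override
        protected Class<?> loadClass(String n, boolean r) throws ClassNotFoundException {
            if (!n.startsWith(prefix)) return super.loadClass(n, r);
            synchronized (getClassLoadingLock(n)) {
                // Isolated class: never ask the parent.
                Class<?> c = findLoadedClass(n);
                if (c == null) c = findClass(n);
                if (r) resolveClass(c);
                return c;
            }
        }

        @Override
        protected Class<?> findClass(String n) throws ClassNotFoundException {
            // Assumption for this sketch: the bytes come from the parent's classpath.
            String path = n.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) throw new ClassNotFoundException(n);
                byte[] b = in.readAllBytes();
                return defineClass(n, b, 0, b.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(n, e);
            }
        }
    }

    /** Loads the implementation in isolation and hands it back typed as the shared interface. */
    static Greeter newGreeter(String name) {
        try {
            String implName = GreeterImpl.class.getName();
            ClassLoader loader = new PrefixIsolatingLoader(
                    SharedInterfaceDemo.class.getClassLoader(), implName);
            Class<?> c = Class.forName(implName, true, loader);
            Constructor<?> ctor = c.getConstructor(String.class);
            return (Greeter) ctor.newInstance(name);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create Greeter", e);
        }
    }

    public static void main(String[] args) {
        Greeter g = newGreeter("javac bridge");
        System.out.println(g.greet());                 // the shared interface works
        System.out.println(g instanceof GreeterImpl);  // false: g is the isolated copy
    }
}
```

The client only ever names Greeter, which both loaders resolve to the same class. It never names the isolated GreeterImpl copy directly, which is why the implementation behind the interface can be swapped or versioned freely.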

There is one thing to watch for (there always is), however: sometimes you can find yourself pondering the impossible, like I did yesterday:

[Image: debugshot.png]

What caught me off-guard is that the debugger shows the type of "ex" is EmptyScriptException, but if it were that type then it should have been caught by the previous catch block. Worse, a "(ex instanceof EmptyScriptException)" watchpoint returns "false", when it "obviously" should be true. The issue is that a class isn't just defined by its bytecode (the classfile's contents), but by the combination of bytecode and classloader. Here, there were two copies of EmptyScriptException loaded: once by Jackpot's private classloader, and once by the NetBeans one. Instances of one class copy will fail instanceof and catch tests with the other. I frequently forget this subtlety until reminded by a few head bangs against my monitor. The fix is to add the class to your list of classes which your classloader ignores and therefore shares with its parent classloader.
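
The effect is easy to reproduce in isolation. The sketch below is an illustration only, not the Jackpot code: ClassIdentityDemo and ScriptProblem are hypothetical names, the second copy of the class is defined from bytes read off the classpath, and Java 9 or later is assumed for InputStream.readAllBytes(). It defines the same classfile in a second loader and then throws an instance of each copy past the same pair of catch blocks.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {
    /** Hypothetical stand-in for a class like EmptyScriptException. */
    public static class ScriptProblem extends RuntimeException {}

    /** Defines a second copy of ScriptProblem in a fresh loader, bypassing the parent. */
    static Class<?> loadPrivateCopy() {
        String name = ScriptProblem.class.getName();
        ClassLoader parent = ClassIdentityDemo.class.getClassLoader();
        try (InputStream in = parent.getResourceAsStream(name.replace('.', '/') + ".class")) {
            if (in == null) throw new IllegalStateException("class file not found: " + name);
            byte[] b = in.readAllBytes();
            // Same bytes, same name, different defining loader: a different class.
            return new ClassLoader(parent) {
                Class<?> define() { return defineClass(name, b, 0, b.length); }
            }.define();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates an instance of whichever copy of the class is passed in. */
    static RuntimeException newInstanceOf(Class<?> c) {
        try {
            return (RuntimeException) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Mirrors the confusing debugger session: does the first catch block fire? */
    static String whichCatch(RuntimeException ex) {
        try {
            throw ex;
        } catch (ScriptProblem e) {
            return "caught as ScriptProblem";
        } catch (RuntimeException e) {
            return "fell through to RuntimeException";
        }
    }

    public static void main(String[] args) {
        Class<?> copy = loadPrivateCopy();
        System.out.println(copy.getName().equals(ScriptProblem.class.getName())); // true
        System.out.println(copy == ScriptProblem.class);                          // false
        System.out.println(whichCatch(newInstanceOf(copy)));   // fell through to RuntimeException
        System.out.println(whichCatch(new ScriptProblem()));   // caught as ScriptProblem
    }
}
```

Both classes report the same name, which is exactly what a debugger displays, yet only the copy defined by the client's own loader matches the first catch block.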

Over time, I have learned the value of this behavior (the classes not mixing, not the head banging). Since I'm pretty lazy, the extra work required to share classes between classloaders means that my designs do as little class sharing as possible. It is easier to maintain a really strict isolation with only a few, simple interfaces, than it is to maintain a big list of shared classes and deal with the headaches of managing their dependencies. A nice bonus is that this sort of isolation lends itself to distributed and parallel designs, where the more lightly coupled remote objects are to each other, the better they work together. Besides, it's hard to convince your manager you need the latest fire-breathing multi-processor workstation if your design is hopelessly interlocked.

This blog entry is way too long. I hope however that it dispels the idea that writing a classloader is rocket science or limited to a few obscure uses. Managing application versioning is a problem many application teams face, and some judicious classloading can make it much easier.

Remember, no rocket science, this. :-)
