Cranium

Over the last few weeks I've been trying to hunt down a memory leak in a servlet-based web application. Periodically the Java virtual machine in which Tomcat was running would run out of PermGen space and become so unresponsive that the only solution was to kill and restart the server process. After a lot of hunting through logs and trawling the Internet for pointers, I found that the problem is actually triggered when a web application is redeployed, although the out-of-memory error may not appear until later (which is why it was difficult to spot in the logs).

It turns out that when an application is redeployed the old classloader should be garbage collected, which frees both heap and PermGen memory by removing all the information related to the discarded web application. Unfortunately, if something outside your web application holds a reference to even one class that was loaded via the application's classloader, then the classloader itself, and hence all the class information it has loaded, will not become eligible for garbage collection. Each redeployment then leaves another copy of the application's classes behind, which eventually exhausts the PermGen memory pool. If that isn't initially clear, never fear, as Frank Kieviet wrote a brilliant article (with diagrams) which explains the problem in more detail.
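
One common way for such a reference to survive a redeployment is a background thread that nobody stops. As a minimal sketch (not taken from the application discussed here; the class name, thread name and helper methods are all made up for illustration), the following uses a java.util.Timer. While its thread is alive, every scheduled TimerTask stays reachable, and through the task's class so does the classloader that loaded it. Calling cancel() lets the thread exit.

```java
import java.util.Timer;
import java.util.TimerTask;

// Illustrative sketch only: all names here are hypothetical.
final class TimerShutdownSketch {

    // Hypothetical thread name, used so the timer thread can be found again.
    static final String THREAD_NAME = "sketch-timer";

    // Starts a timer with a repeating task. The Timer constructor starts
    // its background thread immediately.
    static Timer startTimer() {
        Timer timer = new Timer(THREAD_NAME, true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // Periodic work would go here.
            }
        }, 1000L, 1000L);
        return timer;
    }

    // Returns true if a live thread with the timer's name still exists.
    static boolean timerThreadAlive() {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (THREAD_NAME.equals(t.getName()) && t.isAlive()) {
                return true;
            }
        }
        return false;
    }

    // Cancels the timer and waits up to timeoutMillis for its thread to exit.
    // Returns true if the thread is gone by then.
    static boolean stopTimer(Timer timer, long timeoutMillis) {
        timer.cancel();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (timerThreadAlive() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return !timerThreadAlive();
    }

    public static void main(String[] args) {
        Timer timer = startTimer();
        System.out.println("alive before cancel: " + timerThreadAlive());
        System.out.println("stopped after cancel: " + stopTimer(timer, 2000L));
    }
}
```

In a web application the natural place for a call like stopTimer would be ServletContextListener.contextDestroyed, assuming you own the Timer. That is much harder when the Timer is created inside a third-party library.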

Looking back through the Tomcat logs, it seems that something within one of the libraries I was using is leaking a Timer instance, which stops the classloader being garbage collected. I haven't actually managed to fix the problem yet, but I did learn quite a few things along the way, which I've collected together and turned into...

Cranium is a web application (distributed as a WAR file) that provides information on the memory usage of the servlet container in which it is hosted. This includes information on all the memory pools (both heap and non-heap) as well as class loading and garbage collection. It also incorporates two different ways of triggering garbage collection to help monitor for memory leaks. Rather than trying to explain in detail what Cranium allows you to monitor, I'm hosting it as a demo for you to look at (although I've disabled the garbage collection tools so that they cannot be used to make the server unstable).
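
For anyone curious where this kind of information can come from, the standard java.lang.management API exposes it. The sketch below is not Cranium's code, just an illustration of reading memory pool, class loading and garbage collection figures from the platform MXBeans. The class and method names are invented here.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: not how Cranium is implemented.
final class MemoryReportSketch {

    // One line per memory pool (heap and non-heap), sizes in bytes.
    // A max of -1 means the pool has no defined maximum.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(String.format("%s [%s] used=%d committed=%d max=%d",
                    pool.getName(), pool.getType(),
                    usage.getUsed(), usage.getCommitted(), usage.getMax()));
        }
        return lines;
    }

    // Counts of currently loaded, total loaded and unloaded classes.
    static String describeClassLoading() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        return String.format("classes loaded=%d totalLoaded=%d unloaded=%d",
                cl.getLoadedClassCount(),
                cl.getTotalLoadedClassCount(),
                cl.getUnloadedClassCount());
    }

    // One line per garbage collector: number of runs and total time in ms.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s collections=%d timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
        System.out.println(describeClassLoading());
        describeCollectors().forEach(System.out::println);
    }
}
```

When tracking a redeployment leak, the unloaded class count is a useful figure to watch: if it never rises after a redeploy and a garbage collection, the old classloader is probably still pinned. Note that on Java 8 and later the PermGen pool no longer exists and class metadata is reported under a pool named Metaspace instead.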

As with most of my software, Cranium is open source: you can grab the code from my SVN repository or simply download a pre-built WAR file. If you want to track development of Cranium, you can monitor it via my Jenkins server, which also produces a bleeding-edge WAR file on each build.

I know a lot of the information Cranium displays is available through other tools, but I'm already finding it really useful, and I hope that at least one other person does too!