java.think(): March 2008

Monday, March 31, 2008

Chronicles of a Terracotta Integration: Compass

Last week, I met up with Shay Banon, author of Compass, at The Server Side: Vegas conference. We thought it would be great to see if we could crank out an integration between Terracotta and Compass. You can read more about our integration from Shay himself.

I wanted to write a log of our efforts, because I thought it might provide some insight for anyone considering integrating Terracotta into their own project. I was particularly happy with our effort, because it outlines what I feel is the best approach for developing with Terracotta. The approach is actually quite simple. Because Terracotta adds clustering concerns to your application using configuration, you don't write code directly to Terracotta. Instead, you just write a simple POJO application without Terracotta, and then add the clustering later.

So the approach I recommend is the following:

Figure out how to implement the solution using a single JVM. NO TERRACOTTA. Use just simple POJOs and threads.

Implement and test your solution.

It helps to have envisioned, beforehand, what part of your implementation will become a Terracotta root. But it's not necessary. If your application is stateful, it will have a root.

Using the root, start with a basic Terracotta config file, and build up the appropriate config file to cover all the instrumentation and locking.

Test your application again, with a single JVM, but this time with Terracotta.

Tune your implementation.

Move to 2 or more JVMs.

That's it. So how did this play out for the Compass integration? Here is my rough recollection of the action.

10:00 am - Shook Hands - Shay and I met up at the conference.

10:05 am - Started coding - First we chatted a bit about our strategy. It seemed easiest to start with the existing Lucene RAMDirectory implementation and tune it up a bit.

10:30 am - Strategy decided - Based on my knowledge of Terracotta, and Shay's knowledge of Lucene/Compass, we decided on the following:

Start with the Lucene RAMDirectory implementation, but rewrite it as necessary to fit a simple POJO model

Since RAMDirectory is mostly unmaintained, we knew we had to just go through the implementation and clean it up. It comprises about 4 classes total, about 100 lines long, so the task was feasible.

Because Terracotta can just "plug" in to a well written application, and Shay has a comprehensive unit test suite (over 1,000 tests), a load test, and a concurrency test, we'd write the implementation first in POJOs

After verifying that the implementation works as expected in pure POJOs, then we would work on the configuration to inject Terracotta clustering

After running the solution with Terracotta, we would tune it.

And finally, we would wrap up various bits and pieces into a Terracotta Integration Module (TIM)

11:30 am - POJO Implementation done - We ended up rewriting the RAMDirectory, which was fine because it was in need of an overhaul anyway. Rewriting its implementation meant we now had a good understanding of the implementation.

Just a quick note - it was a real joy coding with Shay. He is a super smart guy, and it's great to work with someone like that. Of note, he really understands synchronization, which is really important for writing applications correctly. Even better, he really got the principle of writing better code by writing less code. We went through the RAMDirectory implementation with a weed whacker, and what was left was about half the code: more readable, more maintainable, and better performing. That was fun.

12:00 pm - Unit Tests pass - With some minor corrections, we had unit tests passing. We were both running out of power, and hungry, so we took a break to eat lunch, and agreed to resume in the afternoon.

1:30 pm - Write the Terracotta config file - While writing the POJO implementation, we already knew the key concepts we were going to need for writing up our config file. We added the appropriate instrumentation. We added the locking. A few config statements later, we had a working Terracotta configuration.

2:00 pm - We had Compass running on Terracotta! - Approx. time elapsed? 2 1/2 hours (most of which was spent rewriting the RAMDirectory implementation)

2:30 pm - Tuning Time - At first Shay threw me. He said, "Oh man, it looks like it's running really fast." Except it turns out he wasn't testing the right thing. And then he told me, "Oh man, it's really slow!"

Now don't misunderstand this. I know Terracotta can go really fast. But I wasn't in the least bit surprised. And you shouldn't be either. How many pieces of code have you ever written that compiled and ran correctly - on the first try? Right. One, if you are lucky.

Terracotta is kind of like that. The first step is to get it right. And that means synchronization, and locking, and once you have all that, your application runs correctly, but slowly.

Fortunately, it's easy to fix.

And so I taught Shay how to tune up his Terracotta integration. Or rather, I showed him the tools he needed, and he went to town. I just sort of stood by watching, giving the occasional comment or two.

This was the fun part. It was time to take out the Admin console. The Terracotta Admin console gives you a wealth of information about your application. Of note:

You can browse your clustered data in realtime

You can monitor realtime statistics - including Terracotta txns/sec, Java Heap Memory, and CPU

You can access lock statistics using the lock profiler

You can snapshot over 30 metrics using the Statistics Recorder and visualize them using the Snapshot Visualizer

We started first with the object browser. Once convinced that we had the right data in the cluster, we moved on to performance.

On our first run, we measured the Terracotta txns/sec. I was actually pretty impressed to see that our server on his MacBook Pro was cranking out 10k/sec. But I knew we wanted this number to be lower. A lot lower.

So here comes the first rule for tuning Terracotta: adjust your locking to match your workload. It turns out that we had enabled an autolock for every single byte being written to the Lucene "files" - and this was hurting us pretty badly. Because we already had a lock on the byte array that we were writing to, we simply deleted the synchronization and the lock config from the method that wrote bytes into the "file" - and we observed a big drop in the Terracotta txns/sec. We went from the aforementioned 10k/sec to about 1750/sec.

Now what this means is that the Terracotta server was handling nearly 6x fewer transactions for the same workload. And that means we were doing more work per transaction, and so our performance improved accordingly. You get the same effect with Hibernate - it batches up a bunch of little POJO updates into a single SQL statement - and that means you can do more real work because each SQL statement has more data in it. Lots of little SQL statements means lots of overhead, and maybe more SQL queries executed/sec, but much less application txns/sec. Same concept here with locking.
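The effect of lock granularity is easy to see even without Terracotta. The sketch below is a plain-Java illustration, not the Compass code: `InMemoryFile` and its methods are hypothetical stand-ins for the Lucene "file", and the counter only approximates what the Admin console reports, on the assumption that each clustered lock acquisition roughly corresponds to one Terracotta transaction.

```java
import java.io.ByteArrayOutputStream;

public class LockGranularityDemo {

    // Hypothetical stand-in for an in-memory Lucene "file".
    static class InMemoryFile {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private long lockAcquires = 0; // lock acquisitions made by the write methods

        // Fine-grained: one lock acquisition for every byte written.
        void writeByte(int b) {
            synchronized (this) {
                lockAcquires++;
                buffer.write(b);
            }
        }

        // Coarse-grained: one lock acquisition for the whole write.
        void writeAll(byte[] data) {
            synchronized (this) {
                lockAcquires++;
                buffer.write(data, 0, data.length);
            }
        }

        synchronized long lockAcquires() {
            return lockAcquires;
        }
    }

    static long perByteLocking(byte[] data) {
        InMemoryFile file = new InMemoryFile();
        for (byte b : data) {
            file.writeByte(b);
        }
        return file.lockAcquires();
    }

    static long onePerWriteLocking(byte[] data) {
        InMemoryFile file = new InMemoryFile();
        file.writeAll(data);
        return file.lockAcquires();
    }

    public static void main(String[] args) {
        byte[] data = new byte[10000];
        System.out.println("per-byte locking:   " + perByteLocking(data) + " lock acquires");
        System.out.println("one lock per write: " + onePerWriteLocking(data) + " lock acquires");
    }
}
```

For a 10,000-byte write, the per-byte version acquires the lock 10,000 times and the coarse version acquires it once. Both write the same data, which is why matching the lock to the unit of work matters so much.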

How did we identify what lock(s) to target?

That's the second rule of tuning with Terracotta: USE THE ADMIN CONSOLE

We used the lock profiler feature included in the Admin Console to determine the exact stack trace that generated these locks. The process is simple:

Enable lock profiling with stack traces in the Admin Console,

run your application,

then refresh the view to get a count of the lock acquires/releases/held times etc.

sort on # of lock acquires, and now you know what lock is being requested the most, what stack trace caused that lock, and what Terracotta config was responsible for making that lock.

Armed with this knowledge, Shay set about eliminating most of our superfluous locking. It turns out that creating a Lucene "file" is a single-threaded affair, so we were able to create a single lock to cover the entire process of "writing" to a file, and that cut out about 90% of our locking.

At the end we got down to about 750 Terracotta txns/sec, which improved the application performance quite a bit.

Still not satisfied, we moved on to the Terracotta Statistics Recorder. This is a new feature in Terracotta 2.6.

Turning on this feature records just about everything you ever wanted to know about your application, Terracotta, the JVM, and your system (including CPU, disk stats, and network stats). You can export these stats as a CSV file, and import them into our Snapshot Visualizer Tool. The SVT then graphs those metrics for you.

4:30 pm - TIM time - We were pretty satisfied with the performance. Even though we wanted more, Shay felt it was best to focus on turning Compass into a TIM (Terracotta Integration Module).

5:30 pm - Time to call it quits - We had hacked up the ant build.xml file to get ourselves a TIM in no-time - except that it wouldn't quite load correctly. (Later we learned we had just specified the filename wrong - easy fix).

Overall, I thought we had a pretty good day. We wrote and tuned a Terracotta integration in about 6 hours flat. With a few more hours of work, Shay was able to complete the integration.

I was really happy to use some of the recent tools we have been building, like the Lock Profiler and the Statistics Recorder. Seeing the real-world use of those was invaluable, and confirmed that our commitment to enabling the developer to self-tune by providing enhanced visibility is spot on.

I am looking forward to people downloading 2.6, trying out these awesome tools for themselves and providing feedback!

Sunday, March 30, 2008

Fun with Distributed Programming

Something about the nature of distributed programming makes it quite divisive. You either love it or despise it. It's rare that I've run into someone who is ambivalent about it.

Those that love it, love it because it's hard core. They're proud to know all the ins and outs of dealing with failures, at the system, network, and application level. All of that specialized knowledge is also what turns off the rest of us.

It's kind of like database programming. There are only a select few who really like it. The rest of us only do it because we have to.

Well, I honestly think that Terracotta changes the game. The key is that Terracotta makes distributed programming fun because it takes away most of the distributed programming part, leaving you with just the fun part.

It reminds me of when Linux came out. Everyone loved it because they could just tinker with different schedulers, and not have to think about building an OS from scratch, just to try out a new idea. That's what Terracotta is like. It manages all the hard networking and distributed programming parts, so you get to just play with the algorithms.

Interested? Let's look at a real (if contrived) example. Let's suppose that you have to build the following:

a service that executes periodically to do some work

you don't care where this service runs, only that it runs

it has to run, but one and only one system can run it

you've got a cluster of n systems, you'd like any one of them to be responsible for running the service

If it were a single JVM, you could do a thousand things, like use a java.util.Timer, or Quartz, or even your own simple Thread with a delay loop in it.

But in a cluster? The choices for synchronizing the behavior of a number of JVMs across a cluster quickly eliminate the fun part, leaving just tedious, boring, and mundane work to be done. Cluster synchronization, you're thinking. What should I use? TCP? Multicast? Shared file system locking? A shared database? RMI? JMS? EJBs? Oh dear.

But wait. Terracotta provides synchronization primitives that work across the cluster just like in a single JVM. So that means getting this right in a single JVM means getting it right across the cluster. Could it really be that easy? And fun? Yes!

Let's have a look. For the sake of simplicity, let's do the simple thing. We'll write the delay loop version. We'll implement it as a Singleton that implements Runnable, so we can pass the Singleton to a Thread. Here it is:

public class SimpleWorkRunner implements Runnable
{
    // mark as a Terracotta root
    private static SimpleWorkRunner singleton = new SimpleWorkRunner();

    // singleton pattern - private constructor so there is only one
    private SimpleWorkRunner() { }

    public static SimpleWorkRunner instance() { return singleton; }

    public synchronized void run()
    {
        while (true) {
            // do work
            ...
            try { Thread.sleep(2000); } catch (InterruptedException e) { }
        }
    }
}

That's it! In every JVM, kick off a new thread against the singleton:


...
new Thread(SimpleWorkRunner.instance()).start();

And we're done!

You might have noticed one thing - the run method is synchronized. In a single JVM, this will mean that more than one Thread executed against this Singleton will result in only one Thread winning the synchronization race, and executing the run() method.

In a single JVM, this may not be that important, since there might only ever be one Thread. But with more than one JVM, we will always start at least one Thread per JVM, and that means we have to ensure, per our requirements, that only one Thread ever enters run() at a time.

Terracotta takes care of that for us. We just write the synchronized block, and Terracotta converts that into a cluster lock. And just like in a single JVM, only one Thread - across the cluster - will win the race to enter the run() method.

(Also of note is that this particular implementation assumes that one and only one Thread should assume control and never relinquish it. That was the purpose of the implementation. If you wanted to "bounce" the control around the cluster, you would implement the run method differently, depending on the requirements.)
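As one possible way to let control "bounce", here is a minimal single-JVM sketch that holds the lock for one unit of work at a time instead of for the whole loop. The class name `BouncingWorkRunner`, the bounded iteration count, and the counters are all hypothetical and exist only so the sketch terminates and can be checked. It assumes, without showing it, that the singleton would be configured as a root and the synchronized block autolocked in the same way as for SimpleWorkRunner.

```java
public class BouncingWorkRunner implements Runnable {
    // Assumption: this field would be declared a Terracotta root in the config.
    private static final BouncingWorkRunner singleton = new BouncingWorkRunner();

    private static final int ITERATIONS = 5; // bounded so the sketch terminates

    private int inside = 0;      // threads currently in the critical section
    private int maxInside = 0;   // highest value of 'inside' ever observed
    private int unitsOfWork = 0; // total units of work performed

    private BouncingWorkRunner() { }

    public static BouncingWorkRunner instance() { return singleton; }

    public void run() {
        for (int i = 0; i < ITERATIONS; i++) {
            // The lock is held for one unit of work only, then released.
            synchronized (this) {
                inside++;
                maxInside = Math.max(maxInside, inside);
                unitsOfWork++; // placeholder for the real work
                inside--;
            }
            try {
                // Sleep outside the lock so another thread can take the next turn.
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public synchronized int maxInside() { return maxInside; }

    public synchronized int unitsOfWork() { return unitsOfWork; }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(instance());
        Thread b = new Thread(instance());
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("units of work: " + instance().unitsOfWork());
        System.out.println("max threads in critical section: " + instance().maxInside());
    }
}
```

Run in a fresh JVM, the two threads perform 10 units of work between them and never overlap inside the critical section. The trade-off is that work is still done by only one thread at a time, but no single thread owns the task forever.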

The Terracotta config for this class is trivial. We need to tell Terracotta that the singleton should be a Terracotta root. A Terracotta root will always be the same instance across the entire cluster, which is exactly what we want for a singleton. And we need to autolock the run method so the synchronization is applied to the cluster, not just a local JVM. Here's the config for that:

<tc:tc-config xmlns:tc="http://www.terracotta.org/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-4.xsd">

  <application>
    <dso>
      <locks>
        <autolock>
           <method-expression>void SimpleWorkRunner.run(..)</method-expression>
        </autolock>
      </locks>
      <roots>
        <root>
          <field-name>SimpleWorkRunner.singleton</field-name>
        </root>
      </roots>
    </dso>
  </application>
</tc:tc-config>

We didn't have to worry about the dirty details. Terracotta did. And that means distributed programming becomes fun again!

Find out more:

Terracotta.org - home page
Quick Start - download and see the demos
Cookbook - simple recipes that demonstrate Terracotta in action

Note that this example is very similar to the Single Resource recipe. Try it out first to get started.

Extra Credit
How does another JVM in the cluster gain control of the task? (Hint: Is it possible for more than one Thread to enter the critical section in run()? In normal Java - no. But what happens in Terracotta with more than one JVM?)

Saturday, March 22, 2008

A Clustered ClassLoader

If you're building a distributed system, or contemplating building a distributed system, you might have run into this one before:

You write and compile your classes in Eclipse
You try out your classes on your laptop -- they work (woohoo!)
It's a distributed system so you need to make sure your classes work in a true distributed environment
You publish your classes to the distributed systems
You try out your classes -- and they don't work (boo!)
You fix the problem.
You publish the classes again.
Rinse, repeat.

After doing this a few dozen times, you find that publishing your classes to distributed systems is a total PITA that you would rather avoid altogether.

Or, you might have an application, like Master/Worker in which you deploy some part of the application at deploy time, but you deploy other parts of it during run-time. In the Master/Worker case, you deploy the Master and the Worker, but the Work comes and goes, and you'd like to be able to deploy new Work easily and trivially. In the Master/Worker case, since Masters are usually in control, and there is a farm of Workers, you'd like to deploy some new work to the Master, and let it send the Work to the Workers. Knowing about the Work up front on the Workers is a non-starter.

Some solutions to this problem?

Java has dynamic code loading capabilities already. Deploy your class files to a shared filesystem like NFS, and deploy your code to a shared directory.
Java also supports loading code from URLs (thanks to its Applet heritage) so deploy your code to an HTTP server and you're set
Factor your application such that new classes aren't needed - just make the new definitions "data" driven
Embed a scripting engine, so you can pass Strings and interpret them as code - BeanShell, Jython, JRuby, Javascript, and Groovy all come to mind here...

Those are all fine solutions, but it never hurts to have more tools in your toolbox, does it? Especially if you're already using Terracotta, wouldn't it be nice if there were some way to just leverage Terracotta's core clustering capabilities to build a clustered classloader?

I've done just that. Here's how it works:

Your application tries to instantiate a class, which means it asks the currently in scope ClassLoader to instantiate the class (by name)
Because the application is launched under the clustered classloader, that classloader is the one in scope.
The clustered class loader has a Map<String, byte[]> that correlates classnames to bytes
The clustered class loader looks in this Map, if the classname is found, it uses the byte[] to create the requested class using defineClass()
If the class wasn't found in the Map, then it looks in the filesystem to find the class
If the class bytes are found on the filesystem, then it reads them into a byte[], and stashes them in the clustered Map<String, byte[]>
If the bytes aren't found, it just delegates to the parent classloader

I've omitted some of the finer details. The Map used is actually a Map<String, ClassMetaData> where ClassMetaData is a class that holds a long modified and byte[] bytes.
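For reference, a minimal sketch of what such a ClassMetaData holder might look like follows. It is an assumption based only on the description above and on how the fields are used in the listings that follow (`metaData.modified`, `metaData.bytes`, and a two-argument constructor). The actual class in the project may differ.

```java
public class ClassMetaData {
    final long modified; // last-modified time of the source .class file, in milliseconds
    final byte[] bytes;  // raw contents of the .class file

    ClassMetaData(long modified, byte[] bytes) {
        this.modified = modified;
        this.bytes = bytes;
    }

    public static void main(String[] args) {
        // 0xCAFE is just sample data; a real entry would hold a whole class file.
        ClassMetaData meta = new ClassMetaData(1206000000000L, new byte[] { (byte) 0xCA, (byte) 0xFE });
        System.out.println(meta.bytes.length + " bytes, modified at " + meta.modified);
    }
}
```

Keeping the timestamp next to the bytes is what lets a node decide whether its local copy of a class file is newer than the clustered one.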

Let's have a look at the important parts of the ClusterClassLoader:


public class ClusterClassLoader extends ClassLoader
{
    private static final String NAME = "ClusterClassLoader";
    
    private static Map<String, Class> classes = new HashMap<String, Class>();
    private static Map<String, ClassMetaData> bytes = new HashMap<String, ClassMetaData>();
    private static transient boolean loaded;

    ...

ClusterClassLoader is defined to extend ClassLoader. It has a NAME field, which will be used to give a name to this classloader. This is a requirement for a classloader used by Terracotta. Normally Terracotta does this for you, but we are defining a new classloader, so we have to follow the naming rules for Terracotta (naming gives ClassLoaders across the cluster a unique identity).

A classes field is defined, which caches the result of the defineClass operation in the local JVM only. A bytes field is defined. This field is marked as a root, so that it can be shared with every other instance of ClusterClassLoader in the cluster.

The constructor detects if Terracotta is loaded using some reflection, and if so registers the classloader and sets a flag to enable cluster classloading features:

    public ClusterClassLoader()
    {
        super(ClusterClassLoader.class.getClassLoader());
        try {
            Class namedClassLoader = findClass("com.tc.object.loaders.NamedClassLoader");
            Class helper = findClass("com.tc.object.bytecode.hook.impl.ClassProcessorHelper");
            Method m = helper.getMethod("registerGlobalLoader", new Class[] { namedClassLoader }); 
            m.invoke(null, new Object[] { this });
            loaded = true;
        } catch (Exception e) {
            // tc is not present, so don't do anything fancy
            loaded = false;
        }
    }

Next, the definition of loadClass is overridden:

    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException
    {
        return findClass(name);
    }

and so is findClass:

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException
    {        
        if (!loaded) {
            return getParent().loadClass(name);            
        }
        
        Class result = null;
        synchronized (classes) {
            result = classes.get(name);
            if (result != null) { return result; }

            result = loadClassBytes(name);
            if (result == null) { return getParent().loadClass(name); }
            classes.put(name, result);
        }
        
        return result;
    }

This is the bulk of the algorithm. The loaded flag is set when the class loader is instantiated. It used a bit of reflection to determine if Terracotta was even present in the JVM. If not, it is set to false, and the ClusterClassLoader just delegates to the parent class loader.

If Terracotta is present, then it checks to see if the class has already been defined. If so, it is returned directly from the classes cache. If it has not, then it gets the bytes from the loadClassBytes method. If that cannot find the bytes, then it asks the parent class loader to load the class.

The bulk of the implementation is done in the loadClassBytes method:


    private Class loadClassBytes(String name) throws ClassNotFoundException
    {
        ClassMetaData metaData;
        
        synchronized (bytes) {        
            try {
                File f = null;
                metaData = bytes.get(name);
                // resource names always use '/' as the separator, regardless of platform
                URL resource = ClassLoader.getSystemResource(name.replace('.', '/') + ".class");
                // if resource is non null, then the class is on the local fs (in the cp)
                if (resource != null) {
                    f = new File(resource.getFile());
                }

                if (metaData != null) {
                    // if it's cached, but not on the fs, return it.
                    // if it's cached, but on the fs, check to see if it's
                    // up to date
                    if (f == null || metaData.modified >= f.lastModified()) {
                        return defineClass(name, metaData.bytes, 0, metaData.bytes.length, null);
                    }
                }

                // neither cached nor on the local fs: let the caller delegate to the parent
                if (f == null) { return null; }

                // load from the fs
                byte[] classBytes = loadClassData(f);
                Class result =  defineClass(name, classBytes, 0, classBytes.length, null);

                try {
                    result.getDeclaredField("$__tc_MANAGED");
                    // it's managed so cache it
                    bytes.put(name, new ClassMetaData(f.lastModified(), classBytes));
                } catch (NoSuchFieldException e) {
                    // not managed don't cache it
                }
                return result;
            } catch (IOException e){
                return null;
            } 
        }
    }

This method looks for the cached bytes, and for a file that corresponds to the class. If both are found, it compares their modified dates. If the modified date of the cached bytes is greater than or equal to that of the file, it defines the class from the cached bytes. Otherwise it loads the bytes from the file. Once the bytes are loaded, defineClass is called to turn the bytes into a Class.

At this point, the ClusterClassLoader can check to see if the class is instrumented by Terracotta. Every instance of a class that is shared by Terracotta must be instrumented, so it's not necessary to cache class bytes for classes that are not instrumented by Terracotta. If the class is instrumented by Terracotta, then the ClusterClassLoader stashes the bytes into the class bytes cache.
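The listing above calls a loadClassData helper that is not shown. A minimal sketch of what such a helper could look like follows. The enclosing class name `ClassFileReader` is hypothetical, and the real helper in the project may be written differently.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ClassFileReader {

    // Reads the entire file into a byte[]; throws IOException if it cannot be read.
    static byte[] loadClassData(File f) throws IOException {
        byte[] data = new byte[(int) f.length()];
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            in.readFully(data); // fills the array completely or throws EOFException
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file, then read it back.
        File tmp = File.createTempFile("Example", ".class");
        tmp.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(new byte[] { 1, 2, 3, 4 });
        }
        System.out.println(loadClassData(tmp).length + " bytes read");
    }
}
```

Because the helper reports failures as IOException, an unreadable file ends up in the catch block of loadClassBytes, which returns null and lets findClass fall back to the parent classloader.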

Click here if you would like to see the source code to ClusterClassLoader in its entirety

UPDATE: This project has been included in the tim-tclib project, and is a runnable sample. More details can be found in the sample readme.html

I've put the whole thing together as a simple runnable example. You just have to check out the source for the project, and run a few simple Maven commands. You can get the demo from here:


$ svn checkout http://svn.terracotta.org/svn/forge/projects/labs/tim-clusterclassloader clusterclassloader
$ cd clusterclassloader

The demo defines a main project, and two sub projects. The first sub project, sample, is responsible for putting classes into a queue. The second sub project, sample2, reads from the queue. To test the effectiveness of the cluster class loader, the second sample of course does not have the classes from the first sub project.

To run the demo:

Build the project:
```
$ mvn install
```
Cd to the sample directory, compile and start a tc server:
```
$ cd sample
$ mvn package
$ mvn tc:start
```
Start the sample process:
```
$ mvn tc:run
```
In another terminal, cd to the sample2 directory:
```
$ cd sample2
```
Compile, and run the example:
```
$ mvn package
$ mvn tc:run
```

If you did everything correctly, you should see the following in the second terminal (sample2):


[INFO] [node] Waiting for work...
[INFO] [node] This is Callable2 calling!

The message ("This is Callable2 calling!") is printed by a class that is only present in the classpath of the first instance (sample).

Monday, March 17, 2008

Stupid Simple JVM Coordination

If you think cross-JVM coordination is easy - then this post is not for you. If it makes you cringe inside, just trying to remember the JMS interfaces, or the JGroups API, or the java.io classes, or figuring out how to mess with a database, then carry on, intrepid reader. This post is for you.

I'm going to show you how stupid simple it is to use Terracotta to send a message from one JVM to the other. We'll use two JVMs - a producer and a consumer. I want the producer to create and send a message to the consumer. I want the producer to wait for the consumer to consume the message. When the message is consumed I want the producer to use the return value from the consumer.

This would be stupid hard if it weren't for two amazing technologies. The first is the java.util.concurrent package. The second is Terracotta JVM Level Clustering. Putting them together gives you stupid simple JVM coordination.

The scenario I outlined is actually ridiculously easy in a single JVM using the java.util.concurrent package. It was built to handle these scenarios and more at the flick of a wrist. Instantiate a queue, fire off a couple of threads, use a FutureTask, and you're done.
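To make the single-JVM version concrete, here is a minimal sketch using two threads, a queue, and a FutureTask. The class name `SingleJvmHandoff` and the message strings are made up for illustration. This is the plain java.util.concurrent pattern with no Terracotta involved.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

public class SingleJvmHandoff {

    // Hands a task to a consumer thread and waits for the value it produces.
    static String handOff(String message) {
        BlockingQueue<FutureTask<String>> queue = new LinkedBlockingQueue<FutureTask<String>>();

        // Consumer: takes one task off the queue and runs it.
        Thread consumer = new Thread(() -> {
            try {
                queue.take().run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: enqueue the task, then block until the consumer has run it.
        FutureTask<String> task = new FutureTask<String>(() -> "consumed: " + message);
        try {
            queue.put(task);
            String result = task.get();
            consumer.join();
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handOff("hello")); // prints "consumed: hello"
    }
}
```

The producer never talks to the consumer directly. The queue carries the task one way and the FutureTask carries the result back, which is why the same shape can be spread across two JVMs once the queue is shared.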

And you know what? Could it get any simpler than writing one line of code to cluster that queue, and moving from two threads in one JVM to one thread in two JVMs? It can't.

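Here's the Main class that does it all:


import java.util.Date;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class Main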
{
   public static final Main instance = new Main();

   private AtomicInteger counter = new AtomicInteger(0);
   private BlockingQueue<FutureTask> queue = new LinkedBlockingQueue<FutureTask>();

   public void listen() throws InterruptedException
   {
       while (true) {
           queue.take().run();
       }
   }

   public void run() throws Exception
   {
       if (counter.getAndIncrement() == 0) {
           System.out.println("Waiting...");
           listen();
           return;
       }
     
       FutureTask task = new FutureTask(new MyCallable());
       queue.put(task);
       System.out.println("Task completed at: " + task.get().toString());
   }

   private static class MyCallable implements Callable
   {
        public Object call() throws InterruptedException
        {
            System.out.println(new Date().toString() + ": Sleeping 2 seconds...");
            Thread.sleep(2000);
            System.out.println("Hello world");

            return new Date();
        }
   }

   public static void main(String[] args) throws Exception
   {
       instance.run();
   }
}

And the Terracotta config:


<tc:tc-config xmlns:tc="http://www.terracotta.org/config"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-4.xsd">

 <application>
   <dso>
      <instrumented-classes>
        <include>
          <class-expression>Main$MyCallable</class-expression>
        </include>
      </instrumented-classes>
     <roots>
       <root>
         <field-name>Main.instance</field-name>
       </root>
     </roots>
   </dso>
 </application>
</tc:tc-config>

That's all there is to it. Output looks like this:

Node 1:


$ javac *.java
$ start-tc-server &
$ dso-java Main
Waiting...
(after starting other node...)
Mon Mar 17 17:53:45 PDT 2008: Sleeping 2 seconds...
Hello world

Node 2:


$ dso-java Main
Task completed at: Mon Mar 17 17:53:47 PDT 2008

I've actually written this entire example up as a Recipe on Terracotta.org. Full details and instructions are listed there in the FutureTask recipe.

Saturday, March 08, 2008

The Trouble With Data Partitioning (cartoon)

Wednesday, March 05, 2008

It's the little things that matter...

It's often the little things in a design that make the biggest difference. Sure you have to get the big things right too, but all too often products suffer from a great idea implemented poorly.

So at Terracotta, I often have conversations along these very lines. The job we've carved out for ourselves - clustering the entirety of the Java Virtual Machine - is pretty big. That's why it's such a great place to work - the challenge we face is enormous, and it's enormously fun to tackle it. Let me tell you right now, clustering the VM itself isn't going to happen if you don't get the big ideas right. I'll wager that we have, but only history can prove that one right. But just as important is getting the little things right.

Today I just happened to discover one of those little things. What is it? Well, if you don't already know, Terracotta maintains Object Identity across a cluster of JVMs. That in itself is an amazing feat (no other piece of technology I have ever run into can do this). So Object Identity is the big thing. What's the little thing?

Here goes.

First, my sample code (Main.java):

import java.util.HashMap;
import java.util.Map;

public class Main
{
    public static final Main instance = new Main();

    private Map<Object, Object> map = new HashMap<Object, Object>();

    public void run() throws Exception
    {
        Object key = new Object();
        Object value = new Object();

        while (true) {
            synchronized (map) {
                map.put(key, value);
            }
            Thread.sleep(500);
        }
    }

    public static void main(String[] args) throws Exception
    {
        instance.run();
    }
}

Those of you not familiar with Terracotta might wonder what's so interesting about this. Well, with Terracotta, you can cluster any Java object, so with the following bit of config, I have done just that:

tc-config.xml:

<tc:tc-config xmlns:tc="http://www.terracotta.org/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-4.xsd">

<application>
  <dso>
    <locks>
      <autolock>
         <method-expression>void Main.run(..)</method-expression>
      </autolock>
    </locks>
    <roots>
      <root>
        <field-name>Main.instance</field-name>
      </root>
    </roots>
  </dso>
</application>
</tc:tc-config>

The map in the Main class listed above is now a clustered map (because the root field, instance, holds a reference to it, and therefore transitively it becomes clustered). Anything I put in the map is clustered (transitively again), meaning every object I put in the map is available to all other JVMs in the cluster. That's pretty cool in its own right (I happen to think), but how is that different from a normal get/put API in a traditional clustered cache, say EHCache, JCS, or OSCache?

Well, I monitored the number of transactions the little test above generated. How many would you guess? 1? 100? 1 every 500ms?

The answer is actually: 1. Because of Object Identity, after the first iteration through the loop, Terracotta knows that it can optimize out the subsequent calls - there is no need for Terracotta to "re-put" a value for a key that already maps to that same object - so it can save the round trip to the server.
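To make the idea concrete, here is a minimal single-JVM sketch of an identity-based "skip the redundant put" check. It is an illustration only, not Terracotta's implementation or API: the IdentityAwareMap class and its broadcastCount counter are hypothetical names I made up, and the counter merely stands in for "transactions sent to the server".

```java
import java.util.HashMap;
import java.util.Map;

class IdentityPutDemo {

    // Hypothetical wrapper: counts how many puts would need to leave the JVM.
    static class IdentityAwareMap<K, V> {
        private final Map<K, V> local = new HashMap<>();
        private int broadcastCount = 0; // stand-in for transactions sent

        public synchronized V put(K key, V value) {
            // Reference comparison (==), not equals(): skip only when the key
            // already maps to this very same object.
            if (local.containsKey(key) && local.get(key) == value) {
                return value; // mapping unchanged, nothing to send
            }
            broadcastCount++; // a real system would ship the change here
            return local.put(key, value);
        }

        public synchronized int getBroadcastCount() {
            return broadcastCount;
        }
    }

    public static void main(String[] args) {
        IdentityAwareMap<Object, Object> map = new IdentityAwareMap<>();
        Object key = new Object();
        Object value = new Object();

        // Same loop shape as Main.run(), minus the sleep.
        for (int i = 0; i < 100; i++) {
            map.put(key, value);
        }
        System.out.println("broadcasts: " + map.getBroadcastCount()); // broadcasts: 1
    }
}
```

The check is only safe because the comparison is on the reference itself: if the value object is the same instance, the relationship in the map has not changed, whatever has happened to the object's fields in the meantime.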

In the clustering world, anything you do on the network is orders of magnitude slower than main memory, so everything you can do to keep operations local improves latency and throughput. It may be a minor optimization, but its effect on this application is large. My application may be trivial, but consider if that map were an HTTP session context or a distributed cache.

Furthermore, this optimization is not available to serialization-based solutions (which must implement a copy-on-read, copy-on-write strategy), because a serialization-based approach cannot track object identity or changes to objects, and so cannot optimize out this kind of scenario.
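A quick way to see why identity gets lost is to round-trip an object through standard Java serialization inside a single JVM. This is just a sketch using java.io, not any particular cache's code; the roundTrip helper is a name I made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

class SerializationIdentityDemo {

    // Hypothetical helper: serialize to a byte array, then read it back.
    static Object roundTrip(Serializable original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, String> original = new HashMap<>();
        original.put("k", "v");

        Object copy = roundTrip(original);

        System.out.println(copy.equals(original)); // true  - same contents
        System.out.println(copy == original);      // false - different object
    }
}
```

The copy is equal but not identical, so a reference check like "is this the same object I already have?" can never succeed on the receiving side of a serialized put.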

However, because Terracotta works at the VM level and maintains Object Identity, it knows implicitly when objects change, so a caller never needs to "re-put" objects into the map to make sure it is updated (which is why it is valid to eliminate the superfluous put calls). In the end, Terracotta would work exactly the same with or without this optimization - correctness is unaffected - but with it, depending on your usage, your application can run orders of magnitude faster.
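The "no re-put needed" behavior is just ordinary Java reference semantics, which is easy to check in a single JVM with no Terracotta involved. The sketch below shows only that plain-Java baseline; the claim that the same holds across a cluster is Terracotta's, and the class and method names here are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class NoRePutDemo {

    // Mutates a value after it was put in the map and returns what the map sees.
    static String mutateWithoutRePut() {
        Map<String, StringBuilder> map = new HashMap<>();
        StringBuilder value = new StringBuilder("a");
        map.put("k", value);

        // Change the object in place. Note there is no second map.put() call.
        value.append("b");

        // The map holds a reference to the same object, so it sees the change.
        return map.get("k").toString();
    }

    public static void main(String[] args) {
        System.out.println(mutateWithoutRePut()); // ab
    }
}
```

With a copy-based cache, the map would hold a snapshot taken at put time, and the caller would have to put the value again after every change.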

So, in summary, you gotta get the big things right. Object Identity is the big thing. But it's getting the little things right - for example, optimizing away unnecessary network calls by eliminating redundant map.put() calls - that takes a great idea and makes it truly impressive.

Note that I can't take credit for this, since I had nothing to do with creating the feature or even suggesting it. I just happened to realize that it's trivial to test whether it's implemented, so I tested it, and I hope you find the results as interesting as I did.

To find out more,

Read about Terracotta

Check out some bite-sized code samples in the cookbook section

Or just download it already :)

Note that the code posted in this demo is 100% runnable. Just:

save it to Main.java and tc-config.xml in a new directory

compile it - javac Main.java

start the Terracotta server - start-tc-server.sh &

run the program - dso-java.sh Main

java.think()

Monday, March 31, 2008

Chronicles of a Terracotta Integration: Compass

Sunday, March 30, 2008

Fun with Distributed Programming

Saturday, March 22, 2008

A Clustered ClassLoader

Monday, March 17, 2008

Stupid Simple JVM Coordination

Saturday, March 08, 2008

The Trouble With Data Partitioning (cartoon)

Wednesday, March 05, 2008

It's the little things that matter...
