Wednesday, January 7, 2009

Declarative Concurrency in Java

Good, solid, safe, effective concurrent programming is hard. Modern languages and paradigms make it easier, but for most, it's still a challenge to get right, right away. Many people have predicted the end of the great GHz race. They're probably right. I don't have any great insight into the CPU design community. Honestly, it just doesn't hold my attention. Multi-core systems are all the rage these days, though, and that's pretty damn cool. None of this is new; plenty of people smarter than me have pointed it out.

One of the purported benefits of functional programming is how it lends itself to concurrent programming. Luckily, I work with a smart guy who's both patient and polite enough to talk to me about FP without serving kool-aid (thanks Adam). Many of those conversations entail discussion about state, immutability, and side effects in software implementation. This, of course, leads me to think about how some of these things apply to one of our weapons of choice where we work - Java.

Java accomplishes concurrency via thread objects. Big deal; nothing new here. Most of the confusion comes into play not when deciding what should run concurrently - that's usually obvious - but when figuring out how to protect shared state. Again, in Java-land, we do this with different types of locks, either implicitly with synchronized blocks or explicitly with the grab bag of fun from the java.util.concurrent.locks package. Many of the Sun docs talk about how we use locks to establish happens-before relationships between points in code. What's interesting is that this language seems so natural and simple. So why is lock management such a pain?
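For anyone who hasn't seen the two styles side by side, here's a minimal sketch of both guarding a counter: one with an implicit monitor via synchronized, one with an explicit ReentrantLock. The class and method names (LockStyles, runDemo and friends) are made up for illustration; only ReentrantLock and Thread come from the JDK.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockStyles {

    private int implicitCounter = 0;
    private int explicitCounter = 0;
    private final ReentrantLock lock = new ReentrantLock();

    // Implicit locking: the monitor of "this" is acquired and released for us.
    public synchronized void incrementImplicit() {
        implicitCounter++;
    }

    // Explicit locking: we acquire and release ourselves, always in try/finally.
    public void incrementExplicit() {
        lock.lock();
        try {
            explicitCounter++;
        } finally {
            lock.unlock();
        }
    }

    // Hypothetical helper: hammers both counters from several threads and
    // returns { implicit total, explicit total }.
    static int[] runDemo(int threads, int incrementsPerThread) {
        LockStyles shared = new LockStyles();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    shared.incrementImplicit();
                    shared.incrementExplicit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        // join() gives us happens-before with each finished worker,
        // so plain reads of the counters are safe at this point.
        return new int[] { shared.implicitCounter, shared.explicitCounter };
    }

    public static void main(String[] args) {
        int[] totals = runDemo(4, 1000);
        System.out.println("implicit=" + totals[0] + " explicit=" + totals[1]);
    }
}
```

Both versions protect the increment equally well. Notice that neither says anything about which thread goes first, which is the part I keep coming back to.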

Maybe imperative locking isn't the right approach. Maybe there's a more natural way to establish a happens-before relationship. It sounds like dependency declaration. I'm wondering if we can't find a way to declare dependencies within source code. Maybe there's a way to declare dependencies, with something like annotations, where instrumentation can infer what we're looking for. This, of course, is sugar for what we have now, but I don't think sugar is always bad.

 public class MyClass {

   private int counter = 0;

   @Concurrent( stateful = true )
   public void execute() {
     /* Do something that might touch shared state. */
     this.counter++;
   }

   @Concurrent( unitName = "otherExecute", stateful = false )
   public void otherExecute(String someArg) {
     /* Do something that promises not to alter ourselves. */
   }

   @Concurrent(
     unitName      = "somethingElse",
     stateful      = true,
     happensBefore = "otherExecute"
   )
   public void somethingElse() {
     /* This can be run concurrently, could touch state, but
      * must happen before "otherExecute" is called.
      */
   }

   public static void main(String[] args) {
     ConcurrentController controller;
     ConcurrentUnit       unit1;
     ConcurrentUnit       unit2;

     controller = ConcurrentController.forClass(MyClass.class);

     unit1 = controller.getUnit("somethingElse").setThreadPoolSize(10);
     unit2 = controller.getUnit("otherExecute").setThreadPoolSize(5);

     unit1.start();
     unit2.start();
   }
 }

The @Concurrent annotations would instruct an instrumentation library to perform an operation in parallel. The hints stateful and happensBefore could be used to perform additional automatic member variable monitor acquisition or something equally snazzy. The unitNames could be used to grab a handle, of sorts, to a concurrent unit of work and be used to establish relationships or to report on concurrency plans (which could be similar to an RDBMS query execution plan). Who knows... I'm tossing ideas around.
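To make the annotation half of this a bit more concrete, here's a rough sketch of how @Concurrent could be declared and read back with plain reflection to print a crude "concurrency plan". Everything in it is an assumption on my part: the ConcurrentPlanSketch class, the describe method, the default values, and the rule that a missing unitName falls back to the method name. It does none of the actual scheduling or monitor acquisition; it only shows that the metadata is recoverable at runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class ConcurrentPlanSketch {

    // Hypothetical annotation; RUNTIME retention so reflection can see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Concurrent {
        String unitName() default "";       // assumed default: use the method name
        boolean stateful() default true;    // assumed default: be pessimistic
        String happensBefore() default "";  // name of the unit that must run later
    }

    // Same shape as MyClass in the post, with empty bodies.
    static class MyClass {
        @Concurrent(stateful = true)
        public void execute() { }

        @Concurrent(unitName = "otherExecute", stateful = false)
        public void otherExecute(String someArg) { }

        @Concurrent(unitName = "somethingElse", stateful = true,
                    happensBefore = "otherExecute")
        public void somethingElse() { }
    }

    // Collects unit name -> short description, sorted by unit name.
    static Map<String, String> describe(Class<?> type) {
        Map<String, String> plan = new TreeMap<>();
        for (Method method : type.getDeclaredMethods()) {
            Concurrent hint = method.getAnnotation(Concurrent.class);
            if (hint == null) {
                continue; // not a concurrent unit
            }
            String unit = hint.unitName().isEmpty() ? method.getName() : hint.unitName();
            String description = hint.stateful() ? "stateful" : "stateless";
            if (!hint.happensBefore().isEmpty()) {
                description += ", before " + hint.happensBefore();
            }
            plan.put(unit, description);
        }
        return plan;
    }

    public static void main(String[] args) {
        describe(MyClass.class).forEach(
            (unit, description) -> System.out.println(unit + ": " + description));
    }
}
```

A real instrumentation library would have to turn those descriptions into ordering constraints and lock acquisition, which is where all the hard problems live.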

I don't think it covers every situation. In fact, I'm sure it doesn't cover everything. It's beyond flawed and probably not possible. I'm just trying to get some wheels turning. The goal is to have a simpler, coarse-grained, declarative concurrency definition that can be externalized.

I'm intrigued by the idea of simple concurrency models that don't remove the fine-grained control given to us by the language and APIs. If concurrency isn't going away, it has to get easier for the majority of people to do it correctly.

I'm especially interested in feedback on this.

4 comments:

Guy Korland said...

Nice idea!

Seems like an idea that can complement transactional memory.

We at Deuce (http://sites.google.com/site/deucestm/) thought of adding such a thing.
Using such an annotation, you can relate transactions to data access.

Unknown said...

I believe that concurrency management should not be done on the "thread side", but on the "resource side". Threads shouldn't be aware that their execution is accessing a concurrent resource, thus I don't think that the @Concurrent annotation is judicious on the execute() method. Maybe annotations like @Mutex on the resource class would be better, but that's ultimately what the synchronized keyword achieves...

E. Sammer said...

@Guy Korland:

Deuce looks very cool. We've been talking about STM a lot here at work, recently. I think the ideas like "microthreading" and lockless mechanisms are going to be critical to achieving performant concurrent applications as we see multicore architectures grow. Your point about transactions and data access is a good one and probably analogous to the JPA annotations and declarative (via annotations or something like AspectJ) RDBMS transactions.

Really, the idea is about creating something sufficiently high level without tossing away the more advanced use cases of concurrent application design. That said, I still haven't really convinced myself it's even possible.

E. Sammer said...

@Francois:

I get what you're saying, but what synchronized doesn't cover is the higher level ordering between threads (at least not easily). For instance, if you have resource R1 and threads T1 and T2, you may want to make sure that T1 gets R1 *first*, then T2. With what we have now, this arguably simple case requires non-trivial lock management infrastructure and coordination. The synchronized keyword or simple ReentrantLock objects only serve to prevent two threads from touching the same thing at the same time. The happens-before relationship they give you only guarantees that one thread's turn completes before the other's begins; it says nothing about which thread goes first. You can accomplish this with things like barriers, latches, or conditional locks, but it's easy to get that kind of management wrong.
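As a minimal sketch of the R1/T1/T2 case using one of those tools, here's a CountDownLatch forcing T1 to touch the resource before T2 does. The OrderedAccess class and runOnce method are invented for the example, and a list stands in for R1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class OrderedAccess {

    // Returns the order in which the two threads touched the shared resource.
    static List<String> runOnce() {
        List<String> r1 = new CopyOnWriteArrayList<>();   // stands in for resource R1
        CountDownLatch t1Done = new CountDownLatch(1);    // opened once T1 is finished

        Thread t2 = new Thread(() -> {
            try {
                t1Done.await();          // block until T1 has had its turn
                r1.add("T2");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread t1 = new Thread(() -> {
            try {
                r1.add("T1");
            } finally {
                t1Done.countDown();      // always release T2, even on failure
            }
        });

        // Start T2 first on purpose: the latch, not start order, decides who wins.
        t2.start();
        t1.start();

        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return new ArrayList<>(r1);
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

It works, but the ordering lives in a latch that both threads have to know about and use correctly, which is exactly the kind of plumbing I'd like to declare rather than write.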

Maybe you're right about the protection being applied to the resources rather than blocks, but I do think we need a higher level construct for day-to-day work. Think of things like GridGain's @Gridify annotation (which is worth checking out if you haven't already done so).

Either way, you raise some interesting points. Thanks for commenting!