Tuesday, February 24, 2009

Principles of Architecture - Reduce and Simplify

Just a few days ago, I had a weekly one-on-one meeting with my boss. It's at times like that that work becomes kind of like a game of "Can I go a full hour without putting my foot in my mouth?" Turns out, I came out unscathed this time.

Recently, around the office, we've been talking a lot about the principles behind the agile manifesto. Pav (aka John Pavley, CTO) mentioned that I probably had a similar list of principles of software architecture I operate by. He pointed out that I hadn't really vocalized what those are in any obvious way and that doing so would probably be beneficial or at least interesting. Thinking about what those principles are and needing to actually enumerate them also helps me think about what's really important and why.

Before we get into the first principle I want to discuss, I want to clarify why I'm using the term principle rather than, well, anything else. During our discussions of the principles of the agile manifesto, we used this word because, like what I hope to describe in software architecture, those items are considered intrinsic properties of software development. As Pav would say, they're discovered, not invented. I tend to think he's right and that the choice of the word principle is deliberate and intentional. This is also my intention here.

One of the first two principles that came to mind was the idea of reduction and simplicity. When designing software, we strive to reduce or eliminate complexity wherever possible. There are times where a task is inherently complicated, but the design of the system need not necessarily be complicated. If that sounds counter-intuitive, consider the separation of the system's architecture - the way it behaves, the layering, the major objects in play, the way it interacts with constituent systems or resources, its fault semantics, and so on - from its implementation. In many cases, what you'll find is that the implementation may have some inherent level of complexity to meet the business requirements, but that a well-designed system is almost obvious and easily fits in your brain without confusing you. Let's consider something concrete.

If you were to design an SQL query execution engine... I don't even need to finish that sentence for it to sound scary. Take a few minutes to think about how you might design a query execution engine. In five or ten minutes you might actually be able to work out a simple model that makes sense (within reason). The details of how to implement that design are where one gets into the shady business of making the magic happen. Even the design of a compiler is simple enough, in most cases, whereas the implementation is where the complexity lies. A compiler will have a grammar, a lexer, a parser, probably an AST, a chain of optimization strategies, maybe a number of output generation strategies. Within each of those major components, you could break things down further and come up with an easily understood design for a modular compiler. I'm not trying to trivialize building a compiler (try implementing a C++ compiler some day) but I do think that with some thought, the design process would produce a reasonable, intuitive result.
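To make the compiler example a little more tangible, here is a minimal sketch of that kind of staged design in Java. Everything in it is hypothetical and invented for illustration (the TinyCompiler, Node, Num, and Add names, and the toy grammar that only understands sums such as "1 + 2 + 3"), and evaluation stands in for real output generation. The point is only that each stage is small enough to hold in your head.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy "compiler" for sums such as "1 + 2 + 3".
// The names and the grammar are assumptions made for illustration.
final class TinyCompiler {

  // AST: either a number or the sum of two sub-expressions.
  interface Node { int eval(); }

  static final class Num implements Node {
    final int value;
    Num(int value) { this.value = value; }
    public int eval() { return value; }
  }

  static final class Add implements Node {
    final Node left;
    final Node right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    public int eval() { return left.eval() + right.eval(); }
  }

  // Lexer: source text -> tokens (simply whitespace-separated here).
  static List<String> lex(String source) {
    List<String> tokens = new ArrayList<>();
    for (String token : source.trim().split("\\s+")) {
      if (!token.isEmpty()) tokens.add(token);
    }
    return tokens;
  }

  // Parser: tokens -> AST for the grammar  expr := NUMBER ('+' NUMBER)*
  static Node parse(List<String> tokens) {
    if (tokens.isEmpty()) throw new IllegalArgumentException("empty input");
    Node tree = new Num(Integer.parseInt(tokens.get(0)));
    for (int i = 1; i < tokens.size(); i += 2) {
      if (!tokens.get(i).equals("+") || i + 1 >= tokens.size()) {
        throw new IllegalArgumentException("expected '+ NUMBER' at token " + i);
      }
      tree = new Add(tree, new Num(Integer.parseInt(tokens.get(i + 1))));
    }
    return tree;
  }

  // Output generation is replaced by direct evaluation of the tree.
  static int compileAndRun(String source) {
    return parse(lex(source)).eval();
  }

  public static void main(String[] args) {
    System.out.println(compileAndRun("1 + 2 + 3"));
  }
}
```

A real compiler would put far more behind each of these stages, but the overall shape (lex, parse, transform, generate) stays easy to explain.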

The point I'm trying to make is that a well designed system should be intuitive to the person or team implementing the system as well as the architect. If you find it difficult to communicate a design, there's a high chance that the implementation of that design will not make things any simpler. In fact, it's probably impossible. To be clear, some things have an inherent degree of complexity, but we should always strive for the simplest, but still most complete, design possible.

These are all relative terms: simple, complex, intuitive, complete. You'll always have to rely on your judgement, experience, and best practices of the trade. By properly deconstructing an application, its components, their components, and so on, even the most complex system can be easily understood and digested.

Some techniques I find useful for making this happen are:

  • Apply Divide and Conquer. Break down components recursively until you get to easy to understand units of functionality.
  • Never work alone. Statistically, you're more than likely to be surrounded by people who can contribute experiences and ideas during the design process that will yield a better result. The added benefit here is that you're constantly having to explain your thought process and ideas to other humans; the degree of complexity is probably proportional to the number of times a junior developer says "Huh?"
  • Follow patterns and best practices. Silly questions like "Does your class do one thing and do it well?" have saved me from my own cleverness more than once (but admittedly not always).
  • Trust your gut. If it sounds too complicated, it probably is. Take a break, look at similar problems, ask around, and try a different approach.

There's no complexity blasting ray gun of designly awesomeness. There's no third party library that you can just drop in to make it simple. Sometimes, the business requirements are as tough as they sound. Most of the time though, you can reduce and simplify.

Tuesday, February 10, 2009

Class Categories in Java

Class categories have existed in a few incarnations over the years. My personal knowledge of comp sci history is thin, at best, but my small amount of research suggests that the idea came from Smalltalk-80. Certainly, my first exposure to class categories came from working with Objective-C on NeXTSTEP and later Mac OS X.

What a category does is relatively simple to understand. Basically, the idea is that a developer may take an existing class and, effectively, append methods to it without having access to the source code or subclassing it. This is best illustrated in code.

// A normal, and terribly boring class.
public class Foo {

  public void displayMessage(String message) {
    System.out.println("A useless message:" + message);
  }

}

// Extending the class by creating a category on it.
public class Foo (MyExtensions) {

  public String getDefaultMessage() {
    return "Hello world.";
  }

}

// ...and finally, what one would expect.
Foo f = new Foo();

f.displayMessage(f.getDefaultMessage());

Normally, when I explain class categories to Java developers, I barely get the words out before I'm hit with what you might expect. People question whether this is breaking encapsulation, if this bloats code, if it creates tight coupling, if it breaks access rules, and so on. Some of the more dynamic languages like Ruby and Perl will happily let you do things like this (some more safely than others), but that happens at runtime.

I'm proposing we bring categories to Java. Yep. I said it. I'm going to focus more on how this might work rather than why I think it's a good idea, although I'll try and briefly address that too. Here's how I think it could work and why.

The Basics

The syntax would work like Objective-C's syntax. Creating a category would be done by specifying the same package and class definition, in the interest of simplicity, with the addition of a category name enclosed in parentheses (see the above example). It would not be legal to specify inheritance when creating a category; inheritance would always be defined by the original definition (i.e. the uncategorized class declaration), although there's no reason to prohibit the implementation of additional interfaces in a category. This would allow those creating categories on a class to extend an existing class to implement a new interface without modifying source code.

Access, Security, and Visibility

In many ways, the access and visibility rules of subclassing apply to categories, as the result is very similar from the perspective of the original class: the new functionality is unknown and untrusted.

It would only be legal to access public or protected members of a class when creating a category. This would respect access and visibility restrictions on code developed prior to the existence of categories. Overriding a method in a category would not be permitted, although method overloading is fine. Private members within a category would not be visible outside of the category.
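For comparison, the nearest thing available in Java today is a static helper class that takes the receiver as an explicit argument. The sketch below is only an approximation: FooCategorySketch and FooExtensions are hypothetical names, and the compiler does not enforce the rules proposed here. The helper simply restricts itself, by discipline, to Foo's public API and adds a new signature rather than replacing an existing method.

```java
// Hypothetical stand-in for a category, written in today's Java.
final class FooCategorySketch {

  // The original class from the earlier example, unchanged.
  static class Foo {
    public void displayMessage(String message) {
      System.out.println("A useless message:" + message);
    }
  }

  // Plays the role of Foo (MyExtensions): static methods that take the
  // receiver explicitly and call only Foo's public methods.
  static final class FooExtensions {
    private FooExtensions() {}

    static String getDefaultMessage(Foo foo) {
      return "Hello world.";
    }

    // Adds a new signature, much like an overload would; it does not
    // replace Foo.displayMessage(String).
    static void displayMessage(Foo foo) {
      foo.displayMessage(getDefaultMessage(foo));
    }
  }

  public static void main(String[] args) {
    Foo f = new Foo();
    FooExtensions.displayMessage(f);
  }
}
```

The cost of this workaround is the call site: FooExtensions.displayMessage(f) instead of f.displayMessage(f.getDefaultMessage()), which is exactly the friction a category would remove.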

Category Availability

The biggest differentiation between categories in Java and class reopening in Ruby, for instance, is that the contributions made to a class via a category would be known and could be checked for at compile time. This would allow developers to see and avoid cases of competing categories or member addition during development, which is usually not possible with languages that allow for this kind of functionality.

It would not be legal to create a category on a class declared as final. This extends the meaning of declaring a class to be final, but only slightly, as creating a category is similar in intention to subclassing (in theory). It also means there is no way to prevent the creation of categories while still allowing subclassing. There's no obvious reason to draw that distinction, since categories can only access public and protected members of an existing class, just as a subclass would.

External classes, whether unrelated classes or subclasses of a class with categories, would see all of its members as usual, including those defined in categories. The category information of a member should be made available via the standard reflection classes and methods. Given the above example Foo class, the following would work as expected.

/* Includes both methods from the original class declaration
 * as well as methods from categories.
 */

Method[] methods = Foo.class.getMethods();

/* Additionally, category information should be made available
 * via reflection.
 */

Category[] categories = Foo.class.getCategories();

for (Category category : categories) {
  System.out.println("Methods in category:" + category.getName());

  for (Method method : category.getMethods()) {
    System.out.println("method name:" + method.getName());
  }
}
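
The Category type and getCategories() above are part of the proposal and do not exist in Java. As a point of reference, here is a small sketch of what reflection reports for the original Foo class today, using only the existing java.lang.reflect API. The ReflectionTodaySketch class and its helper method names are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical wrapper class; only the java.lang.reflect calls are real API.
final class ReflectionTodaySketch {

  static class Foo {
    public void displayMessage(String message) {
      System.out.println("A useless message:" + message);
    }
  }

  // Every public method visible on the type, including those inherited
  // from java.lang.Object.
  static Set<String> allPublicMethodNames(Class<?> type) {
    Set<String> names = new TreeSet<>();
    for (Method method : type.getMethods()) {
      names.add(method.getName());
    }
    return names;
  }

  // Only the public methods the type declares itself.
  static Set<String> declaredPublicMethodNames(Class<?> type) {
    Set<String> names = new TreeSet<>();
    for (Method method : type.getDeclaredMethods()) {
      if (Modifier.isPublic(method.getModifiers()) && !method.isSynthetic()) {
        names.add(method.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println("declared: " + declaredPublicMethodNames(Foo.class));
    System.out.println("all public: " + allPublicMethodNames(Foo.class));
  }
}
```

Under the proposal, methods contributed by a category would show up in the getMethods() result alongside these, with the category they came from available separately.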

Some Quick Reasons Why

There are a few nice advantages to having categories available in a language, especially at compile time. There are the obvious advantages such as simple code organization. What I tend to think is more interesting, though, is creating categories to apply specialized functionality to core classes. For instance, one may want to add methods to collections to glue validation logic to core components. In cases where Spring is used, it's not uncommon to see many adapter type objects that simply exist to make an object more amenable to participate in DI. I believe that a lot of code and class structures could be greatly simplified by being able to make minor alterations to existing classes rather than resorting to multiple objects to mediate or adapt existing code to new systems and frameworks.
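
As a rough illustration of the adapter-type objects mentioned above, here is what today's Java requires to hand a Foo to code that expects a java.lang.Runnable. AdapterSketch and FooRunnableAdapter are hypothetical names, and Runnable is chosen only because it is a familiar standard interface. With categories as proposed, Foo could instead gain the interface directly and the extra class would disappear.

```java
// Hypothetical example of an adapter that exists only to fit an existing
// class to an interface it does not implement.
final class AdapterSketch {

  // The original class, which knows nothing about Runnable.
  static class Foo {
    public void displayMessage(String message) {
      System.out.println("A useless message:" + message);
    }
  }

  // The adapter: one more class and one more object per Foo, purely so
  // that a Foo can be passed where a Runnable is expected.
  static final class FooRunnableAdapter implements Runnable {
    private final Foo foo;
    private final String message;

    FooRunnableAdapter(Foo foo, String message) {
      this.foo = foo;
      this.message = message;
    }

    @Override
    public void run() {
      foo.displayMessage(message);
    }
  }

  public static void main(String[] args) {
    Runnable task = new FooRunnableAdapter(new Foo(), "Hello world.");
    task.run();
  }
}
```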

Like anything else, there is the obvious ability to abuse something like categories. I think there are times when more traditional approaches are the best option, and there's no replacement for good design and education, but to remove a valuable tool because some subset of the population may misuse it only serves to hurt those who could make proper use of it.

My plan is to attempt to draft this as a JSR and submit it for review. I don't know if I have the ability to chew through the politics (that I'm assuming are) attached to that, but it might be fun to try. Clearly there's more to work out (I haven't looked deeply into what this does to the compiler and runtime at a low level, for instance), but I'm interested in what people think about categories in Java.