Tuesday, February 10, 2009

Class Categories in Java

Class categories have existed in a few incarnations over the years. My personal knowledge of comp sci history is thin, at best, but my small amount of research suggests that the idea came from Smalltalk-80. Certainly, my first exposure to class categories came from working with Objective-C on NeXTSTEP and later Mac OS X.

What a category does is relatively simple to understand. Basically, the idea is that a developer may take an existing class and, effectively, append methods to it without having access to the source code or subclassing it. This is best illustrated in code.

// A normal, and terribly boring class.
public class Foo {

  public void displayMessage(String message) {
    System.out.println("A useless message:" + message);
  }

}

// Extending the class by creating a category on it.
public class Foo (MyExtensions) {

  public String getDefaultMessage() {
    return "Hello world.";
  }

}

// ...and finally, what one would expect.
Foo f = new Foo();

f.displayMessage(f.getDefaultMessage());

Normally, when I explain class categories to Java developers, I barely get the words out before I'm hit with what you might expect. People question whether this breaks encapsulation, whether it bloats code, whether it creates tight coupling, whether it breaks access rules, and so on. Some of the more dynamic languages like Ruby and Perl will happily let you do things like this (albeit some more safely than others), but that happens at runtime.

I'm proposing we bring categories to Java. Yep. I said it. I'm going to focus more on how this might work rather than why I think it's a good idea, although I'll try and briefly address that too. Here's how I think it could work and why.

The Basics

The syntax would work like Objective-C's syntax. Creating a category would be done by specifying the same package and class definition, in the interest of simplicity, with the addition of a category name enclosed in parentheses (see the above example). It would not be legal to specify inheritance when creating a category; inheritance would always be defined by the original definition (i.e. the uncategorized class declaration), although there's no reason to prohibit the implementation of additional interfaces in a category. This would allow those creating categories on a class to extend an existing class to implement a new interface without modifying source code.

Access, Security, and Visibility

In many ways, the access and visibility rules of subclassing apply to categories, as the result is very similar from the perspective of the original class: the new functionality is unknown and untrusted.

It would only be legal to access public or protected members of a class when creating a category. This would respect access and visibility restrictions on code developed prior to the existence of categories. Overriding a method in a category would not be permitted, although method overloading is fine. Private members within a category would not be visible outside of the category.
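Since the category syntax does not exist, the closest thing that compiles today is a subclass. The sketch below uses one to illustrate the same rules: protected access is allowed, private access is not, overloading is fine, and a private helper stays hidden. All class and method names here (Account, AccountWithExtras, and so on) are hypothetical and only serve the illustration.

```java
// Today's closest analogue to the proposed rules: a subclass.
// Every name in this sketch is hypothetical.
class Account {
    private long balanceCents = 0;            // hidden from subclasses, and from a category

    protected void adjust(long deltaCents) {  // visible to subclasses, and to a category
        balanceCents += deltaCents;
    }

    public long getBalanceCents() {
        return balanceCents;
    }

    public void deposit(long cents) {
        adjust(cents);
    }
}

class AccountWithExtras extends Account {
    // An overload (new parameter list), not an override of deposit(long).
    public void deposit(long dollars, long cents) {
        adjust(toCents(dollars, cents));      // protected member: allowed
        // balanceCents += 1;                 // would not compile: private in Account
    }

    // Private helper: not visible outside this class, as with a category's private members.
    private static long toCents(long dollars, long cents) {
        return dollars * 100 + cents;
    }
}

class AccessRulesDemo {
    public static void main(String[] args) {
        AccountWithExtras account = new AccountWithExtras();
        account.deposit(250);      // original method
        account.deposit(3, 50);    // added overload
        System.out.println(account.getBalanceCents()); // prints 600
    }
}
```

The difference a category would make is that the overload would appear on Account itself, so existing code holding an Account reference could call it without a cast or a new type.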

Category Availability

The biggest differentiation between categories in Java and class reopening in Ruby, for instance, is that the contributions made to a class via a category would be known and could be checked for at compile time. This would allow developers to see and avoid cases of competing categories or member addition during development, which is usually not possible with languages that allow for this kind of functionality.

It would not be legal to create a category on a class declared as final. This extends the meaning of declaring a class final, but only slightly, since creating a category is similar in intention to subclassing (in theory). It also implies that there is no way to prevent the creation of categories while still allowing subclassing. There's no obvious reason to draw that distinction, as categories can only access the public and protected members of an existing class, just as a subclass would.

All external classes, both unrelated classes and subclasses of a class with categories, would see all of its members, including those defined in categories, as usual. The category information of a member should be made available via the standard reflection classes and methods. Given the above example Foo class, the following would work as expected.

/* Includes both methods from the original class declaration
 * as well as methods from categories.
 */

Method[] methods = Foo.class.getMethods();

/* Additionally, category information should be made available
 * via reflection.
 */

Category[] categories = Foo.class.getCategories();

for (Category category : categories) {
  System.out.println("Methods in category:" + category.getName());

  for (Method method : category.getMethods()) {
    System.out.println("method name:" + method.getName());
  }
}
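Neither Category nor getCategories() exists in java.lang.Class, so the snippet above cannot run as written. As a rough approximation in today's Java, the sketch below fakes the category metadata with a runtime annotation and groups methods by it using the real reflection API. The @InCategory annotation, the "(original)" label, and the methodsByCategory helper are all assumptions made for the illustration, not part of the proposal.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical marker standing in for real category metadata.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface InCategory {
    String value();
}

class Foo {
    public void displayMessage(String message) {
        System.out.println("A useless message: " + message);
    }

    // Stands in for the method the MyExtensions category would contribute.
    @InCategory("MyExtensions")
    public String getDefaultMessage() {
        return "Hello world.";
    }
}

class CategoryReflectionDemo {
    // Groups the methods declared on a class by their emulated category name.
    static Map<String, List<String>> methodsByCategory(Class<?> type) {
        Map<String, List<String>> result = new TreeMap<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                continue; // skip compiler- or tool-generated methods
            }
            InCategory tag = method.getAnnotation(InCategory.class);
            String category = (tag == null) ? "(original)" : tag.value();
            result.computeIfAbsent(category, key -> new ArrayList<>()).add(method.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        methodsByCategory(Foo.class).forEach((category, names) ->
            System.out.println("Methods in category " + category + ": " + names));
    }
}
```

An annotation only labels methods that already live in the class, so this shows what the reflection side might look like rather than how methods would be added from outside.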

Some Quick Reasons Why

There are a few nice advantages to having categories available in a language, especially at compile time. There are the obvious advantages such as simple code organization. What I tend to think is more interesting, though, is creating categories to apply specialized functionality to core classes. For instance, one may want to add methods to collections to glue validation logic to core components. In cases where Spring is used, it's not uncommon to see many adapter type objects that simply exist to make an object more amenable to participate in DI. I believe that a lot of code and class structures could be greatly simplified by being able to make minor alterations to existing classes rather than resorting to multiple objects to mediate or adapt existing code to new systems and frameworks.
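For comparison, here is a minimal sketch of the workaround that is typical today when validation logic needs to be glued onto collections: a static utility class that takes the collection as its first argument. The names CollectionValidation and allValid are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper class. With categories, allValid could instead be declared
// directly on a collection class such as ArrayList.
final class CollectionValidation {
    private CollectionValidation() {}

    // The "receiver" has to be passed explicitly as the first argument.
    static <T> boolean allValid(Collection<T> items, Predicate<? super T> rule) {
        for (T item : items) {
            if (!rule.test(item)) {
                return false;
            }
        }
        return true;
    }
}

class ValidationDemo {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("alice", "bob", "");
        boolean ok = CollectionValidation.allValid(names, name -> !name.isEmpty());
        System.out.println(ok); // prints false: the last name is empty
    }
}
```

The call site reads CollectionValidation.allValid(names, rule) rather than names.allValid(rule), which is the kind of indirection the proposal is trying to remove.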

Like anything else, there is the obvious ability to abuse something like categories. I think there are times when more traditional approaches are the best option, and there's no replacement for good design and education, but to remove a valuable tool because some subset of the population may misuse it only serves to hurt those that could make proper use of it.

My plan is to attempt to draft this as a JSR and submit it for review. I don't know if I have the ability to chew through the politics (I'm assuming there are some) attached to that, but it might be fun to try. Clearly there's more to work out (I haven't looked deeply into what this does to the compiler and runtime at a low level, for instance), but I'm interested in what people think about categories in Java.

6 comments:

Anonymous said...

Have a look at a Smalltalk browser and you will see that more is required: Class categories and METHOD CATEGORIES. Would be easy to extend Java by using special annotations used by the IDE.

See also http://www.bergner.se/protocols/ for eclipse and the discussion at
http://forums.java.net/jive/thread.jspa?threadID=182

Anonymous said...

Take a look at the way C# 3.0 (shipped with .NET 3.5) did this: They call it extension methods and I find it extremely convenient. The differences are that you cannot access protected members, but you can add extension methods to interfaces, which is pure gold: you can declare very simple interfaces that provide only the core functionality but define all kinds of convenience methods in the extension methods, thus delivering both a rich interface API and the ease of implementing the interface. In fact, in C#, extension methods are just a compiler trick to call static methods in a different class that have the "this" argument as the first argument - simple and yet very effective.

Anonymous said...

An interesting idea. I like it. I can think of lots of reasons where they could be useful.

Many times in a project you are working with the core libraries but in very specific ways (dates in a certain format, BigDecimal scale 2, a Collection with a tiny piece of extra functionality....)

What I've seen is lots of inheritance (which just calls out for bugs), composition with tons of methods that just delegate, or stupid utility classes that are just a hodgepodge of junk.

Imagine how nice it would be to add a couple static factories to the date formatter class or BigDecimal class. Google Collections could have added to the Collections class directly via Categories rather than creating Collections2. (assuming these classes are not final which I haven't checked).

Something else to think through... Could you make a legacy class work with newer Java constructs via Categories? (i.e. for-each loop on Enumerations)

BTW. I don't think Categories are the same as extension methods as the other anon poster suggested.

Larry Singer said...

This is also called a mixin.

Anonymous said...

Take a look at AspectJ to see what is technically possible now ...

Blair Zajac said...

By the way, Scala provides a way of doing this by having implicit methods convert a normal class into a rich class with the additional methods. The idiom is called "pimp-my-library."

Pimp my Library by Odersky,
Pimp my Library - Wiki,
Scala can make your Java Objects look smarter.