Friday, January 9, 2009

Of Maven Dependencies and Repositories

I think of Maven the same way I tend to think of Git: excellent features, but just a little more complicated and obtuse than is really reasonable for the task. I know that's insanely unpopular to say about Git, but luckily this isn't about Git.

Recently, I was converting a project at work from Ant to Maven as an experiment. This is a relatively standard, mid-sized Java project that makes heavy use of a number of what I would consider common Java libraries. In our case, we use Spring very heavily, along with other staples like Hibernate. One of Maven's killer features is the ability to resolve dependencies and pull the correct versions from the central Maven repository, but we already know that.

I found the selection of dependencies from the central repository to be one of the worst things I've had to do in recent days. I was spending more time setting up different repositories and wading through the duplicate packages than I was enjoying the benefits of such features. It's almost more of a headache than doing it all by hand.

OSGi bundles, which have at least some similarities to Maven in how dependencies are specified, can express dependencies in a few ways. The obvious unit of dependency is another bundle. This is, effectively, the same as Maven. OSGi bundles, though, may also opt to specify only which Java packages the bundle imports and let the runtime figure out which bundle to take those packages from. This is similar to how many Linux package managers operate when more than one package can fulfill a dependency; there it's traditionally referred to as a virtual dependency (or virtual package).

Maybe what we need from Maven is the notion of the virtual dependency. A Maven POM could specify virtual packages as dependencies that could be filled by any one of a number of providers. Java lends itself to this very well because most of its standards define only the API, with the implementations (service providers) distributed separately. Think of things like JPA (provided by Hibernate EntityManager), JAXP (Xerces and friends), and so on. I suppose it's a little different because Java developers want to pick an implementation for a specific reason, but having virtual dependencies would eliminate many of the overly specific dependency graphs created when dealing with complex packages such as Spring, for instance.

It's worth noting that the most significant issue I have with Maven is the quality of the metadata. It is just plain awful. Some of the things I ran into were:

  • Packages that weren't updated with bug fixes or recent versions
  • Many copies of the same package with different names and odd discrepancies in versions
  • Missing (or unavailable) dependencies
  • When using Spring's Maven repositories, duplicates of the dependencies are pulled in because Spring depends on versions not in the central repository.
  • Even though I pulled Spring from Spring's repository, packages like GridGain, which depend on Spring, grab the version from the central repository, while Spring Integration, which is only available from Spring's repository, depends on the version of Spring from Spring's repository... AARRRRRRRRRRRRGGGGGGGGGGHHHHH!

I get that this is hard and it requires a lot of coordination. I get that I could repackage things in my local repository or a corporate shared repository. Should I have to? A lot of the advantage of Maven is lost when one has to manually follow dependencies to figure out why there are two (full) versions of Spring Core in the project. It's annoying, wasteful, and prone to error.

Maven, I want to like you. Really I do. But like a real, live, flesh and blood human, you make it so difficult sometimes, just like your sister (Git).

7 comments:

Nadeem Bitar said...

Advice from someone who has used Maven for a long time: don't use it. You will end up spending a lot of time fixing bad POMs, maintaining a local repository (since not everything is in a public repository), and fighting with buggy released plugins where a fix is available but is never released.

E. Sammer said...

That's what I'm afraid of. Why is it that we get decent ideas and poor implementations? :sigh:

Alex Miller said...

Have you looked at the "provided" dependency scope in Maven?

It might solve some of your problems. You can say that a dependency is "provided" at runtime but needs to be pulled in at compile time. It's good for stuff that's run in a container.

You might also want to look at transitive dependency exclusion to avoid pulling in jars required by jars.
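As a rough sketch of those two suggestions, a POM fragment might look like the one below. The `com.example:some-library` coordinates are hypothetical stand-ins for any library that drags in its own copy of Spring, and the Spring artifact named in the exclusion is an assumption; the coordinates to exclude depend on which repository the unwanted copy comes from.

```xml
<dependencies>
  <!-- "provided" scope: on the classpath for compiling and testing,
       but expected to be supplied by the container at runtime,
       so it is not packaged with the application. -->
  <dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>servlet-api</artifactId>
    <version>2.5</version>
    <scope>provided</scope>
  </dependency>

  <!-- Hypothetical library that transitively depends on Spring.
       The exclusion stops Maven from pulling in that transitive copy,
       leaving only the Spring version declared elsewhere in the POM. -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>some-library</artifactId>
    <version>1.0</version>
    <exclusions>
      <exclusion>
        <groupId>org.springframework</groupId>
        <artifactId>spring</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

An exclusion matches on groupId and artifactId only, so when the same library is published under two different names, each name has to be excluded separately.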

I agree with you that Maven can be deeply frustrating. Dependency management is both its strength and its weakness.

E. Sammer said...

@Alex

I haven't wrapped my head around the scoping support and what it means to each plugin yet. The docs, in my Maven-newbie opinion, are unfriendly.

Anyway, thanks! This will certainly help. I'm determined to give Maven a fair shot (before I launch it directly into the sun in a rocket).

Nadeem Bitar said...

If you decide to use Maven, I have found that using mvn dependency:analyze is the only way to keep a complex project's POM working.
But unfortunately that's where you find all the broken third-party POMs that you end up having to file bugs for, or fix yourself.

But even if you have a perfectly working POM, you still have to suffer with Maven plugins. For example, we had to patch the Maven SCM plugin and build our own version in order for the Maven Release plugin to work with Git. Every time we need something that is not supported by Maven, we have to take a detour in order to make Maven work with it. We have a module that uses Groovy, and it took us 3 days to patch and fix the plugin to bundle and compile our Groovy code. Once you commit to Maven, you'll soon start worrying about using new stuff.

Also, it takes some time for some projects to push their latest releases to a public Maven repository, so in order to upgrade you have to either put the release in your local repo or wait.

I currently have a branch where I'm trying to port our complex build to Buildr. I am really hoping that Buildr gains some traction, since it's a joy to work with compared to Maven.

I also tried Gradle, which makes Maven a little bit better but is also a little immature at this point.

I highly encourage you to look at those projects since they might solve your problem and offer a better long term solution.

David Savage said...

At the risk of coming across as a hijacker it sounds like you may be interested in http://sigil.codecauldron.org.

Basically, Sigil is a set of tools for doing OSGi development using the same semantics at compile time as OSGi enforces at runtime, i.e. import package and require bundle. It's built on top of Ant, Ivy, and Eclipse at the moment, though we may support Maven and other IDEs in the long term.

It's still early days for the project but I'm using it day to day on projects for my company and it's been used to compile a project with over 600 modules which was an absolute doddle to set up.

E. Sammer said...

@David

We've spoken, somewhat indirectly, in the past about OSGi; I'm a fan of OSGi in general and the work you and Paremus are doing in that area.

I'm happy to have people "hijack" if it means I don't have to fight with tools anymore. I'm interested in the idea of using OSGi metadata at compile time. It just makes sense. I'll definitely take a look at Sigil.

I think Maven is a good tool, but I find it cumbersome. Maybe it's just because I'm attempting to retrofit it onto an existing project. I'm sure I'll continue to bark up that tree for a while, but maybe not with this project.

Thanks again for the heads up and stay in touch!