Thursday, September 27, 2007

Tired of the Fighting

Lately, I've been reading a lot of blogs. There's such an amazing wealth of really smart people out there with interesting opinions. But, then there's everyone else. I'm getting really tired of the mudslinging about languages, methodologies, and frameworks.

Really.

I can't count the number of Ruby vs. Java[1] posts I've read in the last 48 hours. Granted, it's an interesting discussion, but it's definitely not the end-all, be-all debate our industry is seeing. More to the point, the amount of misinformation and TL;DR is astounding. Is it true that, for some people, sounding smart is more important than being smart? That makes me sad.

I have a hard time imagining how one can get so entrenched in something as to think there's nothing beyond the walls that box them in. I can only liken it to what it must be like to have some kind of profound spiritual experience[2]: it was so life altering, and you forget where you came from so quickly, that the experience doesn't mean anything anymore - it's so drastic, you are no longer what you were before.

Let's face it; there will be countless languages, frameworks, platforms, and methodologies that will come and go during our careers. We should be so lucky! Why on Earth would you shut yourself off from something to such a degree?

Ideally, we all take the requisite time to evaluate everything prior to passing judgement. I'm sure we aren't all so pure of heart, mind, and principle. There are so many things we don't like about the things we see, and they're easy to point out, too. That said, I have never met anyone who had nothing left to learn (even if they wouldn't agree). There is something good about 99.9999% of the things under heated debate. The things that aren't debated usually aren't debated because everyone already agrees they're either great or terrible.

I'll settle it (to my own satisfaction) for you. The following is for comedic value.

  • Ruby needs more IDE support
  • Java documentation is rather opaque at times
  • Ruby's syntax is inconsistent, at times
  • Java doesn't let programmers express themselves like the true artists they are
  • Ruby doesn't have as much commercial backing and support
  • Java has been molested by commercial backing and support
  • Ruby-ists can, at times, be arrogant, claiming to have invented some ideas that have been around for a long time
  • Java-ists can, at times, be arrogant, claiming every other language to be toy
  • Ruby is slow
  • Java is a memory hog

Are you getting the point? Hint: If you caught yourself saying "but wait... Eclipse has Ruby support now" or "but wait... Java has an excellent community," you need to start from the beginning. Go ahead. I'll wait...

Good. Feel better?

If you write Java for a living, go (yes right now) to this site and find something interesting. If you write Ruby for a living, go read up on all the neat stuff you can do in Hibernate.

So there you have it. Now don't make me reach back there. You don't want me to stop this car.

...and if I hear one more thing about how you're an idiot if you do / don't use git / subversion / svk, I will find you and force you to go back to using RCS[3]. Try me.

[1] - At my day job, I write neither Java nor Ruby. I like both for different reasons. I have no real stake in this debate.

[2] - I'm not trying to disparage those that are spiritual or those that have had any kind of spiritual awakening. If you're upset by this, you're reading too deeply.

[3] - If you actually like using RCS for source code control, I'm more than happy to have offended you by this.

Wednesday, September 26, 2007

Version Control and Release Management With SVK

SVK is self-described as a decentralized version control system built with the robust Subversion filesystem. Possibly more important (although who doesn't like words like decentralized and robust?) are some of the features it supports:

  • Repository mirroring
  • Disconnected (i.e. network-less) operation
  • History-sensitive merging

There are other features it supports, such as patch generation, but those above are the biggies. At this point, those in the open source development community have heard about SVK and I wouldn't claim that it's news, per se. What is interesting is that it works.

For the last year or so, I have been primarily responsible for managing the software release process at my place of employment. This, for me, entails coordination of release cycles (weekly, for us), merging of bug fixes from other developers, and branch management. We used to be a CVS shop, but at some point, I pushed hard to get everyone onto Subversion for reasons that should be obvious (if they're not, rest assured they were valid). Our release process was not terribly complicated and was mostly informal, but SVK made it much easier to maintain. Here it is...

  1. Work in the dev branch - //rep/branches/dev
  2. Developers merge stable code (by their definition) to //rep/trunk
  3. Every 7 days, a Release Manager creates a stable branch from trunk - svk cp //rep/trunk //rep/branches/stable_yyyymmdd
  4. QA team tests stable_yyyymmdd while devs continue to work in //rep/branches/dev and integrate changes into //rep/trunk.
  5. QA team gives the stable branch their blessing
  6. Release Manager blesses the stable branch into a release by making a copy - svk cp //rep/branches/stable_yyyymmdd //rep/branches/release_yyyymmdd (where the dates are the same)
Points of interest:

  • //rep/trunk is where devs can do integration testing
  • //rep/branches/stable_yyyymmdd is where QA and executives can preview what is coming
  • //rep/branches/release_yyyymmdd is safe to deploy to new machines at any given time
  • //rep/branches/release_yyyymmdd is copied from the stable branch, so if bug fixes land in the stable branch during that testing, the release branch is unchanged and still safe for re-deployment
  • devs can cut personal or per-project branches off of //rep/branches/dev should they need to do so, and manage them themselves

What SVK brings to the table, and what many see as one of its killer features, is smart merging. SVK remembers the source of a branch and the last merge point, which means that if I want to bring all outstanding changes from the dev branch into trunk, I can do:

svk smerge          \
  -m "Merging everything from dev into trunk with original log messages" \
  -l                \
  --to //rep/trunk

The -l option tells SVK to include the original log messages (in addition to the message specified by -m) and the --to <rep> specifies the repository location to merge changes to.

If bug fixes need to be applied, developers merge them from dev to trunk, then request that those changes be merged to whatever the appropriate stable branch is for testing by QA, and eventually the release. As time goes on, we kill off old branches to keep things tidy, but we can always go back and resurrect them should we need to do so.
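As a rough sketch, pushing a fix out to the current stable branch looks much like the merge above (same hypothetical branch names, and assuming, as in the previous example, that the command is run from a working copy of the source branch - in this case //rep/trunk):

svk smerge          \
  -m "Merging bug fix from trunk into the current stable branch" \
  -l                \
  --to //rep/branches/stable_yyyymmdd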

SVK has significant advantages when traveling or working in an area where the Subversion repository is unavailable, as well. This is probably the other major killer feature. Because SVK mirrors the Subversion repository and keeps it in a local repository, you can continue to commit changes when you don't have access to the core Subversion repository. When you get back to a trusted location, you can instruct SVK to synchronize all changes.
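For the sake of illustration, the disconnected workflow looks roughly like this (the URL and depot paths here are hypothetical, and svk push is run from within the local branch's working copy):

# Mirror the remote Subversion repository into the local SVK depot (one time)
svk mirror http://svn.example.com/rep //mirror/rep

# Pull down new revisions whenever the network is available
svk sync //mirror/rep

# Create a local branch of the mirror and check it out to work on offline
svk cp -m "Local working branch" //mirror/rep/trunk //local/rep/trunk
svk checkout //local/rep/trunk work

# ...hack and commit to //local/rep/trunk as often as you like, offline...

# Once reconnected, merge the local commits back up to the mirrored repository
svk push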

While still clunky in a few places, SVK brings some fantastic features to the table with its smart merging and distributed operation. Its patch generation features as well as the ability to synchronize to an external repository make it the perfect tool for customizing or working on open source projects without commit access. At some point, it would be great to see these kinds of features make it back into Subversion, but until then, SVK does what was otherwise a real pain and it does it well. Check it out.

SVK Wiki

Many thanks to Chia-liang Kao and Best Practical for all of their work and support of a great project.

Tuesday, September 25, 2007

What a Senior SysAdmin Should Know

Prior to taking on the System Architect role where I am, my title was Manager of System Administration. I've worked in system administration enough to know about all the things I don't know. Sadly, I've also learned a lot about what does not a sysadmin make while interviewing people.

In the Linux world, there are a growing number of people who call themselves system administrators. Technically, everyone who sits behind a computer administers it to some degree. The question, in the case of those who want to make it their livelihood, is: to what extent do you know your field? Granted, these aren't the days of punch cards, nor do you have to be an electrical engineer to touch a computer (put down that soldering iron, if you please), but you do have to know your stuff, to be sure. But what stuff, and to what degree? Fair enough.

Networking Concepts and Protocols

This is a hard and fast requirement; you must understand basic networking concepts and common protocols. It's that simple. You must be able to explain how TCP/IP works (if SYN, SYN/ACK, ACK doesn't mean anything to you, you're in trouble here). You must know the difference between stateful and non-stateful firewalls. You must know what an MTU is and what changing it means. You must know what happens if a DNS response packet is larger than 512 bytes. You must understand, conceptually, how routing protocols work. This stuff isn't brain surgery (recently, I heard that brain surgeons joke that brain surgery isn't brain surgery), but it's a bit dry. You absolutely cannot be a (structural) architect if you can't do basic math.
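To make that concrete, this is the sort of thing a candidate should be able to read and explain without blinking (the interface and domain names below are just examples):

# Watch a TCP three-way handshake (SYN, SYN/ACK, ACK) on the wire
tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn|tcp-ack) != 0'

# A DNS response over UDP larger than 512 bytes comes back truncated (TC flag
# set), and the resolver is expected to retry the query over TCP
dig example.com ANY
dig +tcp example.com ANY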

Linux as an Operating System

Should I need to say that you have to know about Linux-ish things to be a Linux sysadmin?

You must know as much as you can stand about the Linux kernel; building one, updating (more on tools like package managers later) kernels, where modules live, when they're loaded, why, how, how to manually load and unload them, and other exciting stuff. You must understand the Linux boot process and init. You must know about the boot managers and how they differ (lilo, grub, etc.). You must be able to write, modify, and maintain shell scripts (and understand concerns like quoting errors, for $deity's sake). You must understand basic system libraries such as (g)libc, and things like PAM (trust me, you'll thank me later). You must know how to tunnel everything under the sun over ssh. You must understand the interesting file systems like ext{2,3}fs (and you must understand what that {2,3} syntax means), udev / devfs, tmpfs, and especially proc. You must understand how to modify file system options (hint: tune2fs).
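A handful of the commands behind those points, purely for the sake of illustration (the module and device names are just examples):

# Kernel modules: see what's loaded, inspect one, load it, and unload it again
lsmod
modinfo e1000
modprobe e1000
rmmod e1000

# File systems: inspect and tune ext2/ext3 options on a block device
tune2fs -l /dev/sda1        # show the current superblock settings
tune2fs -c 30 /dev/sda1     # force an fsck every 30 mounts

# proc: kernel and process state exposed as plain files
cat /proc/cpuinfo
cat /proc/sys/kernel/hostname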

Security

You cannot survive as a system administrator without a better than average understanding of security. No excuses.

You must be able to read and interpret bug reports and know about things like the CVE list. You need to understand system hardening. You need to know iptables / netfilter. You need to know the security implications around certain bits that can be twiddled in /proc. You must understand common attacks and know how to mitigate them. You must understand what IDSs and NIDSs are. You need to understand, at least conceptually, cryptography and how it applies to security (hint: lolzEncryptEverything!!!111 doesn't count as a valid answer). You must understand PKI. You must know how to respond to security incidents.
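By way of example, this is the kind of thing I'd expect someone at this level to be comfortable writing and explaining. It's a minimal sketch, not a complete policy; the rules and /proc knobs are just illustrations:

# A minimal stateful firewall: drop inbound traffic by default, allow
# established connections and inbound ssh
iptables -P INPUT DROP
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT

# A couple of the bits worth twiddling in /proc: ignore broadcast pings
# and enable reverse-path filtering
echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter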

A Bit About Tools

Who doesn't like things that make our lives easier? If I were to read this (and admittedly, even as I type it), I would think that this sounds like a lot of stuff that is made easier by tools provided by vendors, open source projects, or even distros themselves. That is entirely true. There are RPMs or DEBs of kernels, so why should you care where modules live or how the system boots? Because it's your job, and knowing what tools do and how they work will not only help you, it will bail you out of trouble. I think projects like Webmin, Firewall Builder, and the like are great, but you still must know what they do and the concepts behind them before you can really understand how to use them properly. I'm guessing that the authors of those projects would agree.

This isn't about being elitist, but about doing what we do, and doing it well. Take everything as an opportunity to learn and grow, and you'll quickly find that it's much easier to succeed when you know rather than when you guess correctly.

Monday, September 24, 2007

Apache ActiveMQ and Perl

I write an awful lot of Perl code as part of my day job. In fact, I've been writing Perl for quite a while now (about ten years). Perl is one of those languages that does a lot of what it says it does. It's Perl; no more, no less. I tend to use Perl as if it were a very strict, strongly typed, object-oriented language and don't get very Perl-ly with it. I like encapsulation, accessor methods, design patterns, and other things of that sort. I suppose, in the end, I'm not really a Perl guy.

Recently, I've been working on a project that aims to decouple systems via messaging. Not wanting to necessarily build a messaging infrastructure from the ground up, I went in search of something to do the work for me. I'm a fan of the JMS feature set and wanted to find a way to bring such a thing to Perl. The Apache project has developed a JMS compliant message broker called ActiveMQ. ActiveMQ, as you might expect of a JMS broker, is implemented in Java and supports a few different wire protocols. One of these protocols is called the Streaming Text Oriented Messaging Protocol, or more succinctly, Stomp. The idea behind Stomp seems to be to provide an interoperable wire format so that any of the available Stomp Clients can communicate with any Stomp Message Broker to provide easy and widespread messaging interop among languages, platforms and brokers. That's truth in advertising.

As with most things, someone else thought of talking to JMS brokers from Perl long before I did, and has manifested this as the Net::Stomp Perl module on CPAN. I, for one, am grateful.

For Perl monkeys, it's about as straightforward as it gets.

  use strict;
  use warnings;
  use Net::Stomp;

  # Broker connection details (placeholders); 61613 is ActiveMQ's default Stomp port
  my $host     = 'localhost';
  my $port     = 61613;
  my $username = 'user';
  my $password = 'password';

  # Open a connection to the broker and authenticate
  my $stomp = Net::Stomp->new({
    hostname     => $host,
    port         => $port,
  });

  $stomp->connect({
    login        => $username,
    passcode     => $password,
  });

  # Sending a message to a topic
  $stomp->send({
    destination  => '/topic/MagicDoSomethingBucket',
    body         => 'Hello from Perl land.',
  });

  $stomp->disconnect();

A rather contrived example, but you get the idea. In my case, I opted to use a binary message body (mostly due to time constraints) and used Storable's nfreeze() function to serialize a data structure for transport. If you're interested in doing content based routing or filtering, XML is a better way to go. Either way, I found this to be fast (about 20k messages per second when talking to a local message broker and posting to a publish / subscribe topic with one consumer), flexible, simple, and reliable. In its final home, I will be using ActiveMQ's store and forward functionality with a series of brokers configured in a grid. It really is neat stuff.
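For completeness, here's roughly what the Storable approach looks like on both ends of the wire. This is a sketch under the same assumptions as above (hypothetical topic name, local broker, placeholder credentials); the consumer half shows Net::Stomp's subscribe / receive_frame / ack calls with the frame body thawed back into a Perl structure:

  use strict;
  use warnings;
  use Net::Stomp;
  use Storable qw(nfreeze thaw);

  # Same hypothetical broker and topic as the example above
  my $stomp = Net::Stomp->new({ hostname => 'localhost', port => 61613 });
  $stomp->connect({ login => 'user', passcode => 'password' });

  # Producer side: serialize a Perl data structure into the message body
  my %event = ( type => 'user_created', id => 42 );
  $stomp->send({
    destination => '/topic/MagicDoSomethingBucket',
    body        => nfreeze(\%event),
  });

  # Consumer side: subscribe, pull a frame off the wire, and thaw the body
  # back into a Perl data structure
  $stomp->subscribe({
    destination => '/topic/MagicDoSomethingBucket',
    ack         => 'client',
  });

  my $frame = $stomp->receive_frame;
  my $data  = thaw($frame->body);
  $stomp->ack({ frame => $frame });

  $stomp->disconnect();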

As always, I'm interested in any experience people have with ActiveMQ and other messaging solutions, especially as a method of communication between different languages. If you're not familiar with messaging solutions, take a few moments to read up.

You'll be glad you did.

Sunday, September 23, 2007

Caving in to the iPhone (why it matters)

I was in need of a new phone. I usually sport a rather modest Blackberry which I use mostly for email and data. Well, I traded up.

The newer Blackberry I was looking at was nearing the $300 mark with a two year renewal. I quickly realized that, for just a bit more, I too could own an iPhone (potentially against my better judgement). I say against my better judgement more out of embarrassment at buying such a hip product than out of any concern about its technical quality. The conventional thinking should always be to question the conventional thinking, especially for those in any kind of scientific field (which we, arguably, are). I think we - because hopefully it's not just me - sometimes get so caught up in the questioning that we don't realize that the popular answer is the correct one. Clearly the word correct is extremely subjective and will depend on the context, but for me, this is one of those cases. It really is just a nice phone.

So what's the point? I fear I've done this in technical cases in the past. I have been harsh toward Java because of this. In the last few years, I think that Java has grown and progressed in places where I used to fault it, like performance. I've also had a similar epiphany with regard to development processes (although admittedly not until I learned about agile methods a number of years ago). I now feel that it was a lot of the unnecessary ceremony and inflexibility that I actually didn't care for.

I think all I'm trying to say is: learn before you burn. Make sure you understand what you're panning before you completely write it off. Consider the intended audience and the context surrounding something. Be it functional or object oriented programming, Java, Ruby, Linux, Windows, or even the iPhone, the goal is to make an educated, intelligent decision, not to hold up some misplaced notion of purity or loyalty which is, in fact, just thinly veiled ignorance and bigotry.

On the other hand, sometimes you run smack into something that turns out to be exactly what you thought it was. When that happens, well, you know what to do...

Saturday, September 22, 2007

The development process - a look at OpenUP

I've encountered a number of companies that eschew any kind of formal process. The thought seems to be that a development process, any process, impedes flexibility and bogs down the process (lowercase p) of development. Personally, I've always been skeptical of processes for similar reasons. As a developer, you feel constrained by how your role is defined in any specific development methodology.

The longer I do what I do, the more I find that process isn't bad; it's the processes that we have (or had) that were bad. It's the type of process that needlessly constrains and restricts more so than having a process at all. Because of this belief, I was relieved when I first discovered agile methods. I was late to the game, to be sure, but I was equally impressed, none the less.

I read, on some random blog (sorry, I don't recall which one), about OpenUP - described as a lean Unified Process that applies iterative and incremental approaches within a structured lifecycle. In my reading of the wiki, I found that it's very easy to understand in about fifteen minutes, even for mere mortals (read: those of us who aren't PMPs or trained project managers), provided you have a rudimentary understanding of software development. It has some interesting advantages:

  • Low-ceremony
  • Tool agnostic
  • Project type independent
  • Attention to small time windows (talks about days and months, not years)

Of course, it's also chock full of the normal agile development goodies like acceptance of change, iterative thinking, and so forth. It's interesting to see people defining simple, easy to understand and implement, processes that, themselves, are open to refinement. It just makes sense.

Give it a look. It's neat stuff.

OpenUP Wiki

Friday, September 21, 2007

Collaborative development environments (a follow up)

After my entry the other day, Distributed Teams in Development, I did some extra poking around for papers written on the subject. Luckily, the software development world is chock full of smart people who are willing to make their thoughts and work available to us all out of the goodness of their hearts. Grady Booch's Papers section has a wealth of great information on tough topics. Better still, it's accessible to us mere mortals in that it's not self-referential in ideas or terminology. I'm looking at you, J2EE.

One paper of interest, titled Papers on collaborative development environments (CDEs) (doc, pdf), relates to what I discussed the other day and breaks it down much better than I ever could. This is worth a read if you work in an environment that is, as he describes it, distributed by time and distance. There are references to products, both open source and commercial, that are interesting as well. In some cases, he offers a unique view on how to use these products together.

Either way, it's a good bit of insight into a difficult problem.

Thursday, September 20, 2007

Alphabet soup

More so lately, I've been doing a fair amount of work in Java with Eclipse as a development environment. I've been working on both Linux (my main platform for development) and Mac OS X (which I keep around for digital audio / music applications and a few games). The experience is about the same on both platforms; it's nice to see some of the advantages of desktop Java applications shining through.

What gets me about the development world is the sometimes almost comical barrier to entry in terms of nomenclature. Specifically, the endless swarm of acronyms, marketing speak, and double talk that surrounds certain technologies is almost damaging. Learning Java, the programming language, is trivial compared to learning Java, the marketing object. I understand differentiating oneself within a marketplace; fear not, Sun - you're different.

One of the worst offenders of terminology warfare is J2EE. I've spent the last few weeks sorting through all of the Blueprints, Tech Articles, White Papers, FAQs, API Specifications, and Glossaries that Sun has to offer at the main Java site, and I'm not sure it had to be as soup-ish as it was. I'm positive a lot of the ire that Java has drawn from certain communities is due, in part, to the thickness of said barrier to entry. Some people don't have an enterprise to worry about and are alienated by this kind of opacity.

And, more to the point, it's really a shame. When you dig into the ideas behind the Java Message Service, for example, it all makes perfect sense and really is a pleasure to use (most of the time). I understand that with abstraction comes a certain degree of, well, abstraction, but this is kind of silly.

The JMS API enhances the Java EE platform by simplifying enterprise development, allowing loosely coupled, reliable, asynchronous interactions among Java EE components and legacy systems capable of messaging.

I know what it means, but it really raises more questions than it answers for those that don't have a clear understanding of what (exactly) a Java EE component is. Admittedly, the Java EE 5 Tutorial is one of the better guides I've seen for jumping into a framework, but finding it when you don't know it exists is a daunting task, in and of itself.

Eclipse, as a platform, also suffers from a bit of the same. Again, here's a case of a great tool buried under a strange sheen of marketing cruft. The Eclipse project page is daunting to someone who's looking to figure out just exactly what it can do. Clearly, it's extensible, but does it have to optimize solely for the beginner and the expert? As a real world example, I wanted to find an integrated UML editor for Eclipse. I won't tell you how that experiment ended, but it included a tour of EMF, GMF (which isn't at all like the GEF), GMT, M2T, Model Driven Development integration (affectionately dubbed MDDI), and MDT. In the end, I was rewarded with a tree view that allowed me to add children like Class or Interface Realization. I know what that means, but I swear I saw screenshots of a graphical UML editor, didn't I? Surely those G* projects do something other than draw trees.

I hope that, one day, a bit of transparency can be brought to these kinds of tools (or platforms; whatever). Until then, I suppose it's just the price of entry into a world that does, in fact, have a lot to offer to a wider audience.

Wednesday, September 19, 2007

Distributed teams in development

I suppose, at some point, an organization becomes large enough that teams of developers are geographically disparate. In my experience, this is never fun. Each team seems to develop a good rapport locally, for obvious reasons, but that doesn't necessarily translate to a strong connection to peers in other cities (or even countries). It's always difficult to spread common knowledge when the level of interaction between teams is low.

There's been some effort to bring people closer together, but these applications or tools are either too "low bandwidth" or don't allow for the kind of communication you really need for things like pair programming and code reviews. I'm referring to tools like MS Live Meeting, which is something I've been personally subjected to. Yes, subjected is the word I mean to use.

I've recently read a blog entry by Grady Booch where he briefly mentions using Skype and Second Life, which sounds interesting. This also appeals because it's available to us Linux folks (at our office, we have a projector connected to a box running Windows, a wireless mic with a Polycom - a conference bridge phone thingie - all of which we use solely for team presentations). All bias aside (I have a distinct dislike of Live Meeting), Live Meeting is too slow and is a good representation of low interactivity, low bandwidth distributed tools. Skype and other VoIP-ish tools are a bit more natural for realtime conversations, but lack integration with any kind of development tools. I'm not entirely sure what Second Life brings to the table; my exposure to it is rather limited.

It should be noted that many, if not all, open source projects are developed by distributed teams. I've been directly involved in a few popular projects such as Gentoo Linux and contributed patches and bug reports to others. In most cases, collaboration tools are critical to these kinds of projects; IRC networks, forums, mailing lists and numerous other bits of infrastructure exist to support this development model. The major thing I've found lacking is integration into the common toolsets of choice, be it vim, emacs, or a full blown workbench like Eclipse.

So why is it that a bunch of developers can't get on the same page about good, distributed, high bandwidth collaborative tools? Better yet, why can't our normal tools support collaboration as a standard part of development? Things like version control systems cover history management, tracking, auditing, and other functions, but they don't help with real time code review and pair programming. Distributed source code editors are interesting, but I'm not sure I've seen one that fits with the tools a person already uses, as opposed to simply making them use other tools. In an age of modular language workbenches - I'm looking at you, Eclipse - this should be something to pursue.

The role of the System Architect

My official title at my day job is System Architect. Simple, straightforward, theoretically well defined. The Wikipedia definition of a system architect makes it seem like the world is made of butterflies and honey and everyone is in love. It couldn't be less true.

Admittedly, at my place of business - a tech-savvy company with a development team of about forty very smart people - the idea of what an architect is and what he or she does is still forming. The position didn't exist up until recently. It's a little less like the Wikipedia definition cited above, and quite honestly, thank God. My position is definitely more technical than it is business oriented (i.e., I still write code and so on). I tend to focus on (and possibly obsess over) design, system structure, subsystem integration and communication, and other seemingly opaque and lofty topics. Luckily, I still have to eat my own dog food and practice what I preach, so it can't all be about purity and pie in the sky conceptual cruft. If there isn't any meat on the bone, I could easily get eaten alive by my peers. (My apologies for the list of inappropriately used cliches above.)

Sometimes, the responsibilities can get hazy, though. Given that I am in a consultative rather than authoritative position, the onus is on me to prove that the solution offered is the best there is. There's no guru hat here; it's put up or shut up.

Not often, but almost with certainty, there are situations where time constraints around a project force a sense of impending doom and severely limit the time people are willing to put into discussion of a system. It happens. It can be even worse than that, though. The time it takes to have these discussions, not to mention the emotional energy when everyone in the room actually cares, is immense. People can get their egos bruised, or worse. I'm certainly not above such things, at times.

So what is a System Architect, really? Specifically, how do we prove the value of things like design patterns, integration patterns, and the like to other developers?