Documentation is hard, but what happens when you add the need to make docs version specific?

Sat Oct 19 02:48:00 2013

Caomh Veneficus Gysgodi o gath III

So I've been using a couple of Open Source projects for a while now and it's interesting to me to consider how the two projects are doing similar and different things in terms of their documentation.

I'd like to say that I love both of the projects I'm about to talk about, but I'm going to make some critical comments and discuss how these and other Open Source projects might improve their documentation.

"So what are these two projects?" I hear you ask...

OpenNMS

Our first project is OpenNMS, a Network Management System. Those people who know me will know that monitoring, reporting and statistics gathering for networking and systems equipment is something I've had a great passion for since working on the Cumbria and Lancashire Education Online (CLEO) network as it was being built back in the early to mid 2000s.

I could wax lyrical about why monitoring and sensible statistics gathering are important, but let's assume that you know that and move on.

I came across OpenNMS while working at Lancaster University, but it was implemented by the networking team after I'd moved on to the systems team so I never really had any experience or hands on time with it.

I continued to be a Nagios advocate and implemented that for the systems team. Say what you like about Nagios, it does the job. The problem begins when you start to scale your monitoring solution to more than a handful of devices. For CLEO, we got around this by writing our own internal tool called the Operations Desk Database (or ODD; I like silly acronyms, ask me about FART sometime), which was originally designed purely and simply to drive the Nagios monitoring configuration files. It became much more than that in the longer term, and while people were complaining about how gnarly the code was, it entertains me to know that it was still being used to run CLEO many years after I left the networking team :) People, it wasn't ever designed to be what it became.

Most organisations don't have the luxury of writing their own solution to the problem of scaling to enterprise, and I'm glad that finding the limitations in Nagios drove me down the path to writing ODD. A tool of the same name still exists, but it's probably much less shitty than the one we kept bolting things onto the side of back in 2005. ODD should probably get its own blog post at some point, though it may just bring back too many rocking backwards and forwards quietly in a corner moments...

After starting at Zen Internet, the first project that I was given to work on was OpenNMS. After a while in the monitoring desert, it was fun to sink my teeth into something new and challenging. Fun, challenging and at times downright annoying it was.

OpenNMS is an amazing piece of software, the community around it is awesome and the individuals involved are some of the nicest people I've chosen to get incredibly drunk with. So why was it downright annoying at times to work with?

The documentation sucks. OpenNMS suffers from one of the classic Open Source software project problems. Many of the people involved write code. People who spend all of their time writing code and making really cool software aren't always the best people to write documentation about that same software. Documentation is hard, and anyone who tells you differently has never attempted to write documentation for a massively diverse user base speaking many different languages and all having subtly different learning styles. I'll say again, documentation is hard and I have nothing but respect for the people who take the time to write it. Honourable mention here to Shadowcat Systems' own Jess Robinson (or castaway, as many will know her), who makes this stuff look easy.

"That's nice Ian. You also suck because you complained and didn't do anything"

Yup. I'm guilty. It's Open Source at the end of the day, so we did our bit by adding to and updating the MediaWiki installation for OpenNMS whilst we battled to get things working in exactly the way the business wanted.

The problem that we hit time and again was versioning. The documentation was out of date, no longer relevant or referred to best practices for an older unspecified version. How could we tell which was which? The short answer was that we couldn't. We relied on talking to people on IRC and raising support tickets under our support contract with the OpenNMS group. It's at this point I should mention that the project started to take on an air of doom and I started to feel like the Defence Against the Dark Arts teacher signed up for a one year tour at Hogwarts. Seriously. I was the third Engineering Lead on the project, both previous leads having left the company. Nothing to do with OpenNMS, but it was an amusing title that I took to using ;) Amusingly in relation to the title, I did end up leaving Zen to work for Shadowcat Systems shortly before my first year was up.

Anyways. I'm rambling. Clearly I shouldn't write blog posts when I can't sleep.

Versioning kept tripping us up again and again. As if writing documentation wasn't hard enough, we add another dimension of complexity when we consider versioning. You can't drop the old documentation, because some people might still be using the older version and want to refer to the docs as they were when they built their system against version X. Equally, you can't keep all the old information around in the same place as it's just too damn confusing. It's actually another hard problem and we all know how geeks love hard problems. Turns out that when it comes to documentation, far fewer geeks love these problems, which is a shame.

CiviCRM

Our second project is CiviCRM. The blurb from the website says:

"Open source constituent relationship management for non-profits, NGOs and advocacy organisations"

It's a funky piece of software that I've used to help a couple of different organisations:

Both organisations needed a way to keep track of their membership records, renewal dates and so on.

In the tradition of reusing existing tools where possible I had a look around and CiviCRM seemed like the best candidate for the job at hand and offered additional features that could help both organisations in the future (https://civicrm.org/go/features has a big list of the things it has to offer).

Implementation for the EPO was a clean slate whilst Sing for Pleasure had a large amount of legacy spreadsheets, Word documents, Google documents/spreadsheets and around 45 years of history to contend with, but that's a substantial topic for another blog post. Possibly. One day.

So why pick out CiviCRM for this discussion? Well, CiviCRM has a wiki like OpenNMS but it uses Confluence. Let's look at some URLs:

The top of that page has a green box and the following text:

"This documentation refers to CiviCRM 4.3 which is the latest stable release."

That's nice and helpful. We're looking at the docs for 4.3. What about another URL:

Ooo! We have a red box with the following text:

"This documentation refers to an older version of CiviCRM (3.4 / 4.0). The current stable version is 4.3. Please introduce all documentation changes and new material here."

So by changing the number in the URL, I get to view a point-in-time snapshot of the documentation that should be correct for the version I'm using. That means that out of date or historical recommendations can be removed from the latest version at CRMDOC and kept at the historical CRMDOC40 path, ensuring that the documentation I remember using when I installed things (including any bookmarks I might have made and notes in my local docs) will remain correct and usable. Isn't that stunningly simple and rather cool?!

"So we should all switch to Confluence, is that what you're saying?"

Well, if you're an Open Source project, that's certainly an option that you could investigate. There's a catch. Confluence is not Open Source software. Your project can make use of the facilities offered by Confluence, but you can't modify the platform and customise it for your own Open Source project. What you pull off the shelf is what you get. Plus, if you want to use it for a commercial project that you might be working on, then you're going to have to pay.

So how do we do better with Open Source tools?

There's not much difference between the two solutions above. Surely this is something that we can achieve with existing Open Source offerings? I'd like to think the answer to that is yes, but there's an evolutionary missing link: we need some tools to make it easy to do. MediaWiki is a fine solution to the problem, but we need to add some things to replicate the functionality that we get with Confluence:

  • Banners at the top of every page
  • Web server configuration to correctly map through the CRMDOC latest version shortcut
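As a rough illustration of those two pieces, here is a minimal Python sketch of the logic involved: resolving the "latest" shortcut to a versioned space and picking the banner to show. It is not tied to any real MediaWiki or web server API. CRMDOC and CRMDOC40 come from the CiviCRM URLs discussed above; the CRMDOC43 key, the SPACES registry and the function names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocSpace:
    key: str      # URL path segment for this snapshot of the docs
    version: str  # version label shown in the banner


# Hypothetical registry of documentation snapshots.
# "CRMDOC43" is an assumed key; only CRMDOC and CRMDOC40 appear in the post.
SPACES = {
    "CRMDOC40": DocSpace("CRMDOC40", "3.4 / 4.0"),
    "CRMDOC43": DocSpace("CRMDOC43", "4.3"),
}
LATEST_ALIAS = "CRMDOC"   # shortcut that always points at the current docs
LATEST_KEY = "CRMDOC43"   # updated each time a new version is cut


def resolve(space_key: str) -> DocSpace:
    """Map a requested space key to a snapshot, following the latest shortcut."""
    if space_key == LATEST_ALIAS:
        space_key = LATEST_KEY
    return SPACES[space_key]


def banner(space_key: str) -> tuple[str, str]:
    """Return (colour, text) for the banner at the top of every page."""
    space = resolve(space_key)
    current = SPACES[LATEST_KEY]
    if space.key == current.key:
        return ("green", f"This documentation refers to CiviCRM {space.version} "
                         "which is the latest stable release.")
    return ("red", f"This documentation refers to an older version of CiviCRM "
                   f"({space.version}). The current stable version is {current.version}.")
```

Keeping the alias and the "current" key in one place means that cutting a new version is a one line change, and every older snapshot picks up the red banner automatically.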

Then there's also the question of how we create a new version:

  • Should we copy the database as a full snapshot? Copy the tables within the same database? Is there another way?
  • Do we need to mark the old one read only or keep it so that older versions can be updated?
  • Add the banner to say the old one is out of date?
  • Add a banner to say the new one is the current version?
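To make the "copy the tables within the same database" option a little more concrete, here is a small Python sketch that only generates the SQL for such a snapshot, without running it. It assumes a MySQL database and that each wiki's tables share a prefix; the prefixes and the short table list used in the usage example are illustrative, not a complete MediaWiki schema.

```python
import re

# Only allow simple identifiers, since they are interpolated into SQL text.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def snapshot_statements(tables, old_prefix, new_prefix):
    """Return MySQL statements copying each prefixed table to a new prefix."""
    for name in (old_prefix, new_prefix, *tables):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    statements = []
    for table in tables:
        src = f"`{old_prefix}{table}`"
        dst = f"`{new_prefix}{table}`"
        # Copy the structure (columns and indexes), then the rows.
        statements.append(f"CREATE TABLE {dst} LIKE {src};")
        statements.append(f"INSERT INTO {dst} SELECT * FROM {src};")
    return statements
```

For example, `snapshot_statements(["page", "revision", "text"], "docs_", "docs42_")` yields six statements that could be reviewed and then fed to the mysql client. Whether the copy or the original becomes the read only one is still the open question from the list above.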

This all needs to be scripted in a way that's simple and easy for the Open Source project to implement.

Seems to me that it would be sensible to implement this using a MediaWiki farm so we end up with one version of MediaWiki installed running multiple wikis. This has the added advantage of being able to provide wikis for internal project work in addition to the user facing documentation, all whilst maintaining a single installation of MediaWiki.

Whilst working at Lancaster Uni my good friend Phoebe Tipper implemented a MediaWiki farm, so she was the first person I spoke to about how to make this happen, as I wasn't sure which of the myriad of methods is most suitable for our needs. Turns out she rolled her own method, thus adding to the confusion :)

She did however suggest the following extensions:

One other point that she raised which I hadn't considered was how MediaWiki performs its authentication. That's per database, so implementing a solution that allows authenticated users access across multiple databases will need some consideration. That might be a driver for copying the tables within the same database, as then views could be used for sharing the user account information.
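The views idea could look something like the following Python sketch, which again just generates MySQL statements. It is untested against a real MediaWiki install and rests on several assumptions: all wikis live in one database, each wiki's tables share a prefix, and the account data sits in a table called user under a hypothetical shared_ prefix.

```python
import re

# Only allow simple identifiers, since they are interpolated into SQL text.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def shared_user_views(shared_prefix, wiki_prefixes, tables=("user",)):
    """Return statements pointing each wiki's user tables at one shared copy."""
    for name in (shared_prefix, *wiki_prefixes, *tables):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    statements = []
    for prefix in wiki_prefixes:
        for table in tables:
            # Each versioned wiki sees the shared table through a view.
            statements.append(
                f"CREATE OR REPLACE VIEW `{prefix}{table}` "
                f"AS SELECT * FROM `{shared_prefix}{table}`;"
            )
    return statements
```

Note that MySQL will not create a view while a real table of the same name exists, so any copied user table would need renaming or dropping first, and whether MediaWiki is happy writing through such views is something that would need checking.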

This idea has been sitting in my brain since the OpenNMS User Conference Europe (OUCE) 2012 and, beyond the discussion with Phoebe during the OUCE 2013, I've still not managed to do anything about it, which is a very poor show. I'm hoping that this blog post will prompt some discussion and encourage me to pull my finger out and make it happen.

My great thanks to Jess Robinson, Tarus Balog, Jeff Gehlbach, Phoebe Tipper and all the other people who labour tirelessly in the pursuit of the perfect documentation. If you think this is easy, give it a try some time.

Go try OpenNMS and CiviCRM, both excellent projects with comprehensive documentation and both have excellent communities too. Don't forget to add documentation as you work and find things that are wrong ;)