Thu Jan 17 18:50:00 2013

Your design sucks

... but that's ok. All designs suck. All designs will always suck.

If you don't believe me yet, allow me to explain ...

No design survives contact with implementation

When you start implementing your design, you are inevitably going to come across things in reality that don't match your original expectations.

Data that you expect to always be present in the response of an external service may be missing - or invalid. Constraints you expect to be unique may turn out not to be in practice (even SHA-1 collides eventually). Processes that are supposed to be reliable will turn out to fail far more often than you expected.

That's ok.

In some cases, you can simply time out, abort, or otherwise scream loudly and fall over. In others, you'll need to either relax the constraints of the system or add a filtering layer that cleans things up and passes a valid derivative of the input data onwards to the system.

Missing data can either be made optional or defaulted.

Invalid data can either be treated as missing, or you can record the original in unconstrained form and have an additional validated version that's present only if the original is sane.
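As a rough sketch of those two options, assume a hypothetical customer lookup whose response may lack a name or carry a junk phone number. The field names, the default value and the 7-to-15-digit sanity rule are all invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerRecord:
    name: str                  # defaulted when the service omits it
    raw_phone: Optional[str]   # the original value, kept unconstrained
    phone: Optional[str]       # validated digits, present only if the original is sane


def parse_customer(response: dict) -> CustomerRecord:
    raw = response.get("phone")
    raw_phone = None if raw is None else str(raw)
    digits = "".join(ch for ch in (raw_phone or "") if ch.isdigit())
    # Assumed sanity rule for this sketch: 7 to 15 digits.
    phone = digits if 7 <= len(digits) <= 15 else None
    return CustomerRecord(
        name=response.get("name") or "(unknown)",
        raw_phone=raw_phone,
        phone=phone,
    )
```

The rest of the system only ever reads phone, while raw_phone keeps the messy truth around in case the sanity rule itself turns out to suck.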

Unique constraints can be relaxed, or the data merged, or one preferred version selected.

Unreliable processes can be retried, or their results made optional (and perhaps retried on demand next time you need them), or you can explicitly model the unreliability.
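A minimal sketch of the first two options combined: retry a few times, then hand back nothing so the caller can treat the result as optional. The helper name and its parameters are made up for this example, and catching every exception is a simplification you'd want to narrow in real code:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def try_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> Optional[T]:
    """Run operation up to `attempts` times; return None if every attempt fails."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Pause before the next attempt, but not after the last one.
            if attempt + 1 < attempts:
                time.sleep(delay_seconds)
    # The result is optional: the caller decides what a missing value means,
    # and can simply call again next time it needs the data.
    return None
```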

The decision whether to treat a violation of your assumptions as an error, to mediate between reality and the system, or to adjust the system's conceptual model to accommodate reality must be made on a case by case basis. The decision is always about trade-offs - is capturing the true messy state of the outside world important enough to be worth the additional complication introduced in the process of doing so?

No design survives contact with actual users

The conceptual model underlying your design might be elegant and correct, but if it fails to match the model in your users' heads, you're going to end up with frustrated users - or no users at all.

People will want to search based on things you didn't really expect ("yes, I know I can look it up by customer number but it's much easier to read the number they're calling from off my phone and search by that").

People will develop cargo cult habits - expect any 'tear everything down and rebuild' function to be used far more often than it was designed for as users come to regard it as the 'turn it off and on again' button and react accordingly.

People will use your system for things it was never designed to handle internally but that superficially looked to them like it should - and then complain vociferously because they can't attach a 10MB PDF to an instant message.

When faced with performance questions such as unanticipated queries, you'll need to either limit the fields that can be searched (probably causing massive complaints), add additional indices or reorganise data, or implement an external secondary search system.

When faced with cargo cult, you'll need to either restrict access to the functions used improperly (complaints again), alter the user interface to guide the users to the functions they should have been using, or add intelligence behind the function that detects cases where a simpler process will be sufficient and runs that instead.
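The third option might look something like the sketch below. Everything here is hypothetical: is_healthy, refresh and full_rebuild stand in for whatever checks and operations your system really has.

```python
from typing import Callable


def make_smart_rebuild(
    is_healthy: Callable[[], bool],
    refresh: Callable[[], None],
    full_rebuild: Callable[[], None],
) -> Callable[[], str]:
    """Wrap the 'tear everything down' button so it only does so when needed."""

    def rebuild() -> str:
        if is_healthy():
            # Nothing is actually broken, so the cheap process is sufficient.
            refresh()
            return "refreshed"
        # Something really is wrong: do what the button was designed for.
        full_rebuild()
        return "rebuilt"

    return rebuild
```

The users keep their 'turn it off and on again' button, and the system quietly stops paying full price for it every time.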

When faced with unintended use cases, you'll need to either declare them out of scope (complaints ... and possibly workarounds that will produce a problem of one of the other two categories), implement them in the system even though they weren't originally planned, or integrate with some other service that provides the functionality.

Again, these are trade-offs, and we can now see that there are basically three possible actions to be taken.

The three responses

Ignore, which covers both 'error on violated assumptions' and 'no, that is not in scope' - i.e. refusal to admit the problem to be relevant to your model. This choice should be made sparingly but when the core principles of the system are at stake it should be seriously considered.

Adapt, which covers 'relax the constraint' and 'add the indices' - i.e. adjust your design until the case in question fits elegantly. This is the best default response if the case is close to fitting already - if the impedance mismatch between what's desired and what currently exists is small.

Externalise, which covers mediation between your system and both the outside physical world and the outside expectations of users - i.e. adding something that is outside the boundaries of the core system but inside the boundaries of the overall system as perceived by users. This response is best taken when the case is orthogonal to the core principles of your system rather than directly opposed.

No design survives contact with the future

Even given the above, only two things are certain - that at least part of your design sucks, and that in six months you're going to think that part is larger than you do now.

We learn as we develop a system, we learn as we observe users interact with the system, and we definitely learn when the system goes down at three in the morning and somebody pages us in a panic.

The closest to an answer to this that I've found is actually very simple - design on the assumption that you're going to screw up, and try and make sure it's going to involve as little pain as possible to fix the screw ups as you find them.

In fact, the basic principles of good program design can all be argued to be consequences of this attitude.

Defensive programming means that when the outside world exposes a mismatch between your design and reality, your system reports an error - which means you can fix the error and move on, rather than having to first clean out any bad data that's accrued from the error being ignored before you can deploy the fix (or, good assertions ensure that which is out of scope is visibly so).
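To make that concrete, here is a small sketch assuming a hypothetical payment ledger that stores whole pence. The error class and function names are invented for the example; the point is only that bad input is rejected before it can accrue:

```python
class OutOfScopeError(ValueError):
    """Raised when input violates an assumption the design relies on."""


def record_payment(ledger: list, amount_pence: int) -> None:
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount_pence, bool) or not isinstance(amount_pence, int):
        raise OutOfScopeError(
            f"amount must be a whole number of pence, got {amount_pence!r}"
        )
    if amount_pence <= 0:
        raise OutOfScopeError(f"amount must be positive, got {amount_pence!r}")
    # Only data that passed every check ever reaches the ledger.
    ledger.append(amount_pence)
```

When the assumption turns out to be wrong (refunds, say), you get a loud error pointing at the mismatch, and the ledger contains nothing you have to clean up before shipping the fix.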

Building the simplest thing that can possibly work means not engaging in speculative programming - which means that not only are you automatically saved from the fact that your ability to see the future totally sucks as well, but there's simply less of everything. Less code means fewer bugs, and fewer design elements means fewer conceptual errors (or, YAGNI is preparation for adaptation).

Separation of concerns means avoiding leaking implementation details to the outside world - because if the outside world isn't relying on the implementation details, then when you realise your implementation sucks you can overhaul or replace it without having to modify code outside of that functional area of the system (or, compartmentalisation is prior externalisation within the bounds of the aggregate system).
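A toy illustration, with class and function names that are purely hypothetical: two interchangeable customer directories, and a caller that neither knows nor cares which one it has been given.

```python
from typing import Optional


class DictDirectory:
    """First implementation: a dict keyed by customer number."""

    def __init__(self) -> None:
        self._by_number = {}  # implementation detail, never exposed

    def add(self, number: int, name: str) -> None:
        self._by_number[number] = name

    def find(self, number: int) -> Optional[str]:
        return self._by_number.get(number)


class ListDirectory:
    """Replacement implementation: a list of pairs, same public methods."""

    def __init__(self) -> None:
        self._entries = []  # implementation detail, never exposed

    def add(self, number: int, name: str) -> None:
        # Drop any existing entry for this number, then append the new one.
        self._entries = [(n, v) for n, v in self._entries if n != number]
        self._entries.append((number, name))

    def find(self, number: int) -> Optional[str]:
        for n, v in self._entries:
            if n == number:
                return v
        return None


def greeting(directory, number: int) -> str:
    # Caller code: relies only on find(), never on how entries are stored.
    name = directory.find(number)
    return f"Hello, {name}" if name is not None else "Hello, stranger"
```

Because greeting only touches find(), swapping one directory for the other requires no change outside the directory itself.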

To summarise

Your design sucks.

Your design is always going to suck.

If your design doesn't appear to suck now, that's because you're missing something.

If you're sure you aren't missing anything, you've forgotten that you can't see the future.

Design under the assumption that you are definitely, positively, absolutely going to have to redesign it later.

Even so, remember that the redesign is still going to suck - just hopefully less, or at least differently.

Most importantly though, don't worry too much about it. The sound of a palm meeting a face is not an admission of incompetence, it's the noise made by a problem going from impossible to trivial.

I love realising I've been an idiot, because it means that I'm about to make things better.

So it's ok that your design sucks.

Keep sucking. Keep sucking less. Keep sucking differently. Keep learning.

And remember - shipping something that sucks is still 100% better than not shipping anything at all.

Happy hacking.

-- mst, out