All systems suck

I’ve been thinking a lot about this idea lately.  I’ve spent a lot of years as an engineer and consultant fixing other people’s systems that suck, writing my own systems that suck, and working on legacy systems that, well, suck.

Don’t let anyone fool you.  All systems suck, to a greater or lesser extent.

If it’s an old system, there’s the part of the code that everybody is afraid to work on: the fragile code that is easier to replace than to maintain or refactor.  Sometimes the code just looks hard; sometimes nobody really understands it.  These parts of the code are almost always surrounded by an SEP field.  If you’re unfamiliar with the term, it means “Somebody Else’s Problem”.  Items with an SEP field are completely invisible to the average human.

New systems have the parts that haven’t been built yet, so you’ll hear things like “This will be so awesome once we build feature X”.  That sucks.

There’s also the prototype that made it into production, a common problem.  Something somebody knocked together over a weekend, whether it was because of lack of time, or because of their utter brilliance, is probably going to suck in ways you just haven’t worked out yet.

All systems, old and crufty or new and shiny, have bottlenecks, where a bottleneck is defined as the slow part, the part that will break first when the system is under excessive load.  This is also a part of your system that sucks.

If someone claims their system has no bugs, I have news for you: their system sucks.  And they are overly optimistic (or naive).  (Possibly they just suck as an engineer, too.)

In our heads as engineers we have the Platonic Form of a system: the system that doesn’t suck, that never breaks, that runs perfectly and quietly without anyone looking at it.  We work tirelessly to make our systems approach that system.

Even if you produce this Platonically perfect system, it will begin to suck as soon as you release it.  As data grows and changes, there will start to be parts of the system that don’t work right, or that don’t work fast enough.  Users will find ways to make your system suck in ways you hadn’t even anticipated.  When you need to add features to your perfect system, they will detract from its perfection, and make it suck more.

Here’s the punchline: sucking is like scaling.  You just have to keep on top of it, keep fixing and refactoring and improving and rewriting as you go.  Sometimes you can manage the suck in a linear fashion with bug fixes and refactoring, and sometimes you need a phase change where you re-do parts or all of the system to recover from suckiness.

This mess is what makes engineering and ops different from pure mathematics.  Embrace the suck.  It’s what gets me up in the mornings.

13 thoughts on “All systems suck”

  1. I totally agree.

    But the best counter-attack to this problem is modularity.

    If you can rewrite a part of your system, so that it sucks less, then part after part the whole system will suck less.

  2. Certainly refactoring components reduces the suck of those components. It is, however, something of a “painting the Sydney Harbor Bridge” problem.

    Also note, some systems suck more than others. You can reduce the suck of your system, but every system still has a best part and a worst part.

  3. Not only gets me up in the mornings, keeps me employed 🙂

    But seriously. I don’t know if you’ve read “Thinking in Systems” by Donella Meadows or not – if not, I highly recommend it. It encourages the reader to think about systems separately from the actual computer. I enjoyed looking at a system and examining the components: inputs/outputs/goals/feedback loops. I would hazard a guess that the reason a lot of our systems suck is a combination of conflicting goals where needing to just solve the problem in front of our face battles with the desire to create a longer term solution that would improve more pain points and be “future-proof” (impossible!). In RelEng we are often trying to balance those two. What happens? We re-write our systems every couple of years to try and handle the Brave New World we’re in that our systems aren’t prepared for. Rather than think I could actually write a system now that can handle that future I much prefer that we stay flexible enough to continually re-visit our systems and tear things out/enhance as needed. Best thing I learned in school was how to read other people’s code and it’s been invaluable for this exact scenario 🙂

  4. There are perfect systems, but they are ones that we can’t connect to yet. It is pretty nifty that matter never seems to be created or destroyed. If we could somehow design a computation system on that, we would be out of a job. There are quantum computers in the lab, so I suspect that in the next 50 years we will have systems that don’t rely on software to run. They will run using the logic in nature. I suspect, then, that systems will be more stable than the mission-critical systems used in nuclear facilities are today.

    So bottom line, not yet, but soon!

  5. I’ll beat a different horse for a change: the development environment. They always get sold to us as the easy way to an optimal (i.e. low-suckiness ranking) final product. I have yet to see a development system that makes it hard for me to develop things that sooner or later will suck.


  6. @Immobilier – nope, in my experience, modularity just ends up hiding most of the suck in places the average developer doesn’t need to know about. And that’s a good thing, up until the point where the magic stops working, and someone has to work out how it was supposed to work.

    @smo – a development environment can’t prevent sucky code, but a good one will at least help mitigate the problem.

  7. Embrace The Suck.

    All systems suck – we just need to realise that’s OK. Ship early, ship often – and each time you ship, make it suck a bit less. But ship – don’t delay until you think it doesn’t suck any more. Iterate.

    I think one big difference between small, agile companies (which Mozilla hopefully is, or should be) and big, slow-moving ones is that the small ones are happy to embrace the suck. And they normally end up with something that doesn’t suck quicker than the other way, and have helped a bunch of people along the line as well.

  8. Yes, I agree. I’m a fairly young developer with 6 years of commercial experience, and I have spent the last few years looking for the ideal job with a perfect environment (I relocated to the UK from Poland). But most of them SUCK. Even when they have methodologies implemented by great, smart people, they usually don’t have time, because projects need too much flexibility and they have to find a compromise between great coding standards, best development practices, and time estimates.

  9. Too bad that too few have heard about the IBM i. The least-suck OS. Wish sometimes they’d open-source it, but then maybe it would be as bad as the rest of them.

    A code base of zillions of lines of code, and projects banned if they don’t have a budget, mean that enhancements usually come in small increments (though not always small) and about 30-40% of developer time is spent on fixes.
