Three engineering measures that matter

Brad Hipps


Umberto Eco, the famed medievalist and writer, once described the novel as "a machine for generating interpretations." The more interpretations, the better the machine.

In a similar vein, we might describe engineering as a machine that turns ideas into software. To understand how well our machine runs, we want to know three things:

  • How many ideas do we deliver? (Throughput)
  • How fast do we deliver them? (Speed)
  • How well do we do it? (Efficiency)

    If we get a lot done, well and at good speed, we know, and can show, that the machine is doing its job.

    To start, we need a common unit of measure. In sales, this would be dollars. In marketing, it's leads. In engineering, there are two candidates: tasks and code. We'll use tasks, for two reasons:

  • Quantity—getting more done—is an unalloyed good where tasks are concerned. The more tasks completed, the better. The same isn't always true of code. "More code" may or may not be a good thing.
  • Tasks, unlike code, are a unit of measure that non-engineers can understand. They're the common currency of the business.

    With tasks as our unit of measure, let's look at how we might understand throughput, speed, and efficiency for any body of work.


    Throughput

    Obviously, the starting variable for throughput is knowing how many tasks we completed in a given period. But that variable, by itself, is fairly meaningless. Is ten good? A hundred? A thousand?

    What we're really interested in is the number of tasks we completed, relative to the number of new tasks raised. Ten tasks completed is actually a good number, if the number of new tasks raised over the same period was, say, seven. This tells us that delivery is outpacing demand: the machine is ahead of the number of new ideas being requested.
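As a minimal sketch of that comparison, the ratio of completed to newly raised tasks can be computed directly. The function name `throughput_ratio` and the handling of a period with no new tasks are assumptions made for illustration, not part of any particular tool.

```python
def throughput_ratio(completed: int, raised: int) -> float:
    """Tasks completed per new task raised over the same period.

    A value above 1.0 means delivery is outpacing demand.
    """
    if raised == 0:
        # Assumption: with no new demand, any completed work counts as
        # unbounded headroom; nothing done and nothing raised is neutral.
        return float("inf") if completed > 0 else 1.0
    return completed / raised


# The example from the text: ten tasks completed, seven new tasks raised.
ratio = throughput_ratio(completed=10, raised=7)
print(f"{ratio:.2f}")  # 1.43
```

A ratio below 1.0 over several consecutive periods would suggest demand is outrunning delivery.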

    But can we really use tasks as the unit to measure throughput? Aren't there so many variations in task size and complexity that you're essentially mixing apples and oranges and bicycles?

    Paradoxically, the answer to both questions is Yes.

    How? The law of large numbers (LLN). Over the course of enough tasks, the inevitable differences in complexity among them average out, so the typical task in one period looks much like the typical task in the next.
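A small simulation can make the averaging effect concrete. This is only a sketch: the task-size mix below (sizes in days and their weights) is invented for illustration, and the helper names are hypothetical. It compares how much the average task size varies from period to period when each period contains few tasks versus many.

```python
import random
import statistics


def average_task_size(n_tasks: int, rng: random.Random) -> float:
    """Mean size of n_tasks tasks drawn from a hypothetical, skewed size mix."""
    # Invented mix: mostly small tasks, a few large ones (sizes in days).
    sizes = rng.choices([1, 3, 8, 20], weights=[50, 30, 15, 5], k=n_tasks)
    return sum(sizes) / n_tasks


def spread_of_averages(n_tasks: int, n_periods: int = 200, seed: int = 0) -> float:
    """Standard deviation of the per-period average task size."""
    rng = random.Random(seed)
    averages = [average_task_size(n_tasks, rng) for _ in range(n_periods)]
    return statistics.pstdev(averages)


# With few tasks per period the average swings widely; with many it settles.
print("5 tasks per period:  ", round(spread_of_averages(5), 2))
print("500 tasks per period:", round(spread_of_averages(500), 2))
```

Under these assumptions, the period-to-period spread shrinks markedly as the number of tasks grows, which is why task counts become comparable across periods once volumes are large enough.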


    Speed

    In software engineering, we've developed a strong allergy to saying how long something is likely to take. Instead, we signal effort through coded language like Fibonacci numbers or story points. The whole point of these secret handshakes is to remove the time element. Why? As Agile Manifesto contributor Ron Jeffries once put it, apologetically, “to obscure the time aspect, so that management wouldn’t be tempted to misuse the estimates.”

    Resist the urge to use these opaque abstractions. The business doesn't understand them. (They're not exactly crystal clear among engineers either.)

    Instead, look at the actual average time, measured in days elapsed, required to complete a task. In plain terms, this is the average time to deliver, or cycle time, which is something everyone can understand.
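For illustration, average cycle time can be sketched as the mean of elapsed days between each task's start and completion. The function name and the sample dates are hypothetical; a real tracker would supply its own timestamps and may define "start" differently.

```python
from datetime import date


def average_cycle_time_days(tasks: list[tuple[date, date]]) -> float:
    """Average elapsed days from start to completion across completed tasks."""
    if not tasks:
        raise ValueError("need at least one completed task")
    elapsed = [(done - started).days for started, done in tasks]
    return sum(elapsed) / len(elapsed)


# Hypothetical (started, completed) dates for three tasks.
completed = [
    (date(2024, 3, 1), date(2024, 3, 4)),  # 3 days
    (date(2024, 3, 2), date(2024, 3, 9)),  # 7 days
    (date(2024, 3, 5), date(2024, 3, 7)),  # 2 days
]
print(average_cycle_time_days(completed))  # 4.0
```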

    Easier said than done? Typically, yes. Worked out by hand, it can be close to impossible. But Socratic automates this.


    Efficiency

    An efficient engineering team is one whose tasks move from start to finish with a minimum of interruption.

    But "interruption" is vague, so let's be specific about our most common exception states:

  • Rework: the backward movement of tasks, e.g. from a test phase back into a development phase.
  • Deprioritized: tasks that we got started on, and then had to backlog in favor of new, higher-priority work.
  • Blocked: tasks that cannot move forward.

    For efficiency, we want to know how much of our total active work time is spent productively (that is, in a normal flowing state) versus time spent in any of the above exception states.

    Assume a project has absorbed 100 total work days so far. If the time spent in nonproductive states is only 20 days, we probably feel pretty good: our efficiency, by this measure, is 80 percent. But if that number were 50 days, it would mean that half our total time so far has been eaten up by blocked or idled tasks, reworking tasks, or burning time on things that fell out of priority. Something is off.
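The arithmetic above can be sketched as follows. The function name and the split of the 20 nonproductive days across the three exception states are assumptions for illustration; the text only gives the total.

```python
def efficiency(total_days: float, rework: float = 0.0,
               deprioritized: float = 0.0, blocked: float = 0.0) -> float:
    """Share of total work time spent in a normal flowing state (0.0 to 1.0)."""
    exception_days = rework + deprioritized + blocked
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    if exception_days > total_days:
        raise ValueError("exception time cannot exceed total time")
    return (total_days - exception_days) / total_days


# 100 total work days, 20 of them in exception states (hypothetical split).
print(efficiency(100, rework=10, deprioritized=5, blocked=5))    # 0.8
# The same project with 50 days lost (again, a hypothetical split).
print(efficiency(100, rework=25, deprioritized=15, blocked=10))  # 0.5
```

Keeping the three exception states as separate inputs makes it easier to see which one is driving a drop.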

    (Again, Socratic derives these allocations automatically.)

    Adding it together

    Each of these measures is useful on its own. But they're best when used as a cooperative group. What really interests us is the check-and-balance among them.

    For example, if we're working efficiently and at good speed, but our throughput isn't keeping pace with demand, the implication is clear. We need more people—or less demand. What's nice is that the data make the case for us.

    On the other hand, if we see our efficiency is falling off, to the point of impacting speed or throughput, "more people" isn't the answer. Instead we're going to dig in on the choke points: which exception states are on the rise, and why?

    In these cases, the collective measures become essential for understanding how we work. Is too much demand overwhelming the machine? Is there some recurring inefficiency in the way we operate? The data help to surface the what, why, and where.
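One way to picture the check-and-balance is as a simple rule over the three measures. This is a rough sketch only: the function name, the thresholds (`max_cycle_time_days`, `min_efficiency`), and the returned messages are all hypothetical, and real teams would set their own baselines.

```python
def diagnose(throughput_ratio: float, cycle_time_days: float, efficiency: float,
             *, max_cycle_time_days: float = 5.0, min_efficiency: float = 0.75) -> str:
    """Suggest where to look, given throughput, speed, and efficiency together."""
    efficient = efficiency >= min_efficiency
    fast = cycle_time_days <= max_cycle_time_days
    keeping_pace = throughput_ratio >= 1.0  # completed at least as many as raised

    if not efficient:
        # More people will not help; find which exception states are rising.
        return "inspect exception states"
    if fast and not keeping_pace:
        # Working well and quickly, yet demand still outruns delivery.
        return "add people or reduce demand"
    if fast and keeping_pace:
        return "healthy"
    return "no rule applies; look closer"


print(diagnose(throughput_ratio=0.8, cycle_time_days=3.0, efficiency=0.85))
print(diagnose(throughput_ratio=0.8, cycle_time_days=9.0, efficiency=0.50))
```

The first call describes a team that is efficient and fast but behind on demand; the second describes one whose efficiency has dropped, where the exception states deserve attention first.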