Save the code or save the tests?

A while back during a training session at work, our instructor pitched us a hypothetical question:

Your data centre is on fire, and all your application code is on one drive, and all your tests are on the other drive, and you can only rescue one. What do you do?

  1. Save the application code.
  2. Save the tests.

More recently, out of curiosity, I pitched the same question as a poll on Twitter. Roughly 80% of respondents said that they would save the application code and 20% said they would save the tests.

People gave differing reasons for their decisions. Predictably, many respondents said to let it all burn, and some people attempted to justify that choice by asserting that they had backups. But "let it all burn" wasn't one of the provided options and these responses didn't show up in the results, so we can safely ignore those opinions for now.

Most of the remaining respondents seemed to make decisions based on their own assumptions about how detailed and specific those tests actually were. Some people said that it was impossible to make the decision without knowing more about the tests. Some people said to save the application code because the other drive, containing their tests, would be blank, or at least contain nothing of value. Some people said they would save the application code because although the tests might exist, and even be quite robust, they would not specify enough of the application's behaviours to make recreating the application possible. Some added that they would try to rebuild the tests from the saved application codebase. This, of course, also involves significant assumptions about the testability of the application.

But some people, of course, asserted that their tests would be exhaustive enough that rebuilding the application from the test code would be easier.

In short, a big old "it depends".
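
To make the "it depends" concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `slugify` behaviour, the two surviving tests, and both reconstructions. It assumes the application code was lost and only a small test suite was rescued, and it shows two different rebuilds that both pass every surviving test while disagreeing on an input the tests never covered.

```python
# Hypothetical example: a tiny "surviving" test suite and two candidate
# reconstructions of the lost application code.

def run_surviving_tests(slugify):
    """The only tests that survived the fire (hypothetical)."""
    assert slugify("Hello World") == "hello-world"
    assert slugify("save the tests") == "save-the-tests"
    return True


def rebuild_a(title):
    # One plausible reconstruction: lowercase, each space becomes a hyphen.
    return title.lower().replace(" ", "-")


def rebuild_b(title):
    # Another plausible reconstruction: lowercase, split on any whitespace,
    # join with hyphens (so runs of spaces collapse into one hyphen).
    return "-".join(title.lower().split())


# Both reconstructions satisfy every surviving test...
both_pass = run_surviving_tests(rebuild_a) and run_surviving_tests(rebuild_b)

# ...but they disagree on an input the tests never mentioned.
untested = "Save  the  code"  # note the double spaces
disagree = rebuild_a(untested) != rebuild_b(untested)
```

Which rebuild matches the original cannot be decided from the tests alone. A more exhaustive suite narrows that gap, which is exactly the assumption the "save the tests" respondents were making.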

So, this all made for an entertaining discussion and it generated some interesting results, both during the original training session and on Twitter.

The instructor's point in asking this question was to make an assertion about the virtue of test-driven development and having good code coverage. If you have sufficiently detailed unit tests, we were told, then the structure of the application is so clearly specified by those unit tests that reconstructing the original application from the tests is more or less trivial. It is, in fact, the path of least resistance. So, you should rescue the unit tests. And, you should follow proper TDD practices and always have good code coverage. If you do this, it doesn't matter if the application is blown away, you can start over very easily.

In the training session, we all found this to be very convincing and insightful, and we nodded in agreement, understanding the lesson.

And then I thought about it for a minute.


And I asked myself a different question:

When would this hypothetical, or anything like it, ever happen?

Clearly, the actual scenario as presented is ludicrous. Why would the tests and the application code be kept on two separate discrete drives, side by side in the same data centre? Why would you be in the data centre at that exact time, and be confronted with those specific drives and a choice of only rescuing one? If this scenario actually occurred, the original respondents who deflected the question had the right answer all along. It's a fire. You just get out. Your skin is worth more than any piece of software, and there are assuredly backups, or at least local copies on other developers' machines.

So, this data-centre-on-fire scenario is clearly intended as a proxy for some other, real scenario. But what is that real scenario?

Why is it useful or desirable to be able to completely reconstruct an application from its unit tests? When, and how, would you ever end up in a situation where the application had been destroyed and the unit tests had survived?

I haven't fully figured that out.

There are plenty of real cases where tests get "lost" because of neglect. Or where they never get written at all. I guess this is because of a similar choice to the fire scenario, where you have finite software development resources, and you don't have time to do everything you want to do, so something is going to be neglected.

But if that's the scenario, are we saying that given the choice between (1) maintaining the application and (2) maintaining the tests, we should always maintain the tests? Because, if the tests are rigorous enough, we can just pull a lever at any time and use those tests to produce a working application?

This can't be it. It's true that development time is usually unevenly split between application code and tests (and a dozen other things). And it's true that usually the balance could benefit from being shifted in the direction of the tests. But you can't just have tests and no application. You can't just write the tests for a new feature and declare the feature to be completed and delivered. Can you?

(There are narrow and rare scenarios where you might be developing a standardised test suite for some protocol or specification, where the test suite itself is the product, but that's a whole other ball game from conventional software development. I would call it off-topic.)

So, as entertaining as these discussions have been, what are we actually learning here? Anything?

We may be accidentally learning the opposite of the intended lesson. Slavish adherence to TDD, and the maintenance of exhaustive unit tests, such that if the application code goes missing it can be effortlessly regenerated, is pointless, because that never happens.


Truthfully, I think robust unit tests have great value, but this Allegory Of The Conflagrating Drives doesn't illuminate that value at all. I don't think unit tests need to be this exhaustive. I think TDD has value, but it's just a tool, not a way of life, and more often than not it should be left in the toolbox.

That said, I do think that once you get to the point where you have to make a decision between looking after the application code and looking after the tests, something has already gone wrong. They should be inseparable, kept together on a single allegorical drive. We should refuse to acknowledge a distinction between the two. We should refuse to accept, or maintain, one without the other.

So, personally, I end up responding to the dilemma with something helpful like this:

Your data centre is on fire, and all your application code is on one drive, and all your tests are on the other drive, and you can only rescue one. What do you do?

  1. Save the application code.
  2. Save the tests.
  3. [sagely] I would simply never allow this situation to occur.

There! I am one of the people whose opinions can be safely ignored, exactly as it should be.

Discussion (15)

2020-11-29 18:48:10 by Joe Bryant:

I have never, ever seen a set of unit tests that I could have used to trivially reconstruct a real application. Is that a real thing that people do? I would expect the test development effort required to be astronomical.

2020-11-29 18:55:27 by Stephen Connolly:

I, 2 months ago, completed a rewrite of an entire service, because we had reasons. I left the tests from the previous service disabled until near the end and then turned them on to verify it would behave the same... found a few bugs but otherwise surprisingly painless. Not quite saving the tests... I had an openapi.yaml with descriptions of what each endpoint should do... but other than that

2020-11-29 19:32:42 by pul:

You have illicitly opened a shell into your competitor's CI server, and can exfiltrate the zip of either their application code or their test suite. Once your session reads one of the files from the disk your shell will be detected and you won't be able to touch the other file. Which one do you steal?

2020-11-29 19:39:15 by skztr:

Speaking only regarding the hypothetical scenario, and nothing about the merits of TDD, or "testing as documentation" (which I would call a separate consideration), I would instead phrase it as:

"You receive a notification that the datacenter containing your backups has been destroyed. As a precaution, you immediately plug in your cold storage to run a routine verification. Just as that process determines your drive to be corrupt, you get a notification that your primary datacenter is also on fire. You're already connected to it, as you almost always are, as the sensitive nature of the project means that all developers work on the project remotely. As the person responsible for the cold storage backups, you of course are an exception to these rules, and so you immediately begin copying files to your local system. Do you start by copying:

  - The toplevel directory, hoping you get enough of the important bits before the connection fails
  - The documentation / specifications
  - The tests
  - The application"

2020-11-29 23:38:27 by hobbs:

Doesn't matter how good the tests are, by the time you've "reconstructed the code" from tests you're out of business. Running code with no tests makes money. Tests with no running code don't.

2020-11-30 00:51:19 by Spwack:

For some reason this put me in the mind of neural-network generated code. That could result in a situation where you have a bunch of tests, pull a lever, get another selection of potentially-viable code, and use the existing tests (which would have to be *incredibly* thorough) and then measure that against some other kind of fitness. Maybe in the future programmers will just be translators between human requests and machine-readable tests, and the bots then go off and write the code that passes the tests. (Of course you might just end up with [override function == returns true] which would then cause all the tests to pass)

2020-11-30 02:48:31 by Vebyast:

This is a proxy for "big bang" rewrites. Example: The backend for your electronic medical records application is too slow because your DB schema is junk. You've learned a lot about your data over the last fifteen years and the existing system just can't tolerate any more tweaks. But you're the system of record for thirty million medical records and people will die if the migration isn't perfect. How do you make sure the migration to the new DB schema goes right? You freeze the schema and the storage layer (let the application code burn), write as many tests as you possibly can (take the test code), and do not release your new database code until the tests all pass (rebuild the application using the tests). Obviously big-bang rewrites aren't a good idea (c.f., but sometimes you can't get around them.

2020-11-30 05:41:16 by Tetragramm:

Porting to a new language is one. Your tests would need to be simple enough to translate, but then it's just a matter of verifying the outputs didn't change.

2020-11-30 09:36:17 by Shieldfoss:

If you can reconstruct your software from your tests, that means your software architecture is encoded in those tests. Since getting the architecture right is the hardest part of typical software development (absent a few edge cases), this means you don't actually have any tests because the code you claim is your tests is actually a development platform (specifically, the platform where you are developing your architecture) and you should probably be writing some tests for your architecture development platform. (This is my primary beef with TDD actually - you can't let tests drive development because the most important parts of development can't be tested like that)

2020-11-30 14:02:03 by goodmoth:

I'm speaking out of ignorance here, but if the value of (good) tests is that they allow you to reconstruct the original application, then aren't we defining the original application as the true thing of value? So it would make at least as much sense to just save the application in the first place? Further, if we are looking at this at all realistically, then even very good tests still aren't perfect, and probably don't give you the ability to reconstruct the original application perfectly. But if you save the original application, then, well, you have the original application.

2020-12-01 04:45:05 by Whybird:

Just to add in to the mix, I read recently about an AI tool that can generate tests from code, for actual full coverage. The tests are even human readable. Of course, they don’t prove that your existing code is correct, but it does make explicit ALL behaviours that you did change with any given commit, because the previous seats now “fail”. I see this as a way to get full coverage for everyday development and refactoring, but also occasionally a Big Bang rewrite of some part of a system is inevitable. Here’s another thing I think about: for hugely critical applications, it might be desirable to have your functionality not even reliant on a specific company’s whole *cloud*. Imagine if AWS went down, and you could just switch over to IBM or Azure, where you know for a fact that you have exactly equivalent functionality. Alternatively, you can just switch to whichever is effectively cheaper at any given time, especially if your vendor bumps up the price unexpectedly... no vendor lock-in. A full test suite that you run against both/all implementations ensures functional equivalence.

2020-12-01 05:35:19 by Whybird:

Seats -> tests* (Sorry; autocorrect)

2020-12-10 08:31:07 by ingvar:

Whybird raises the alluring thought of "cloud agnosticism". Which is a thing you can do, but you are at that point leveraging your cloud platform minimally, effectively using it only as "a place to run VMs" and most probably "without leveraging the platform's permissioning structure and identity management". And that is totally a thing you can do, but it would, most probably, end up being money-inefficient most of the time, for occasional small savings. And you'd still need a sufficiently-portable way of standing your infrastructure up. So, you'd need to write your own infrastructure specification language and compilers to multiple different backend formats (or, just multiple different terraform files, as that's probably the widest span of cross-vendor thing at this point).

2020-12-14 01:18:11 by different sam:

Building a test suite against an existing codebase and then setting fire to the original codebase is a sufficient description of clean-room reverse engineering.

2021-01-08 03:53:08 by D:

Super easy. If the tests are on another drive, then the tests are not being maintained or run, and are completely worthless. The source code for the app on the other hand would probably just be a little rotten, lacking up to date tests.
