We had a test that was calculating a total value of a counter (so based on differences between counter values) for "yesterday" (as well as "today", "this week", etc). I committed code and within a few minutes get notified that my code caused a failure in this test -- unexpected as my change didn't even touch anything remotely related to this functionality.
As I start asking if anyone has any idea, someone else mentions they've seen that before too, many weeks ago, but then it "fixed itself". Today was the 1st, and when I looked at the previous failure for this test, it was also on the 1st, but a couple of months ago. Last month the tests were passing on the 1st. So I start to dig into it.
To put this in perspective, we had dozens of other tests that were checking the same calculation for various time spans across months (including explicitly for months with both 30 and 31 days, and February-to-March for both leap and non-leap years), and all sorts of combinations of different values, missing data, etc, that had mostly been there for well over a year. We had actually spent a fairly significant amount of time thinking about how to test this for all the different combinations of dates and data.
As it was the 1st, one of the other tests was actually running with the exact same start and end date/times as this "yesterday" test, but was passing. So I started looking at the mock data each was using.
Turns out it was in fact a legitimate bug, but only happened if it was currently the 1st, AND yesterday was not the 31st, AND there were no values at all for the current month (or anytime later).
The mock data for the "yesterday" test didn't have any values after whatever yesterday was. The explicit date test for 31st-to-1st happened to have a 0 value on the 1st, which meant this bug didn't happen.
Mostly out of curiosity, I looked further back in test history. There were in fact 3 or 4 separate times this test failed, all of which were on the 1st of either March, May, July, October or December. But not all of those dates -- because sometimes the 1st was on a weekend, or just no code was pushed, and no build was run.
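For anyone wondering why it was exactly those months: they are the ones whose 1st follows a month with fewer than 31 days. A quick sketch in Python (not our actual .NET code; the function names are made up for illustration) that lists those dates for a year, and which of them fall on a weekday when a build was likely to run:

```python
from datetime import date, timedelta


def vulnerable_firsts(year):
    """Return every 1st of the month in `year` whose 'yesterday' is not a 31st."""
    firsts = []
    for month in range(1, 13):
        first = date(year, month, 1)
        yesterday = first - timedelta(days=1)
        if yesterday.day != 31:
            firsts.append(first)
    return firsts


def likely_build_days(year):
    """Keep only the vulnerable dates that fall on Monday through Friday."""
    return [d for d in vulnerable_firsts(year) if d.weekday() < 5]


if __name__ == "__main__":
    for d in likely_build_days(2015):
        print(d.isoformat(), d.strftime("%A"))
```

This only covers the calendar side of the condition. The failure also needed a build to actually run that day, with mock data that had nothing in the new month.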
This was also in production, but probably was never seen by any customers (none had reported it) because the "yesterday" value was only ever displayed in the UI, and most of the time data is added hourly, so by the time a user logged in on the 1st (say, 8 am), it was almost certain there was some piece of data added (even if it was 0).
We added an explicit test with the data for this situation, and of course fixed the bug.
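Roughly, the shape of that test is to pin "today" to a 1st that follows a 30-day month and to supply no readings after yesterday. Here is a minimal sketch in Python rather than our real .NET code. `counter_total` and `yesterday_total` are hypothetical stand-ins for the calculation, and the readings and dates are invented:

```python
from datetime import date, datetime, time, timedelta


def counter_total(readings, start, end):
    """Sum the differences between consecutive counter readings
    whose later timestamp falls in the window [start, end)."""
    ordered = sorted(readings)
    total = 0
    for (_, prev_value), (ts, value) in zip(ordered, ordered[1:]):
        if start <= ts < end:
            total += value - prev_value
    return total


def yesterday_total(readings, today):
    """Total for the calendar day before `today`; `today` is passed in
    so a test can pin it instead of reading the system clock."""
    start = datetime.combine(today - timedelta(days=1), time.min)
    end = datetime.combine(today, time.min)
    return counter_total(readings, start, end)


def test_yesterday_on_the_1st_with_no_data_in_new_month():
    today = date(2015, 12, 1)  # pinned: a 1st whose yesterday is the 30th
    readings = [
        (datetime(2015, 11, 29, 23, 0), 100),
        (datetime(2015, 11, 30, 6, 0), 130),
        (datetime(2015, 11, 30, 18, 0), 175),
        # deliberately nothing on or after December 1st
    ]
    assert yesterday_total(readings, today) == 75


if __name__ == "__main__":
    test_yesterday_on_the_1st_with_no_data_in_new_month()
```

The important part is that the date is an input rather than whatever the clock says, so the test behaves the same on the 30th as it does on the 1st.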
However, this will live on as by far the most obscure time-based test failure I've ever had to deal with.
> I committed code and within a few minutes get notified that my code caused a failure in this test
Forgive me as I'm coming from a .NET background where everything is tightly integrated into Visual Studio, but are you not able to run your tests before committing? We are strongly encouraged to run the full test suite prior to committing any code for exactly this reason.
Yeah, and this is also .NET. I can't remember if I ran the full test suite or not (I have to admit, I don't always -- even though it's not best practice -- if it's a fairly isolated/minor change). But in this case, given the nature of the problem, if I had made the code change on the 30th, the tests would have passed at that time anyway. The nightly build (in the morning hours of the 1st) would have failed.
We run all our tests nightly (as well as on a per-commit basis) as a matter of policy. It is incredibly useful when something like this pops up - and with a somewhat obtuse code base like ours, it does a bit too often.
We have a "debug" build that compiles everything and runs unit tests, which runs on every commit on every branch (we use gitflow, so all actual work happens in feature branches).
There's also a nightly (or manually triggered) "release" build that runs on the master and any release/* branches (if there are changes), and additionally does some i18n stuff (which includes the convoluted step of compiling a VB.NET app and then decompiling it into C# so we can run gettext on it), builds installers and does some other packaging tasks.
The problem is this bug was dependent on when the tests ran. Commit on November 30th, and the debug build would be fine, but release build would fail the next night (Dec 1). Commit on October 30th, both will be fine as the release build runs on October 31st. November 1st is a Sunday so chances are no one was working on Saturday, which means no build, which means this failure isn't visible.