There is no such thing as a safe change

Many young or new developers have a mindset that is sometimes hard to deal with. Imagine you are developing a big product and are about to release it. Then QA finds a problem that must be fixed. The fix is small, and it looks like it affects only a really small area that can be easily tested. There is a temptation to fix it, have R&D test only that small area, and release. I will show that there is no such thing as a safe change.

But before I get to an example from my own experience, I have to remind you that any application has a lot of bugs. Even a “hello world” application can have bugs, especially if it uses a run-time library. Most of these bugs sit in edge cases that almost never happen. There are also pairs of bugs that mask each other, and there are bugs that never manifest themselves through pure luck. Any change can upset these last two categories. I can already imagine someone saying: “But I only recompiled the small module ABC, which has no external connections, so surely it does not affect anything else.” I will show you that even that can lead to bugs in a completely unrelated area of the code.
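To make the idea of two bugs masking each other concrete, here is a minimal sketch. All names in it (EncodeCount, DecodeCount, DecodeCountFixed, RoundTrip) are hypothetical and are not taken from the application discussed in this post. The writer and the reader are each off by one, and the two errors cancel, so everything appears to work until someone "safely" fixes only one side.

```cpp
#include <cassert>

// Bug A (hypothetical): the writer stores the count one too high.
int EncodeCount(int count) { return count + 1; }

// Bug B (hypothetical): the reader subtracts one, which happens to undo bug A.
int DecodeCount(int stored) { return stored - 1; }

// A "safe" fix that corrects only the reader and leaves the writer alone.
int DecodeCountFixed(int stored) { return stored; }

// Sends a count through the writer and then through the chosen reader.
int RoundTrip(int count, int (*decode)(int)) {
    return decode(EncodeCount(count));
}
```

With both bugs in place, RoundTrip(5, DecodeCount) gives back 5. After the one-sided fix, RoundTrip(5, DecodeCountFixed) gives 6, so a correct change in one function produces a wrong result that looks like a new bug somewhere else.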

I was asked to investigate why an application was crashing after some unrelated change. After a few minutes I found a bug, but that bug was not related to the change in any way. Moreover, when I checked the history in version control, I could not find any changes related to that bug in the last 2 years. Yet after I reverted the change, everything started to work correctly, even though the bug was still present. And here is the first lesson: as a developer, you should not be tempted to just fix the bug. You should investigate why it happens and why it did not happen before. Very often you will find more problems, or you will end up fixing it differently.

And I did follow my own advice. Eventually I found the following code:

void Cat::Func()

With the fix, the application crashes in DoSomething, but without the fix DoSomething was never executed. Really strange.

After more investigation, I found that when Func is executing, this is not a Cat object but a Dog object. And indeed, I found some really old code in our application that does an unsafe cast to the Cat class. As you can imagine, it should always have crashed. But the code had worked fine from the moment someone started using it about 2 years ago.
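For contrast, here is a small sketch of how a checked cast reports this kind of mismatch instead of silently accepting it. This is not the code from our application: the Animal base class and the AsCat helper are assumptions made for illustration, and the real Cat and Dog classes may not share a polymorphic base at all, in which case dynamic_cast would not be available.

```cpp
#include <cassert>

// Hypothetical hierarchy; the real declarations of Cat and Dog are not shown
// in this post.
struct Animal {
    virtual ~Animal() = default;  // a virtual function makes dynamic_cast usable
};
struct Dog : Animal {};
struct Cat : Animal {};

// Checked downcast: returns nullptr when the object is not really a Cat,
// where an unchecked cast would hand back a pointer to the wrong type.
Cat* AsCat(Animal* animal) {
    return dynamic_cast<Cat*>(animal);
}
```

Passing a Dog to AsCat yields nullptr, so the caller can detect the wrong type at the point of the cast rather than two years later in unrelated code.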

Then I found that reading the initialized field when the object is actually a Dog reads the lowest byte of some pointer. It so happened that for the last two years the lowest byte of that pointer was always zero. The small change we made somehow affected memory allocations, so now that byte is never zero and the application always crashes.
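The effect can be modelled without touching any invalid memory. The sketch below is only an illustration under stated assumptions: LowestByte, FlagLooksSet and FakeAddress are hypothetical helpers, and the addresses are made-up numbers, not values from the real crash. It shows that a flag aliased onto the lowest byte of a pointer reads as false or true depending purely on where the allocator happened to place the block.

```cpp
#include <cassert>
#include <cstdint>

// Models the value a one-byte flag would have if it overlapped the lowest
// byte of a pointer: the numeric address masked down to 8 bits.
unsigned char LowestByte(const void* p) {
    return static_cast<unsigned char>(reinterpret_cast<std::uintptr_t>(p) & 0xFF);
}

// True if a flag aliased onto that byte would read as "set".
bool FlagLooksSet(const void* p) {
    return LowestByte(p) != 0;
}

// Builds a pointer from a raw number for illustration only.
// The result is never dereferenced.
const void* FakeAddress(std::uintptr_t address) {
    return reinterpret_cast<const void*>(address);
}
```

With a made-up address such as 0x10000 the flag reads as not set, so a guarded call would be skipped. Shift the same block by 16 bytes to 0x10010 and the flag reads as set, so the call runs on an object of the wrong type. Nothing in the buggy code changed; only the allocation pattern did.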

Imagine that we had released our application to customers without testing it. It would have been a disaster. And thus the second rule: after any change, the application should be retested.

