Great Programmers Write Debuggable Code

All programs need some form of logging built in to them, so we can observe what it is doing. This is especially important when things go wrong. One of the differences between a great programmer and a bad programmer is that a great programmer adds logging and tools that make it easy to debug the program when things fail.

When the program works as expected, there is often no difference in the quality of the logging. However, as soon as the program fails, or you get the wrong result, you can almost immediately tell the good programmers from the bad.

Example 1: “Let’s make a debug version”

For example, a tester came to me with a call case that didn’t work. We looked at the logs, and saw that the problem seemed to be in a neighboring module. A call to the other module to get a list of values simply returned null. When we enabled logging in the neighboring module and re-ran the test case, there was no more information available. No clues to why null was returned – did we supply the wrong arguments, did some external system fail, was there an error in the neighboring module, or what?

When asking the developer responsible for the code, the answer we got was: “Oh, then we have to make a debug version to see what happens”. Fail! It must be possible to get some sense of where the problem is just from the logs. If this had been in a production system, adding a debug version would have been a lot of work. The code needs to include enough information in the log statements so you at least have some idea of why it fails.

Example 2: Show Me How We Got Here

One of our products at work finds the cheapest route for the delivery of SMS:s (text messages) to mobile phones. Depending on the current location of the phone, and the operator the destination subscriber belongs to, there are many possible routes to choose from, each with a given cost and other characteristics.  In addition, there can be exceptions that forbid some routes, and promote other routes. There are typically many thousands of routes defined. The system finds the cheapest route in each case, given the constraints, and delivers the SMS.

Now, suppose a certain SMS gets delivered using route A, but we believe it should have used route B. Why was route A chosen? If there is no logging information (other than “route A was chosen”), we are left with hundreds of possible routes, their costs, the exceptions, and a complex algorithm. Good luck figuring out why A was chosen.

In our implementation, the log includes all potential routes, in cost order. As routes are eliminated due to different restrictions, the eliminated routes are listed in the log, together with the reason for elimination. With all the information on the input to the algorithm, as well as the steps taken, listed in the log, it is fairly easy to see why a certain route was picked.

Why Not?

So, why don’t all programmers write debuggable code? I can think of three reasons:

  1. You have to be humble enough to realize that there will be cases when your code doesn’t work as expected. I believe many programmers have a hard time with this.
  2. If you test your own code thoroughly, you will make sure it works (or fails gracefully) in a lot of different scenarios. For each scenario, it is natural to add logging. If you don’t test those cases, you are less likely to add logging there.
  3. Many programmers don’t trouble-shoot their own code in production systems often enough. If you have a problem in a live system, and the logs tell you nothing about why there is a problem, you have a strong incentive to add logging that will actually help you the next time you are in a similar situation.

Is Your Code Debuggable?

There are of course cases where even good logging messages don’t give you the full story on why something fails. You may still have to make that debug version. But your regular logging should at the very least give you some hints on what the problem might be.

So, how well have you prepared? When your code fails, will the logs tell you what’s going on?

52 responses to “Great Programmers Write Debuggable Code

  1. I definitely have to go into the bad programmer pile, I don’t often log stuff in my code. But then again the code I write isn’t usually production code or is very small in scale. Would you suggest proactively logging everything or concentrate on functions input/output and possibly trouble code?
    Thanks for the advice, I will try and use this advice when I write my own code in the future. I think like testing, it is one of those things no-one will disagree about its’ importance but when it comes time to do it many people don’t bother.
    Would you also encourage meaningful messages for users when an error appears?

    • Hi Brendon,

      Thanks for your comments. Proper logging is of course more important in big production systems than in small personal projects. However, even for small personal projects I add logging and error handling very early on, because I find it really helpful while I develop the program.

      Logging too much can also be a problem. There should be enough so you have a good idea of what’s going on, without every other line being a log statement. Sometimes logging the inputs will be enough to reconstruct everything that happened, but it varies from case to case. The best indicator I think is if you are able to debug problems in your code with the logging you included.

      I should also add that for big systems (like our SMS processing system at work), it is important that the logging can be turned on and off while the program is running, and that there is a mechanism for selecting what you want logged. When processing thousands of SMS per second, you want to be able to log only information related to e.g. one particular subscriber, not everything for all subscribers.

      Yes, I think it is good to give descriptive error messages in all cases, both to external users and internally. It’s also important to include dynamic information. For example, “Value too large” is not as good as “Value 101 too large – permitted range is 0 – 100″.

  2. A bad programmer think: “this should be ok”.
    A good programmer think: “this can be wrong”.
    Logging is important and help you to see a problem (many programmers dont know about a problem) and it help you to find a problem quick.
    The hardest thing is to find a balance of logging and not logging, too many data could irritate more than it helps.

  3. The problem with writing log information while being in production is, that when you have websites or applications that are being used by thousands of people each minute, you get a lot of overhead in terms of storage. The company isn’t always prepared to shell out that much money just to keep logs – logs are something that programmers need, not the managers. Also, when you have to work on a predefined DB scheme (most enterprise applications) and all the changes have to be agreed with a DBA, I would miss a lot of deadlines just because I would need to have additional tables on a production DB. So yeah in theory it should be like this, but it’s not always possible. The better approach in this situations is to write readable code and good test cases.

    • Yes, logging/tracing everything all the time is not feasible. You need to be able to turn it on/off selectively. When there is a problem, enable logging/tracing for the relevant cases to get more information on why the particular case is failing.

      • I completely agree with you, but from my personal experience, this would happen on edge cases and those wouldn’t be debuggable without a massive quantity of logs.

      • This still does not help when investigating a past case which cannot be reproduced (when logging was turned off). Any best practice for such cases?

      • @PTrobec: presumably each edge case would only need one logging statement though. But sometimes it can be too much to add, so maybe logging on a higher level, or returning distinct error codes for all cases could work.

        @Rob: Yeah, those are the tricky cases. Trying to reproduce the problem is one approach, but sometimes you have to add a debug version to get to the bottom of it. In our production system we have an “Trace on error” option – all cases are logged (in memory only), and if an exception occurs, that trace is saved. It does use up some extra memory and garbage collection resources, but is quite handy sometimes.

    • True. Good loggers i.e. Log4Net allow you to set different logging levels. So for ‘regular’ logging you can enable only error messages (which hopefully you don’t have too many :)). When there is a problem, enable a more verbose level i.e. debugging.

  4. Logging often makes code harder to read. Fortunately, if AOP isnt used, some software architectures by nature allow logging code at central sensitive points, e.g. message dispatchers or IOC containers etc, so normal code isnt polluted and logging is easy to implement and maintain, and you wont forget to log an event as you dont have to add logging code everywhere. Since exception handling should be done centrally too (or at least per module using a ‘security facade’ or sth similar) logging is easy here too. But, of course, it depends.

    • Agree. Central points like message dispatch, state changes etc are good places for logging, and give you a lot of information for relatively few logging statements. Good point André.

  5. Good tracing and logging is essential. Working on a Virtual Machine of a large ERP system. Making tracing and logging clear has helped me save a lot of precious development time and it gives others to find solutions for the problems without ‘bothering’ me first. Needless to say it is not enabled by default and only switched on when an actual problem is encountered. Many times I’ve giving tips to use which tracing when and in at least 50% of the problems they don’t call back: they figured out themselves.
    the rest of the 50% is either not looking at the trace and let me do it or they found a ‘real’ problem.
    Getting a debuggable version to a customer is a ‘last resort’ which usually is a lot of effort.
    The ideal situation with a good trace will lead to a reproduction internally, from that point on creating a small program to reproduce the problem which can be added later in a regression test.

  6. Usually logging frameworks allow switching logging on and off for modules and even single classes. You can deflate logging as you move to the production environment.

    Also log statements document your code if you do it right, e.g.
    Log.Debug(“Finding contract data of customer {0}”, customer.Id);
    reads as a comment in your code. Plus it can save your life if you have a screenshot of the app when the error occurred showing the customer.

    Logging of exceptions is essential. (Almost) Never choke exceptions without acting on them. If the customer/user tells you the time of failure a stacktrace in the log can save you days.

    Debuggable code is more than just code peppered with some log statements. E.g. for a stacktrace to be descriptive avoid using method chaining. If you get a NullPointerException (might have a different name in your language) from such a Java code line, you cannot find the method returning null easily:
    String street = myContract.getMainContact().getAddress().getStreet();

    Keep in mind: Setting a conditional breakpoint on a local variable is easy, tracking the return value of method getAddress() in the example above is not that easy.

    Also using anonymous types is sometimes not helpful for debugging. If you have no line numbers in your stacktrace, your are left with class and method names. So don’t write to much code in, e.g., a lambda expression or in an anonymous interface implementation which will be named $3 by the Java compiler.

    And yes, this is all necessary. Test driven development is not preventing having bugs in deliverables. It is preventing introducing new bugs by changing code. That’s all.

    Regards,
    dirk

    • Thanks Dirk, lots of good points.

    • “Also log statements document your code” – You don’t document code the same way you write log statement: doc is on the *whys* (you did this and that in your code), refers to theory etc, logs are more about *what* happened. Comments on the “whats” are usually a sign of bad decomposition or bad function and variable naming.

  7. Brian Coryell

    My pet peeve: log entries that say “File not found” but don’t include the filespec.

  8. Mike Crowder

    A fourth reason developers don’t write debuggable code: the project lead won’t allow it because, supposedly, the paying customer won’t allow it. Can’t even leave TRACE statements in the code which don’t compile in a release version. Nope. Nada. No such tools/artifacts are permitted in the released code and no amount of persuasion will change anyone’s mind.

    • … “the project lead won’t allow it because, supposedly, the paying customer won’t allow it”

      What kind of customers are these? I’d hate to have them. I mean next thing you know they wouldn’t want me including other apparently superfluous development.. like stability, maintainability, and durability. And don’t even consider doing unit tests (that are inherently not bundled in the released product).

      • Mike Crowder

        The project has been on-going for more than 17 years. We are forced to continue to use coding/style rules that date from the 1970′s and 1980′s for “consistency” reasons (close to 1 million SLOC), even though some of those rules are today considered poor programming practice. But we do perform unit tests!

        I don’t buy the “consistency” argument for not evolving the coding/style rules toward modern techniques. If “consistency” were so important, we would all still be using rotary dial phones and riding horses to work.

  9. Nicholas Merriam

    Great article. There is even another reason to create proper logging and debug information. I had one project where, at the 11th hour, the customer said, I am sorry but nobody can understand the results, can you record some of the intermediate results and make them visible somehow. There would have been no chance at all to add this feature but, happily, it was already there in the logging code. I just had to spend a few hours to record this part of the debug info in a slightly different format for the released product code.

  10. There’s perhaps another consideration: if you don’t control the run-time environment, detailed logs can provide competitors insight into the architecture and algorithms of your solution.

  11. There is more to debugging than just when to log, but also just good programming practices can help.

    For example, in environments that dump a stack-trace when they fail. You can make your methods quite granular (i.e. approximately no more than 10 lines of code) so that from the stack-trace you can tell exactly where the error occurred (to the nearest 10 lines of code). Couple that with the type of exception thrown and I believe 90% of the time you can work out where and what the problem is.

    • Sometimes the result is wrong, but there is no exception thrown. In example 2, if A is chosen, but B should have been chosen, there will be no stack trace to help.

  12. It’s definitely a challenge to be a good programmer in this regard. Often times even following your three points, it’s difficult to identify what points in the code need logging. Just as bad as not logging can be too much logging where the root cause gets buried in a pile of otherwise irrelevant information.

    My rule of thumb is to log anything that comes from outside the module, especially any IO operations.

  13. Logging can be botched in so many ways! Incomplete logging, noisy logging, slow logging, inaccurate logging…
    Guess what? I find logging useful as a last resort, but I much prefer good support from a real debugger, algorithms that are deterministic (so if you have the same input, you can reproduce the outputs), and well, old fashioned crash dumps (you lose the time perspective, but you get to see everything as it was).
    So yes, debuggable code is code that can go to QA and then production, and mature over time. But debuggability isn’t limited to logging, far from it!

  14. “Let’s make a debug version”

    Depending on the language, doing this may change what the program does. Back in the day, I would sometimes come across programs (in C) that if I added a “temporary” print statement to help trace the state along the way and narrow down the cause, the error would go away. Usually this meant there was some memory being trashed and the code shift from just that one extra statement would allow it to work (memory was likely still trashed.. just nothing critical anymore). And this doesn’t even count the optimizations that are often turned off during a “debug” build, but not in the release one. I’ve also seen “gotcha” side effects in perl of just printing [seemingly] read-only access of variables (due to some internal variable updates it does).

    “Separate but equal is not truly equal” — to reuse a [paraphrased] saying in a slightly different context.

  15. Yes, I’ve also had my fair share of those kinds of problems. Another common cause is changed timing between threads when debug print-outs are added.

  16. Ricardo Santos

    The article title is misleading, as the article itself only talks about logs. Logs are only one tool for debugging. But certainly not the only one. Debuggers, remote monitors, snapshots, monitoring agents, etc are also useful. You can also have logs and be less able to debug a system (example, when you over do it, or when the logging itself becomes a bottle neck).

    I also object at your definition of what constitutes a great programmer.

    Is like saying that a carpenter that does not use a hammer on every project is a bad carpenter. While hammers are useful, sometimes you need a screwdriver or a saw, or a sander. Is not the tools that make a great carpenter, but knowing which are the tools, when and how to use them make a good carpenter. Persistence in doing it with mastery is what defines a great carpenter.

    Similarly using logs does not define a great developer. Just like a hammer, logs are one of the tools at disposal. Knowing which are the tools and when and how to use them is what defines a good programmer (persistence in doing it with mastery is what defines a great programmer) .

    In the case of logging, you can certainly over do it. I once had a system that was making 30 GB a day in logs. Adding a way to selective turn on/off the logging parts was more important to the stability of the system than the log itself. As well as limiting the log, rotating it, etc.

    • Hi Ricardo,

      Thanks for your comment. Attaching a debugger on a busy system (delivering thousands of SMS per second for example) is often not feasible (things start timing out etc). Plus – where would you start? You need some indication of where the problem might be in the first place. Also, you can’t ask operations people to attach a debugger, but you can ask them to enable logging.

      But as long as you have some way of troubleshooting the system when it doesn’t work as expected (doesn’t necessarily mean that exceptions are thrown), then that’s fine. However, in my experience, logging/tracing is the easiest and best way of doing that.

  17. Christian Schlichtherle

    Debugging and logging are only the last line of defense these days. Serious engineering spends its energy on writing automated tests (unit test, integration tests, whatever).

    The problem with debugging is that it’s plain unproductive: Every minute you spend in a debugger trying to analyze an issue is wasted time because you are NOT productive, i.e. you are not adding ANY value to the code base. Compare this with writing tests: Every minute you spent writing a test is adding value to your code base. It’s an investment which provides a return EVERY time you run the test code.

    The problems with logging are endless, but it sums up to be too much of the wrong information and that you are spending too much time writing the logging statements at first. Better spend this time on writing tests. Also, logging statements are littering the code and make it hard to read and maintain. Overall, I find logging statements to be an insult to my eyes these days.

  18. I disagree that logging is the correct way to make debuggable code. While at times it is very helpful it can often be a symptom of a poorly designed error handling system. If your code is properly checking for errors at all points, and delivers proper exceptions/error codes I don’t think logging is necessary.

    The type of logging I do for debugging is usually very specific to the problem I’m debugging and not suitable for staying in the code. Once I have the problem located and an appropriate test coded, I can then remove the logging. This allows me not to worry about the long-term usefulness of the log statements and merely debug the problem at hand.

    • I think logging and error handling are separate issues. Catching, handling (and yes, logging) errors is necessary, but not enough. If your system produces the wrong result, but doesn’t throw an error, how do you find and correct the problem in your code? You need some way of knowing what the program did, which paths it took, in order to start looking for the problem. With good logging in place, it’s often (but by no means always) enough to look at the logs to figure out what went wrong.

      I also frequently add logging statements when looking for specific problems, but many of them I remove once the problem is solved. You can’t log everything, so you only want to keep the well-placed, generic logging statements that will also be useful in the future.

  19. Good article indeed.

    I find that back end programs are in particular need of detailed logs.

    Webapps are easier to debug because the request stack is usually smaller, after all, it will usually complete in under 2 seconds.

    Most of my work is triggered by cron while I slumber, so if there is an issue I would like to read the log, rather than read the code to determine what happened in the small hours of the day.

    Even more important when I need to debug somebody else’s program. If I can read the logs to determine what went wrong without reading the code, then I would say it’s written by a great programmer!

  20. Pingback: Weekly Design News – Resources, Tutorials and Freebies (N.184) | Wordpress Webdesigner

  21. Pingback: Great programmers write debuggable code | CROWDCARE

  22. You may use great patterns and coolest frameworks but if your code doesn’t log enough your code SUCKS! I’ve seen to many banking apps that made business decisions worth millions without a single trace in logs… And BTW: logs greatly increase code readability – good logs actually explain what code really does!

  23. A student said to the master, “I want to be a good programmer.”

    The master replied, “Would you rather be a good programmer who writes bad programs, or a bad programmer who writes good programs?”

    The student left enlightened.

  24. I like so much your opinion thanks for share.

  25. Pingback: Weekly Design News (N.184)

  26. I can think of two reasons a programmer (who may or may not be myself) might not do this, in spite of knowing how good it is:

    1. Clean code. Scattering logging all over the place doesn’t make for tidy “print this and stick it in the Tate Modern as art” code.
    2. More seriously, putting logging in everywhere can lead to a tangled mess. It’s great if the system is modern, has been built to use DI (or any alternative), but retrofitting old code, as I had to do recently, to log in each object is hideous. I ended up with static calls to a singleton, because there was no way I could find to get the logger into every location without a large rewrite. And that still bugs me.

  27. Pingback: Issue 48 – Get knocked down 5, get up 6 | TLN

  28. Pingback: Recent Readings Week 22 | Larry's Notebook

  29. Pingback: 优秀程序员编写可调试的代码 | 我爱互联网

  30. Pingback: 赵伊凡BLOG:完美的开发工程师需要会写可调试的代码 | 赵伊凡's Blog

  31. Pingback: 优秀程序员写可调试的代码 | 我爱互联网

  32. Pingback: Good programmers write debug code - A8 Observer

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s