One common reaction to my post on writing debuggable code was: you don’t need logging, just use a debugger. While there are cases where a debugger is the best option, there are many reasons why having proper logging in place is superior to using a debugger for troubleshooting.
Where do you start? The systems we develop at Symsoft deliver text messages (SMS) to mobile subscribers. To deliver one SMS, there is quite a bit of processing, including external signaling and possibly a database lookup. If an SMS is not delivered as expected, where do you start troubleshooting? Not all failures cause an exception to be thrown, so there may not be an obvious starting point. Even with moderately complex logic, it is quite difficult to know where to set the first breakpoint. Furthermore, you will most likely not get it right on the first try. With a log, you instantly get an overview of the whole process.
A sequence, not a snapshot. To figure out why something is not working as expected, you often need more than just the current state. Why is this variable null? Why are we in state IDLE when receiving this response message? Why did this timer time out? To be able to answer these questions, you need to see what happened before the error happened. With a debugger, you see the current state. With a log, you also see the events leading up to the current state. That makes a big difference.
Saved, not fleeting. The log statements you put into the code remain there. You keep getting the benefits of them every time you troubleshoot that part of the code again. With a debugger, it is always a one-off effort. The next time you debug the same part of the code, you again have to figure out where to put a breakpoint, which variables to look at and so on. With the log statements, your efforts are saved and remain for the next time.
Who can do it? To use a debugger, you need a certain level of expertise. You also need to know the code well enough to know where to look for the problem. This limits the number of people who can troubleshoot a problem. On the other hand, a lot more people can enable logging when there is a problem, collect the results and send them to the developer. Some customers or technicians may even be able to figure out what the problem is just by reading the logs, without involving a developer at all.
In a live system? Are you comfortable with attaching a debugger to a busy system carrying live traffic? There is always the possibility of making a mistake, or of stopping threads in a way that causes time-outs in other parts of the system. By comparison, enabling and disabling logging is a lot safer.
What About Performance?
Another argument against logging was that it is too slow and produces too much data. Clearly, it is not possible to log everything, so here are the strategies that work for us:
Use session-based logging. With session-based logging, you log everything that happens in the specified session, and nothing else. For example, you can specify a list of phone numbers, and when (and only when) one of those phone numbers sends an SMS, a full log of the complete session is output. So even though the system may be processing several thousand SMSs per second, only a handful of logs are output.
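As a rough illustration of the idea, here is a minimal sketch in Python using the standard logging module. It is not how our system is implemented; the names SessionFilter, handle_sms and msisdn, the logger name, and the phone numbers are all made up for the example.

```python
import logging


class SessionFilter(logging.Filter):
    """Let a record through only if its sender is on the traced list."""

    def __init__(self, traced_numbers):
        super().__init__()
        self.traced_numbers = set(traced_numbers)

    def filter(self, record):
        # "msisdn" is a hypothetical attribute attached by handle_sms below.
        return getattr(record, "msisdn", None) in self.traced_numbers


class ListHandler(logging.Handler):
    """Collects log lines in memory; a real system would write to a file."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


session_log = logging.getLogger("sms.session")
session_log.setLevel(logging.DEBUG)
session_log.propagate = False
session_log.addFilter(SessionFilter({"+46700000001"}))
handler = ListHandler()
session_log.addHandler(handler)


def handle_sms(sender, text):
    # The adapter stamps every record in this session with the sender.
    log = logging.LoggerAdapter(session_log, {"msisdn": sender})
    log.debug("received SMS from %s", sender)
    log.debug("routing lookup done")
    log.debug("delivered")


handle_sms("+46700000001", "hi")      # traced: three lines are kept
handle_sms("+46700000002", "hello")   # not traced: nothing is kept
```

In this sketch the filter runs after the log record has been created, so an untraced session still pays a small cost per statement. A system handling thousands of messages per second would more likely decide once per session whether it is traced and skip the logging calls entirely if it is not.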
On and off at run-time. It must be possible to turn logging on and off while the system is running. If there is a problem, you enable logging to try to capture the problem. If there is no problem, the logging can be turned off.
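A minimal sketch of such a switch, again in Python with the standard logging module. The set_tracing function is a hypothetical hook; how it gets called (an admin command, a management interface, a configuration reload) is left open and is not described in this post.

```python
import logging

captured = []


class ListHandler(logging.Handler):
    """Collects log lines in memory so the effect of the switch is visible."""

    def emit(self, record):
        captured.append(record.getMessage())


log = logging.getLogger("sms.traffic")
log.propagate = False
log.addHandler(ListHandler())
log.setLevel(logging.WARNING)  # detailed logging is off by default


def set_tracing(enabled):
    """Hypothetical run-time switch; no restart of the process is needed."""
    log.setLevel(logging.DEBUG if enabled else logging.WARNING)


log.debug("before enabling")   # dropped
set_tracing(True)
log.debug("while enabled")     # kept
set_tracing(False)
log.debug("after disabling")   # dropped
```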
Buffer before outputting. There should be close to no performance penalty with logging. If logging is disabled, the code should just skip the logging statements. If logging is enabled, the log strings should be handed off to a low priority queue that can collect and output the log without a performance penalty for the traffic handling.
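One way to sketch this hand-off in Python is with QueueHandler and QueueListener from the standard library. This is an illustration under assumptions, not our implementation: Python threads have no priorities, so a plain background thread stands in for the low priority queue, and the deliver function and logger name are invented for the example.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Stands in for the slow output (file, network) in this sketch."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


log_queue = queue.Queue()
sink = ListHandler()

# The listener drains the queue on its own background thread.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

log = logging.getLogger("sms.buffered")
log.setLevel(logging.DEBUG)
log.propagate = False
# The traffic-handling thread only puts the record on the queue.
log.addHandler(logging.handlers.QueueHandler(log_queue))


def deliver(sender):
    # When logging is disabled this check is the only cost incurred.
    if log.isEnabledFor(logging.DEBUG):
        log.debug("delivering SMS from %s", sender)


deliver("+46700000001")
listener.stop()  # processes what is still queued, then joins the thread
```

The queue here is unbounded. A production system would probably bound it and decide what to drop when the output cannot keep up, so that logging can never stall the traffic handling.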
“We Don’t Need Logging”
Some people argue that as long as you write well-structured, clean code, and test the code properly, then it should work, and there is no need to spend time adding logging statements. In my experience, this is just wishful thinking. However careful you are when developing the software, some bugs will always slip through. The sooner you admit this, the better.
Also, a problem may not even be caused by a bug. There are many cases where the software is working as it should, but the result is not what the customer expects. Perhaps the system is not configured correctly. Or maybe an external system is not responding as it should. Or maybe there is a misunderstanding about how a certain feature works. In all those cases it is important to be able to see what is happening in order to get it to work the way the customer wants it to, even though there is no bug in the software.
There are many cases where your only option is using a debugger in order to figure out what the problem is. Even great log statements can’t cover all cases. But by using logging wisely, you can substantially cut down the number of times the debugger is needed, while at the same time getting all the benefits of logging.