I recently gave a presentation to first-year engineering students at KTH, who were taking an introductory programming course, on what it is like to work as a software developer. I wanted to give my view on the main differences between professional software development and programming for a university course.
First I talked about challenges with large-scale software development. Then I listed several development practices used to cope with these challenges. I went on to present ways to become a better programmer, and ended with some fun facts from work.
Characteristics of Production Software
Programs are BIG. The main characteristic of software used in live systems is the size. For example, our main repository at work contains 1.8 million lines of Java (I used the tool cloc to find the number of lines). The sheer size of the programs complicates software development, and a lot of practices have been developed to deal with it.
Software is never done. Software that is used keeps growing and evolving. The customers find more and more uses for it, and want more and more features. Software that is not used is discontinued, but successful software is developed continuously for many years. Several developers are almost always involved. For example, I checked the Subversion log of one of the main classes in our SMS application. There were 150 check-ins over 7 years, by 8 different developers (2 of whom don’t work at Symsoft anymore).
Complexity from aggregation. Most features in production software are quite simple, but because there are so many of them, you get subtle (or not so subtle) interactions between them causing bugs. The complexity of the system comes from the aggregation of many simple parts, not from any complex parts.
Reading code. A consequence of the characteristics mentioned above is that reading code is a very important skill. Before a program can be modified, you need to understand what it does, and how it does it. Only then can new functionality be added so it fits in with the existing structure, and without breaking anything. Reading and understanding a program can be a major effort, and one sign of a well-designed program is that it is relatively straightforward to modify.
How to Manage
Many techniques have been developed to make developing and maintaining large programs easier. Here are the most important ones.
Modularize. This is an obvious first step, and (I believe) universally used. The software is split into subsystems, layers or modules so that smaller chunks of functionality can be dealt with at a time.
Iterate. Developing software bit by bit is as helpful for small 30-line scripts as it is for systems with millions of lines of code. I like the following quote:
“A complex system that works is invariably found to have evolved from a simple system that worked.” – John Gall
Self-documenting code. The naming of the classes, methods and variables is incredibly important. The names, when chosen well, let you understand what the program does just by reading them. They also greatly reduce the need for comments in the code.
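As a small illustration of the naming point, here is a sketch in Java. The class, method and constant names are invented for this example and are not taken from any real code base, and the 160-character limit is only an illustrative value. Both methods compute the same thing, but only the second one can be understood without a comment.

```java
import java.util.List;

class NamingExample {
    // Hard to read: what are l, x and 160 supposed to mean?
    static int calc(List<String> l) {
        int x = 0;
        for (String s : l) {
            if (s.length() > 160) {
                x++;
            }
        }
        return x;
    }

    // Illustrative limit for this sketch (an assumption, not from the talk).
    static final int MAX_SINGLE_MESSAGE_LENGTH = 160;

    // Same logic as calc, but the names carry the meaning.
    static int countMessagesNeedingSplit(List<String> messages) {
        int messagesNeedingSplit = 0;
        for (String message : messages) {
            if (message.length() > MAX_SINGLE_MESSAGE_LENGTH) {
                messagesNeedingSplit++;
            }
        }
        return messagesNeedingSplit;
    }
}
```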
No duplication. Code duplication causes a lot of problems when you come back to modify the program (which happens all the time): you may forget to make the intended change in all the duplicates. So instead of copy-pasting, combine the logic into one method. It makes the code more compact, and easier to modify in the future. There is an excellent article by Martin Fowler, Avoiding Repetition, on this subject. Unfortunately, I have seen a lot of duplicated code in production software over the years.
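A minimal sketch of what combining the logic into one method can look like. The `Invoice` class, its method names and the 25% rate are all hypothetical, chosen only to show the shape of the change.

```java
class Invoice {
    // Illustrative rate for this sketch; not a value from the original talk.
    static final int VAT_PERCENT = 25;

    // Before: the same formula was copy-pasted into both price methods,
    // so a change to the rate had to be made (and remembered) in two places:
    //
    //   static long bookPriceCents(long net)     { return net + net * 25 / 100; }
    //   static long shippingPriceCents(long net) { return net + net * 25 / 100; }

    // After: the formula lives in exactly one method.
    // Amounts are in cents; integer division truncates any fraction of a cent.
    static long addVat(long netCents) {
        return netCents + netCents * VAT_PERCENT / 100;
    }

    static long bookPriceCents(long netCents) {
        return addVat(netCents);
    }

    static long shippingPriceCents(long netCents) {
        return addVat(netCents);
    }
}
```

If the rate or the rounding rule changes later, only `addVat` needs to be touched.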
Unit testing. Unit testing (JUnit etc.) is useful both in small-scale and large-scale development. It’s an easy way to ensure your smallest program parts work as expected, and you get repeatable tests that can be run again and again. Making sure your code can be unit-tested also automatically improves the structure of the code – it becomes less monolithic, more de-coupled.
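To give a feel for what a unit test looks like, here is a sketch that uses plain Java checks so it runs without any extra libraries. In a real project the checks would normally live in a separate JUnit test class. `MessageSplitter`, `split`, `check` and `runTests` are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MessageSplitter {
    // Unit under test: splits text into chunks of at most maxLength characters.
    static List<String> split(String text, int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += maxLength) {
            int end = Math.min(start + maxLength, text.length());
            parts.add(text.substring(start, end));
        }
        return parts;
    }

    // Tiny stand-in for a test framework's assertion method.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("FAILED: " + description);
        }
    }

    // Repeatable tests: each check covers one small, specific behaviour.
    static void runTests() {
        check(split("", 3).isEmpty(), "empty text gives no parts");
        check(split("abc", 3).equals(Arrays.asList("abc")), "exact fit gives one part");
        check(split("abcdefg", 3).equals(Arrays.asList("abc", "def", "g")),
                "last part holds the remainder");

        boolean rejected = false;
        try {
            split("abc", 0);
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        check(rejected, "non-positive maxLength is rejected");
    }

    public static void main(String[] args) {
        runTests();
        System.out.println("All tests passed");
    }
}
```

Note that `split` is easy to test because it takes its input as parameters and returns its result, rather than reading from or writing to some shared state. That is the kind of structure testability tends to push you towards.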
Version control. Using version control (like Git or Subversion) is a no-brainer, and as far as I can tell it is used in pretty much all professional software development. Version control systems are used both to keep track of different working versions of the software, and for knowing exactly what code is included in each release.
Version control is also useful on an individual level, for example for the programming assignments in a university course. Any program I spend more than 10 minutes writing, I stick in a local Git repository. That way, I can always go back to previous (working) versions of my program.
Write for people first, computer second. The code you write will be read many times in the future (by you, or another developer). The computer doesn’t care how the code is written, so make it as easy as possible to understand for the next person that has to read it. A corollary to this is: don’t be too clever. It’s better to be clear than to be clever. The following quote puts it another way:
“It’s OK to figure out murder mysteries, but you shouldn’t need to figure out code. You should be able to read it.” – Steve McConnell
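One classic example of being too clever is the XOR swap, sketched below next to the plain version. The class and method names are made up for this illustration. The clever version avoids a temporary variable, but it is harder to read and it hides a trap: if both indexes are the same, it sets the element to zero.

```java
class SwapExample {
    // Clever: swaps without a temporary variable by using XOR.
    // Trap: when i == j, the first line computes a[i] ^ a[i], which is 0,
    // so the element is wiped out instead of being left alone.
    static void swapClever(int[] a, int i, int j) {
        a[i] ^= a[j];
        a[j] ^= a[i];
        a[i] ^= a[j];
    }

    // Clear: one extra variable, obvious at a glance, and safe when first == second.
    static void swapClear(int[] values, int first, int second) {
        int temp = values[first];
        values[first] = values[second];
        values[second] = temp;
    }
}
```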
Plan for failure – logging and error handling. What do you do when the program doesn’t work as expected? You need some way of seeing what is going on, usually in the form of tracing or logging. Problems will inevitably occur, so you might as well build in support for tracing and logging from the start (and make it possible to activate and deactivate it while the program is running). It’s a similar story with error handling: put in place a unified way of handling errors in the program, because errors will happen.
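Here is a rough sketch of both ideas using the standard `java.util.logging` package. It is only one possible way to do it, not a description of how any particular system works. `SmsSender`, `setTracing`, `send` and `deliver` are hypothetical names for this example. Tracing can be switched on and off while the program runs, and all delivery failures are caught and logged in one place.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

class SmsSender {
    private static final Logger LOG = Logger.getLogger("SmsSender");

    static {
        // The default console handler only shows INFO and above, so install
        // one that passes everything and let the logger's level decide.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);
        LOG.addHandler(handler);
        LOG.setUseParentHandlers(false); // avoid duplicate output via the root logger
        LOG.setLevel(Level.INFO);        // tracing is off by default
    }

    // Can be called at any time while the program is running.
    static void setTracing(boolean enabled) {
        LOG.setLevel(enabled ? Level.FINE : Level.INFO);
    }

    static boolean isTracing() {
        return LOG.isLoggable(Level.FINE);
    }

    // Single entry point: every failure is caught, logged and reported the same way.
    static boolean send(String recipient, String text) {
        try {
            deliver(recipient, text);
            LOG.info("Message delivered to " + recipient);
            return true;
        } catch (IllegalArgumentException e) {
            LOG.log(Level.WARNING, "Could not send message to '" + recipient + "'", e);
            return false;
        }
    }

    private static void deliver(String recipient, String text) {
        if (recipient == null || recipient.isEmpty()) {
            throw new IllegalArgumentException("recipient is missing");
        }
        if (text == null) {
            throw new IllegalArgumentException("text is missing");
        }
        // Trace output: only produced when tracing has been switched on.
        LOG.fine("Delivering " + text.length() + " characters to " + recipient);
        // A real implementation would hand the message to the network here.
    }
}
```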
Issue tracking. For any real system, it is necessary to have a way to keep track of bug reports – e-mails don’t cut it. At work we use Jira, but there are many products with similar capabilities. Often, the same system is also used to keep track of new features to be implemented.
Becoming a Better Programmer
There is almost no limit to what you can do to improve as a programmer. Here are some basic tips, appropriate if you are taking a software development course at university, but want to progress beyond learning the content of the course.
Program! The best way to learn to program is to actually program. Reading a book or listening to a lecture can make it seem like you have learnt and understood the concepts, but it is not until you actually start writing your own programs that you really learn. This is why (in my opinion) the best programming courses have lots of programming assignments – that’s when you are forced to put the theory into practice.
But you don’t have to stop at just doing the assignments. Solve some other problems with programs as well. If you don’t know what to solve, you can look for good problems at Programming Praxis, Ruby Quiz (the problems don’t all have to be solved in Ruby), or Project Euler (quite mathematical).
Learn a scripting language. Being able to write small scripts to automate tasks on the command line or to filter out information from a log file containing 100,000 lines is quite useful. Make sure to learn how to use regular expressions, and a language like Ruby, Python or Perl. All those languages are also used for “serious” applications, but even just using them for quick shell scripts is worthwhile.
Learn an IDE and a text editor well. For Java development it’s really worthwhile to learn to use an Integrated Development Environment (IDE), such as IntelliJ IDEA or Eclipse, really well (as I have written about before). In addition, it is always useful to know a good text editor, for example Emacs or Vim. In both cases (IDE and editor), learn as many keyboard shortcuts as you can. The goal is to be able to go from thought to program as effectively as possible.
Books. The number one book that every programmer should read is Code Complete by Steve McConnell. It’s always on top of every list of the best programming books, and for good reason (I’ve reviewed it on Amazon). It has a lot to say about how to write code (in a language-neutral way). The next book to read is The Pragmatic Programmer by Andrew Hunt and David Thomas. It contains a lot of tips on how to develop software efficiently.
Hacker News and Proggit. There are always interesting articles related to programming at both Hacker News and reddit/r/programming. Just remember to not spend too much time there – program instead. But for inspiration these sites are great.
I also added a few fun facts about how we develop software at Symsoft. Not all of these are necessarily true everywhere, but I wanted to add in a few bits of trivia to give a better sense of what it can be like to work.
HashMap and ArrayList. The two Java data structures we use the most in our code.
English. We’re in Sweden, but everything is in English – the code, documentation, bug reports etc. No big surprise, I just wanted to make sure everybody knows.
People interactions. Even if your job description is coding, there is quite a bit of interaction with other people. Mostly discussing designs and bugs with colleagues, but also contact with product management and sales, and with customers. There is no such thing as a solitary coder.
Time for bug-fixing. We schedule around 30% of the developers’ time for bug-fixing. This includes the time for investigating reported problems, fixing the bugs and testing the solution. It is worth noting that many reported problems aren’t actual bugs. Sometimes the system is not configured properly, sometimes the problem is in a surrounding system, and sometimes the bug report is actually a request for a new feature.
No UML. We don’t use UML, but there are lots of whiteboards around. We use them all the time when discussing a solution to a problem, and when describing how the system works.
The talk took about 45 minutes to deliver. It was interesting to prepare and fun to give. What are your opinions on, and experiences of, working as a professional software developer? Let me know in the comments.