The Pragmatic Programmer: 20th Anniversary Edition

The Pragmatic Programmer: From Journeyman to Master by Dave Thomas and Andrew Hunt was given to me as a gift after an internship. The book gave me invaluable advice as I started out in my career as a professional software engineer. Re-reading it a decade later, I thought the general advice still held up well, but it made references to technologies such as CORBA that are no longer used and felt dated as a result. The authors agreed and wrote a 20th anniversary edition that was updated for modern developers. A third of the book is brand-new material, covering subjects such as security and concurrency. The rest of the book has been extensively rewritten based on the authors’ experience putting these principles into practice. We discussed the 20th anniversary edition in my book club at work.

The book is meant for those just starting out in the world of professional software engineering. Many of the tips, such as Tip 28: Always Use Version Control, will seem obvious to experienced hands. However, it can also be a guide for senior developers mentoring junior developers, putting actionable advice into words. The book is also valuable to those who lack a formal CS education; it explains things like big-O notation and where to learn more about these subjects. I think that any software engineer will get one or two things out of this book, though it’s most valuable for beginners.

One of the things I appreciate about the book is that the authors talk about applying the principles not only to software engineering but to writing the book as well. For example, the first edition was originally written in troff and later converted to LaTeX; to illustrate Tip 29: Write Code That Writes Code (as it was numbered in the first edition), they wrote a program to convert the troff markup to LaTeX. In the 20th anniversary edition, they talk about their efforts to use parallelism to speed up the book build process and how it led to surprising bugs.

Perhaps the best thing about the book is that the authors summarize their points into short tips highlighted throughout the book. The authors helpfully collect these tips on a card attached to the physical book. This makes it easy to remember the principles espoused in the book and to refer to them later. I think this is a feature that more books should include, especially managerial or technical books.

Chapter 1: A Pragmatic Philosophy

The first chapter is less about coding and more about the general principles a pragmatic programmer follows. Most of all, it’s about taking responsibility for your work. The first tip of the chapter is Tip 3: You Have Agency: if you don’t like something, you can be a catalyst for change. Or you can change organizations if change isn’t happening. The most important tip of the chapter to me is Tip 4: Provide Options, Don’t Make Lame Excuses. In this section, they discuss taking responsibility for the commitments you make and having a contingency plan for things outside your control. If you don’t meet the commitment, provide solutions to fix the problems. Don’t tell your boss, “The cat ate my source code.”

Software rots over time without efforts to fix it. The authors invoke the broken windows theory of policing, the idea that minor problems such as a single broken window give people the psychological safety to commit larger crimes. Regardless of whether the theory actually holds for crime, the metaphor applies to software. This leads to Tip 5: Don’t Live with Broken Windows: If you see a broken window in your software, make an effort to fix it, even if it’s only a minor effort to board it up. This may seem impractical if your project already has a lot of broken windows, but this tip helps you avoid creating such an environment in the first place. In my experience, it works: when we set up a new project at work, we made a commitment to use git commit hooks to enforce coding standards. This made each of us more reluctant to compromise on quality to begin with, and all of the code was a good example to copy from.

A pragmatic programmer is always learning, and learns things outside their specialty; they are a jack of all trades. Even if they are a specialist in their current role, they invest regularly in a broad knowledge portfolio. In addition to software skills, people skills are important as well. The section “Communicate!” shows how to effectively communicate your ideas, such as how to present, what to say, and how to pick the right time. In the words of Tip 11: English is Just Another Programming Language. If you don’t have an answer to an email immediately, respond with an acknowledgment and say that you’ll get back to them later - nobody wants to be talking to a void. Don’t be afraid to reach out for help if you need it; that’s what your colleagues are there for, after all. And don’t neglect documentation! Make it an integral part of the development process, not an afterthought.

Finally, the principles in this book are not iron-clad: you must consider the tradeoffs between different values and make the right decision for your project. Your software does not need to be perfect. When working on software, involve your users in deciding what quality issues are acceptable in return for getting it out faster. After all, if you wait a year to ship the perfect version, their requirements will have changed anyway. As Tip 8 says: Make Quality a Requirements Issue.

Chapter 2: A Pragmatic Approach

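The chapter is built around one design value, which the authors abbreviate as ETC: Easier to Change. They argue that most design principles are special cases of it:

Why is decoupling good? Because by isolating concerns we make each easier to change. ETC.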
Why is the single responsibility principle useful? Because a change in requirements is mirrored by a change in just one module. ETC.
Why is naming important? Because good names make code easier to read, and you have to read it to change it. ETC!

However, the authors also stress that ETC is a value, not a rule. For example, ETC may not be appropriate for writing code that has high performance requirements; making the code complex to achieve the performance requirements is an acceptable tradeoff.

They then turn to another important acronym for implementing ETC in Tip 15: DRY - Don’t Repeat Yourself. DRY makes things easier to change by giving every piece of knowledge one place where it can be changed. Without it, every change has to be repeated everywhere the knowledge is duplicated. Worse, if you forget to make one of those changes, you’ll have contradictory information in your program that could crash it or silently corrupt data.

What kind of duplication is there?

Code Duplication: For example, having a case statement duplicated across several different places rather than in a single function.
Documentation Duplication: Some people believe that every function needs a comment. If you do this, you will also have to update the comments each time the function changes. Ask what your comment adds to the code before writing it!
Data Duplication: Caching an expensive result and forgetting to update the cache when the source data changes.
Representational Duplication: When you work with an external API, the client and server must adhere to the same format in order to work. If one changes, the other side will break. Having a common specification, such as OpenAPI, allows you to integrate more reliably with the service.
Interdeveloper duplication: When two developers do the same work. This can be mitigated by Tip 16: Make It Easy to Reuse. If it’s hard to use your code, other developers will be tempted to duplicate it.
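To make the code-duplication case concrete, here is a small Python sketch of my own (not from the book). The shipping rates and function names are made up for illustration; the point is only that the branching logic lives in one function that every caller goes through.

```python
# Hypothetical example: the region-to-rate mapping lives in exactly one place.
SHIPPING_RATES = {"domestic": 5.00, "international": 20.00}


def shipping_cost(region: str) -> float:
    """Single source of truth for shipping rates."""
    try:
        return SHIPPING_RATES[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


def checkout_total(subtotal: float, region: str) -> float:
    # Reuses shipping_cost instead of repeating the region branching here.
    return subtotal + shipping_cost(region)


def quote_line(region: str) -> str:
    # A second caller; a new region only needs a change to SHIPPING_RATES.
    return f"Shipping: ${shipping_cost(region):.2f}"
```

If the rates were instead repeated as a case statement in both callers, adding a region would mean finding and editing every copy.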

A closely related principle to DRY is Orthogonality. Two components of a software system are orthogonal if changes in one do not affect the other. Systems should be designed as a set of cooperating independent modules, each of which has a single, well-defined purpose. Modules communicate between themselves using well-defined interfaces and don’t rely on shared global data or the implementation details of another module. Unless you change a component’s external interfaces, it should not cause changes in the rest of the system. Orthogonal systems are easier to test, because more testing can be done at the module level in unit tests rather than end-to-end integration tests that test the whole system.

Often, when starting a software project, there are a lot of unknowns. The user has an idea of what they want, but there’s some ambiguity in the requirements. You don’t know if the libraries and frameworks you pick will work nicely together. The solution here is Tip 20: Use Tracer Bullets to Find the Target. In a machine gun, tracer bullets are bullets that glow in the air, enabling the user to see if they’re hitting the target at night. Tracer Bullet Development provides that kind of immediate feedback. Look for a single feature that can be built quickly using the architectural approach you’ve chosen, and put that in front of the users. You may miss; users may say that’s not quite what they wanted. But that’s the point of tracer code: it allows you to adjust your aim with a skeleton project that’s easier to change than a final application. Users will be delighted to see something working early, and you’ll have an integration platform to build the rest of the application on.

Tracer code is different from prototypes. To the authors, prototypes are disposable code used to learn about a problem domain, never meant to be used in production. Prototypes don’t even have to be code. A UI can be mocked up in an interface builder, or an architecture mapped out with post-it notes. This is the point of Tip 21: Prototype to Learn. In contrast, tracer bullet code is meant to be part of the final application.

The final tip of this chapter I’ll bring up is Tip 18: There Are No Final Decisions. Decisions should be reversible; if you rely on MySQL today, you may find yourself needing to switch to Postgres six months from now. If you’ve properly abstracted the database logic, making this change should be easy. Marketing may decide that your web app should be a mobile app in the future; if your architecture is built well, this extra demand should not be a burden. This is one tip I disagree with: I think it can easily be taken too far. If you provide too much reversibility, you’ll end up with over-abstracted code with configuration options that are never used. I think it’s more reasonable to think about what decisions can realistically change and make those flexible; if you spend all your time trying to cover for every possibility, you’ll never get around to actually coding the required functionality.

Chapter 3: The Basic Tools

This chapter focuses on how to make the most out of your tools, what tools to invest in, and how to approach debugging. The first bit of advice: Tip 25: Keep Knowledge in Plain Text. By plain text, they mean keeping knowledge such as configuration or data in a format that is both human-readable and computer-readable. Plain text insures you against obsolescence; you can always write something to parse it later, while reverse-engineering a binary format is significantly harder. In addition, almost any other tool in existence can process plain text in some way, so you’ll have an extensive suite of other tools to use. As an extension of the power of plain text, they also suggest you master a command shell such as bash. Shells provide a family of tools that are composable with each other, and can be combined as much as your imagination allows. A GUI, in contrast, limits you to the actions the programmers of the GUI thought of in advance. Finally, you should learn a text processing language such as awk or Perl to get the most out of text - the authors used Perl (first edition) and Ruby (20th anniversary edition) to automatically highlight the source code in the book, for example.

The next topic the authors turn to is debugging. Debugging takes up a large share of a software engineer’s day, so it’s essential you get good at it. Defects show up in a variety of ways, from misunderstood requirements to coding errors. Some cultures try to find someone to blame for a defect; the authors think you should avoid that with Tip 29: Fix the Problem, Not the Blame.

They give the following tips on debugging your code:

Tip 30: Don’t Panic: It’s easy to panic when you’re on a tight deadline or a client is angry at you. However, take a deep breath and think about the problem at hand. The cause of the bug may be several layers removed from what you’re seeing, so try to focus on root causes rather than fixing the symptoms.
The Impossible has Happened: If you think to yourself “that’s not possible” - you’re wrong. It’s clearly possible, and it’s staring you in the face.
Reproduce It!: Find a minimal case that triggers the bug, whether that be a certain input data set, or pattern of actions. Once you can reliably cause the bug, you can trace it through your code.
Tip 32: Read the Damn Error Message: Enough said.
The Operating System is Fine: It’s possible that you found a bug in the Linux kernel or Postgres, but these are extensively battle-tested pieces of software. It’s much more likely that the problem is in your code.
The Binary Chop: Cut things in half until you find the problem. This massively decreases the search space you have to work in. If you have a long stack trace and are trying to find which function mangled the value, log the value halfway through. If the value is fine, log the value halfway through the next half, or if it’s mangled, halfway through the previous half, and so on. If a release introduces a regression, find a version that’s fine, and binary chop through the commits to find the commit that introduced the bug.
Use a Debugger and/or Logging Statements: Debuggers allow you to step through the code and inspect the values of variables, finding the exact point where things go wrong. In environments where a debugger is not available, logging statements can show you how a variable changes in time, or just how far the program got before crashing.
Rubber Ducking: Explain the bug to a colleague, or talk out loud to a rubber duck. You don’t have to get a response; by verbalizing your assumptions you may gain sudden insight into the problem.
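The binary chop over commits can be sketched in a few lines of Python. This is my own illustration, not code from the book: the commit ids and the is_bad check are hypothetical, and it assumes the first version is known good, the last is known bad, and every version after the bug was introduced is also bad.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad version.

    Assumes versions[0] is good, versions[-1] is bad, and that once a
    version is bad every later one is bad too.
    """
    lo, hi = 0, len(versions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # the bug was introduced at or before mid
        else:
            lo = mid  # the bug was introduced after mid
    return versions[hi]


# Hypothetical history of 100 commits, c1..c100, where c73 introduced the bug.
commits = [f"c{i}" for i in range(1, 101)]
checked = []


def is_bad(commit: str) -> bool:
    checked.append(commit)  # record each check to count how many were needed
    return int(commit[1:]) >= 73


culprit = first_bad(commits, is_bad)
```

With 100 commits, the search needs at most seven checks rather than up to ninety-eight. In practice, git’s bisect command automates the same search over real history.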

Once you’ve solved the bug, however, there’s still one more step: you should write a test to catch that bug in the future.

Chapter 4: Pragmatic Paranoia

Tip 36: You Can’t Write Perfect Software starts off the chapter. While we’d like to write perfect software, there will always be bugs, poor design decisions, and missing documentation. The theme of this chapter is how to design with this fact in mind.

The first idea they propose is Design By Contract. Similar to a legal contract, it spells out a function or module’s rights and responsibilities. A contract has three parts. Preconditions are things that must be true when the routine is called, such as what qualifies as valid input. Postconditions are what will be true when it is done, such as a sort routine returning a sorted array. Finally, Invariants are things that are always true from the caller’s perspective - they may be violated while the routine is running, but will hold at the beginning and the end of the call. For example, in a sort routine, the invariant is that the list will contain the same number of items when the routine finishes as when it started. If the contract is violated, the contract will specify what to do, such as crash or throw an exception.

Some languages, such as Clojure, have built-in semantics for design by contract, with explicit pre- and post-conditions. However, if your language doesn’t support contracts, you can implement them with Tip 39: Use Assertions to Prevent the Impossible. You can assert that the conditions of your contract are true, and handle the cases where the contract is violated. If you don’t know what to do when a contract is violated, the authors recommend Tip 38: Crash Early. It’s better that you crash rather than write incorrect data to the database. After all, dead programs tell no lies. Of course, crashing immediately may not be appropriate - if you have resources open, make sure to close them before exiting.
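As a rough illustration of emulating a contract with assertions, here is a Python sketch of the sort example. It is my own code, not the book’s, and the function name sorted_copy is made up. A violated assertion raises AssertionError, which is one way of crashing early.

```python
def sorted_copy(items: list[int]) -> list[int]:
    """Return a sorted copy of items, checking a simple contract."""
    # Precondition: the caller must pass only ints.
    assert all(isinstance(x, int) for x in items), "items must all be ints"

    result = sorted(items)

    # Postcondition: the output is in non-decreasing order.
    assert all(a <= b for a, b in zip(result, result[1:])), "result not sorted"
    # Invariant from the caller's view: same number of items in and out.
    assert len(result) == len(items), "item count changed"
    return result
```

One caveat with this approach: Python skips assert statements when run with the -O flag, so checks that must always run should raise exceptions explicitly instead.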

The final paranoid tip is Tip 43: Avoid Fortune-Telling. Pragmatic programmers only make decisions that they can get immediate feedback on. The more predictions you make about the future, the more likely you’ll get some of the predictions wrong and make the wrong decision based on them.

You might find yourself slipping into fortune telling when you have to:

Estimate completion dates months in the future

Plan a design for future maintenance or extendability

Guess users’ future needs

Guess future tech availability

Chapter 5: Bend, or Break

In a previous chapter, the authors wrote about making decisions reversible and easier to change. This chapter tells you how to implement that in your code. The key here is to make your code flexible rather than rigid - good code bends to circumstances rather than breaks. Part of this is decoupling code. Two pieces of code are considered coupled when they share something in common. This may be something as simple as a shared global variable, or something more complex like an inheritance chain.

The authors argue against what they term Train Wrecks - long chains of method calls, such as this example they give:

public void applyDiscount(customer, order_id, discount) { 
	totals = customer
			  .orders 
			  .find(order_id) 
			  .getTotals();

	totals.grandTotal = totals.grandTotal - discount;
	totals.discount = discount; 
}

This code is traversing many different levels of abstraction - you have to know that a customer object exposes orders, that orders have a find method, and that the order which find returns has a getTotals method. If any of these levels of abstraction is changed, your code might break. And requirements may change: what if the business decides to implement a maximum discount amount of 40%? Certainly, this could be applied in the applyDiscount routine, but anything could modify the grandTotal and discount fields - this rule could be violated if other modules modifying the totals object don’t get the memo.

The authors suggest refactoring the code so that the caller never touches the orders collection or the totals object: the customer exposes a findOrder method, and the order exposes an applyDiscount method that implements the 40% rule:

public void applyDiscount(customer, order_id, discount) { 
	customer
	.findOrder(order_id)
	.applyDiscount(discount); 
}
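To show where the 40% rule ends up after this refactoring, here is a minimal Python sketch of my own (the book’s snippet above only shows the caller). The class layout is hypothetical, and I am assuming the cap means the discount may not exceed 40% of the order’s grand total.

```python
from dataclasses import dataclass, field

MAX_DISCOUNT_RATE = 0.40  # assumption: the cap is 40% of the grand total


@dataclass
class Order:
    grand_total: float
    discount: float = 0.0

    def apply_discount(self, amount: float) -> None:
        # The cap is enforced here, so no caller can bypass it.
        if amount > self.grand_total * MAX_DISCOUNT_RATE:
            raise ValueError("discount exceeds 40% of the order total")
        self.grand_total -= amount
        self.discount = amount


@dataclass
class Customer:
    _orders: dict[int, Order] = field(default_factory=dict)

    def find_order(self, order_id: int) -> Order:
        # Callers ask the customer for an order; the storage stays private.
        return self._orders[order_id]


def apply_discount(customer: Customer, order_id: int, discount: float) -> None:
    customer.find_order(order_id).apply_discount(discount)
```

Because the totals are only changed inside Order.apply_discount, the business rule has one home, and the top-level function no longer knows how orders or totals are stored.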

The authors suggest using no more than one . when you access something that is likely to change, such as anything in your application or a fast-moving external API. Breaking the chain up with intermediate variables doesn’t get around the rule, as in this code:

# This is cheating!
orders = customer.orders
order = orders.find(order_id)
totals = order.getTotals

However, the rule does not apply to things that are unlikely to change, such as core language APIs. So this code is OK:

people
.sort_by {|person| person.age }
.first(10)
.map {|person| person.name }

Another source of coupling is globally accessible data. Global data makes it hard to reason about the state of a program, since any other module might be able to change it. Global data includes design patterns such as singletons, and external resources such as databases. Given how extensive global resources are, how can one avoid them? If global data is unavoidable, the key is to manage it through a well-defined API that you control, rather than allowing anything to read and write it directly. In the words of Tip 48: If It’s Important Enough to Be Global, Wrap It in an API.
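A minimal Python sketch of what wrapping global data in an API might look like follows. This is my own illustration rather than the authors’ code, and the Settings class and its methods are hypothetical names.

```python
import threading


class Settings:
    """Hypothetical wrapper around process-wide settings."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: dict[str, str] = {}

    def get(self, key: str, default: str = "") -> str:
        with self._lock:
            return self._values.get(key, default)

    def set(self, key: str, value: str) -> None:
        # Validation lives in one place instead of at every write site.
        if not key:
            raise ValueError("key must be non-empty")
        with self._lock:
            self._values[key] = value


# The one global object; all reads and writes go through get and set.
settings = Settings()
```

The data is still global, but every access now passes through code you control, so you can add validation, locking, or logging without touching the callers.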

Poor use of inheritance is a third source of coupling. Inheritance is used for two reasons: code reuse and type modeling. The authors argue that inheritance doesn’t work for code reuse: not only is the code of a child class coupled to every ancestor of the class, so is any code that uses the class. Things may unexpectedly break when an ancestor changes an API, even if you are only using a subclass.

Nor does inheritance work for modeling types. Class hierarchies quickly become tangled, wall-covering monstrosities. Another problem is multiple inheritance. A Car may be a type of Vehicle, but it may also be an Asset or InsuredItem. Multiple inheritance is required to model this, and many OO languages don’t support multiple inheritance. Instead of paying the inheritance tax, the authors suggest using:

Interfaces/Protocols
Delegation
Mixins/Traits

Interfaces or Protocols are types that contain no code; instead, they declare behaviors. A class that implements an interface promises to define those behaviors. For example, a Car might implement Drivable, which has methods such as accelerate and brake. Interfaces can be used as types, and any class that implements the interface will be compatible with that type. This is a much easier way to provide polymorphism than inheritance.

Another alternative to inheritance is delegation. If you want to include behavior from class Foo, add a member of type Foo to your class rather than inherit from Foo. You can then use Foo’s API wrapped in code you control. Delegation is a has-a relationship rather than an is-a relationship.

The problem with interfaces and delegation is that they require writing lots of boilerplate code. For example, it’s likely that most of your classes that implement Drivable will have the same logic for brake, but each class will have to write its own implementation of brake. This leads to repeated code across your codebase, violating the DRY principle. To resolve this, the authors turn to Mixins - sets of functions that can be “mixed into” a class. This allows you to add common functionality without using inheritance. I wonder how mixins are implemented in a language like Java, which doesn’t have an obvious version of that feature. It’s also not clear to me how mixins are different from inheritance; aren’t they just a form of multiple inheritance?
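Here is how the three alternatives might look together in Python. This is a sketch of my own, not the book’s code: Drivable and Car come from the example above, while Engine, BrakeMixin, and emergency_stop are names I made up. In Python a mixin is mechanically a base class, so this does not settle my question about how mixins differ from multiple inheritance; the difference here is one of convention (the mixin holds no state of its own and is not meant to be instantiated).

```python
from typing import Protocol


class Drivable(Protocol):
    """Interface: declares behavior, contains no implementation."""

    def accelerate(self, amount: float) -> None: ...
    def brake(self) -> None: ...


class Engine:
    def __init__(self) -> None:
        self.rpm = 0.0

    def rev(self, amount: float) -> None:
        self.rpm += amount


class BrakeMixin:
    """Mixin: shared brake logic; expects the host class to define self.speed."""

    def brake(self) -> None:
        self.speed = 0.0


class Car(BrakeMixin):
    def __init__(self) -> None:
        self.speed = 0.0
        self._engine = Engine()  # delegation: a Car has an Engine

    def accelerate(self, amount: float) -> None:
        self._engine.rev(amount * 100)  # forward the work to the Engine
        self.speed += amount


def emergency_stop(vehicle: Drivable) -> None:
    # Accepts anything with accelerate and brake, whatever its ancestry.
    vehicle.brake()
```

Car never names Drivable, yet a type checker will accept it wherever a Drivable is expected because it has the right methods.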

Tip 55: Parameterize Your App Using External Configuration: Code may have values that change while the application is running, such as credentials for third-party services. Rather than directly including the values in your code, you should externalize them and put them in a configuration bucket. Keeping credentials in source code is a security risk - hackers scan public git repositories for common security credentials, such as AWS keys. It’s common to store configuration in a flat file or database tables, and read it when the application initializes. However, in our world of highly-available applications that’s not as appropriate. Instead the authors propose configuration-as-a-service, where configuration is stored behind a service API. This allows multiple applications to share configuration information, use access control to control who can see and edit configuration, and provide a UI to easily edit config information. Using the configuration service, applications can subscribe to a configuration item and get notifications when it changes. This allows applications to update config data on their side without restarting.
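The simplest form of external configuration, reading values once at startup, could look like the Python sketch below. It is my own example and does not show the configuration service the authors describe; the environment variable names PAYMENTS_API_KEY and PAYMENTS_TIMEOUT are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    api_key: str
    timeout_seconds: float


def load_config(env=os.environ) -> AppConfig:
    """Build the config from a mapping; defaults to the process environment."""
    # PAYMENTS_API_KEY and PAYMENTS_TIMEOUT are hypothetical variable names.
    try:
        api_key = env["PAYMENTS_API_KEY"]
    except KeyError:
        raise RuntimeError("PAYMENTS_API_KEY is not set") from None
    timeout = float(env.get("PAYMENTS_TIMEOUT", "5"))
    return AppConfig(api_key=api_key, timeout_seconds=timeout)
```

Passing the mapping as a parameter keeps the credential out of the source code and makes the loader easy to test with a plain dictionary, for example load_config({"PAYMENTS_API_KEY": "test-key"}).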

Chapter 6: Concurrency

This chapter deals with Parallelism, where two pieces of code run at the same time, and Concurrency, where things act as if they run at the same time. In the real world, things are asynchronous - the user is supplying input, network resources are called, and the screen is being redrawn all at the same time. Applications that run everything serially feel sluggish.

In Tip 56: Analyze Workflow to Improve Concurrency the authors advocate that you break temporal coupling where possible. Temporal Coupling is when your code depends on event A happening before event B. You should look at your workflow to see what can be executed concurrently. Look for activities that take a lot of time that would allow for something else to be done in the meantime. If your application makes multiple independent API calls to a remote service, execute them on separate threads rather than serially, then gather up the results of each call. If your workflow allows a way to split the work into multiple independent units, take advantage of those multiple cores and execute them in parallel.
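As an illustration of running independent calls on separate threads and gathering the results, here is a short Python sketch. It is my own example: fetch is a stand-in for a slow remote call, and the service names are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name: str) -> str:
    """Stand-in for a slow, independent remote call (hypothetical)."""
    time.sleep(0.2)
    return f"{name}: ok"


def fetch_all(names: list[str]) -> list[str]:
    # Each call runs on its own worker thread; map() yields results in
    # the same order as the inputs, whichever call finishes first.
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        return list(pool.map(fetch, names))


results = fetch_all(["inventory", "pricing", "shipping"])
```

Run serially, the three calls would take about 0.6 seconds; here the waits overlap, so the total is close to the time of a single call.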

Of course, parallelism has its pitfalls as well. For example, imagine reading an integer, incrementing it, and writing it back. If two processes read that integer at the same time, they will each increment the value to n+1, when you want it to be n+2. The update needs to be atomic; each process needs to do this sequentially without the other process interfering. This can be done through synchronized methods, semaphores, or other forms of resource locking. However, these have their own dangers as well, such as deadlocking, where two processes each get a lock on one of two needed resources, but not the other. Each waits forever for the other to release its lock. The authors think you should avoid shared state wherever possible rather than try to handle it yourself; Tip 57: Shared State Is Incorrect State.
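The increment example can be made atomic with a lock, as in this Python sketch of my own (the Counter class and the thread counts are illustrative choices, not from the book).

```python
import threading


class Counter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Holding the lock makes read, increment, and write one atomic step.
        with self._lock:
            self.value += 1


counter = Counter()


def worker() -> None:
    for _ in range(10_000):
        counter.increment()


# Four threads each increment the shared counter 10,000 times.
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, counter.value ends at 40,000 on every run. This is the kind of bookkeeping the authors would rather you avoid by not sharing the state in the first place.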

The authors ran into this issue when writing the 20th anniversary edition: they updated the build process for the book to utilize parallelism. However, the build would randomly fail. The authors tracked this down to changing the directory temporarily. In the original, a subtask would change directory, then go back to the original directory. However, this no longer worked when new threads started, expecting to be in the root directory. Depending on the timing, this could break the build. This prompted them to write Tip 58: Random Failures Are Often Concurrency Issues.

Chapter 7: While You Are Coding

This chapter is more of a grab-bag. It covers subjects such as psychology, big-O notation, refactoring, security, and testing.

In Tip 61: Listen to Your Inner Lizard the authors talk about listening to your instincts (your lizard-brain). If you find yourself having a hard time writing code, your brain is trying to tell you something. Perhaps the structure or design is wrong, or you don’t fully understand the requirements. If you find yourself in this situation, take a step back and think about what you are doing. Maybe go for a walk, or sleep on it. You might find that the solution is staring you in the face when you come back.

Perhaps you need to refactor the code instead of writing more. Refactoring is a continuous process, espoused in Tip 65: Refactor Early, Refactor Often. If anything strikes you as wrong in your code, such as DRY violations, outdated knowledge or non-orthogonal design, don’t hesitate to fix it. Before you start refactoring, make sure you have a good suite of unit tests, and run them frequently as you go so you find out quickly if a change breaks anything.

Speaking of tests, the authors start with a bold assertion: Tip 66: Testing Is Not About Finding Bugs. Instead, tests function as the First User of Your Code - they are a source of immediate feedback, and they force you to think about what counts as a correct solution. In addition, tightly coupled code tends to be hard to test, so writing tests helps you make good design decisions. The authors emphatically do not think you should adopt full-on Test Driven Development - it’s too easy to become a slave to writing tests. They cite an example of a TDD advocate who started a sudoku solver using TDD and spent so much time writing the tests that they failed to write the solver itself!

In a sidebar, Dave Thomas explains that he stopped writing tests for a few months, and said “not a lot” happened. The quality didn’t drop, nor did he introduce bugs into the code. His code was still testable, it just wasn’t tested.

Andy says I shouldn’t include this sidebar. He worries it will tempt inexperienced developers not to test. Here’s my compromise: Should you write tests? Yes. But after you’ve been doing it for 30 years, feel free to experiment a little to see where the benefit lies for you.

Chapter 8: Before the Project

This chapter focuses on how to start your project on the right foot. The first subject the authors tackle is requirements gathering: The Requirements Pit. While we talk about gathering requirements as if they are on the ground, waiting to be picked up, requirements are non-obvious because of Tip 75: No One Knows Exactly What They Want. They think of requirements gathering as a kind of therapy, where you take an initial requirement and ask questions about the details to nail down exactly what they need. The authors show an example of a simple requirement: “Shipping should be free on all orders costing $50 or more”. Does that include the shipping cost itself? Tax? If you’re selling ebooks as well, should they be included? The job of the programmer is Tip 76: Programmers Help People Understand What They Want. You should find any edge cases the client may not have considered and make sure they’re documented. This doesn’t mean creating long specifications the client won’t read. Instead, the authors think requirements should be able to fit on an index card. This helps prevent feature creep; if the client understands how adding one more index card will impact the schedule, they’ll consider the tradeoffs and prioritize the requirements they need the most.

You are given constraints in your requirements as well. Your job as a software engineer is to evaluate if those constraints are things you actually have to live with or if you can relax them. In the words of Tip 81: Don’t Think Outside the Box—Find the Box, the constraints are the edges of the box. What you initially thought of as a constraint may actually be an assumption you held.

Another tip the authors advocate for is Tip 78: Work with a User to Think Like a User. If you’re building an inventory system, work in the warehouse for a few days to get an idea of their processes and how your system will be used. If you don’t understand how it will be used, you could create something that meets all of the requirements but is totally useless. They cite an example of a digital sound mixing board that could do anything to sound that was possible, yet nobody wanted to use it. Rather than take advantage of recording engineers’ experience with tactile sliders and knobs, they built an interface that was unfamiliar to them. Each feature was buried behind menus and given unintuitive names. It did what was required, but didn’t do it how it was required.

The authors also consider in this chapter what it means to be Agile. Many teams and companies are eager for an off-the-shelf solution: call it Agile-in-a-Box. But no process can make you Agile; “Use this process and you’ll be agile” ignores a key part of the Agile manifesto: Individuals and interactions over processes and tools. To the authors, Agile can be boiled down to the following:

Work out where you are.

Make the smallest meaningful step towards where you want to be.

Evaluate where you end up, and fix anything you broke.

Do this for every level of what you do, from process to code, and you’ll have adopted the Agile spirit.

Chapter 9: Pragmatic Projects

Can the lessons of The Pragmatic Programmer be applied to teams too? The authors say yes. This chapter focuses on how to apply the lessons of the previous chapters to the team level. Many of the lessons are the same as those mentioned previously, so I won’t go into them again.

The authors advise Tip 87: Do What Works, Not What’s Fashionable. Just because Google or Facebook adopts process X doesn’t mean it’s right for your team. How do you know if something works? Try it. Pilot an idea with a small team, and see what works about it and what doesn’t. The goal isn’t to “do Scrum” or “be Agile”, but to deliver working software continuously. When you adopt a new idea, you should do it with improving continuous deployment of software in mind. If you’re measuring your deployments in months, try to get it down to weeks instead. Once you get it down to weeks, try to deliver in one-week iterations.

Related to continuously delivering software is Tip 96: Delight Users, Don’t Just Deliver Code. Delivering working software in a timely manner is not enough to delight your users; that is merely meeting expectations. The authors suggest you ask your users a question:

How will you know that we’ve all been successful a month (or a year, or whatever) after this project is done?

The answer may not be related to the requirements, and may surprise you. For example, a recommendations engine might be valued on driving customer retention. But once you know what the secret to success is, you should aim not just to hit the goal but to exceed it.

Finally, take pride in your work. The last tip of the chapter is Tip 97: Sign Your Work.

Final Thoughts

I was only able to cover a portion of this remarkable book in this review. I highly recommend this book to any software engineer, especially to those just starting out in the field. It makes a great graduation gift to someone just finishing their CS degree.
