Write Professional Unit Tests in Python

Testing is the foundation of solid software development. There are many types of testing, but the most important type is unit testing. Unit tests give you confidence that you can use well-tested pieces as primitives and rely on them when you compose them into a program. They extend your inventory of trusted code beyond your language's builtins and standard library. In addition, Python provides great support for writing unit tests.

Running Example

Before diving into all the principles, heuristics and guidelines, let's see a representative unit test in action. The SelfDrivingCar class is a partial implementation of the driving logic of a self-driving car. It mostly deals with controlling the speed of the car. It is aware of objects in front of it, the speed limit, and whether it has arrived at its destination.

Here is a unit test for the stop() method to whet your appetite. I'll get into the details later. 
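The original listing is not reproduced here, so the following is only a minimal sketch of what such a test could look like. The SelfDrivingCar below is a hypothetical stand-in with just a speed attribute and a stop() method; the real class does considerably more.

```python
import unittest


class SelfDrivingCar:
    """Hypothetical stand-in for the article's class (assumed attributes)."""

    def __init__(self):
        self.speed = 0

    def stop(self):
        # Stopping simply forces the speed to zero.
        self.speed = 0


class SelfDrivingCarTest(unittest.TestCase):
    def setUp(self):
        # A fresh car for every test method.
        self.car = SelfDrivingCar()

    def test_stop(self):
        self.car.speed = 5
        self.car.stop()
        self.assertEqual(0, self.car.speed)
```

Saved in a file whose name starts with test, this can be run with python -m unittest.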

Unit Testing Guidelines

Commit

Writing good unit tests is hard work. Writing unit tests takes time. When you make changes to your code, you will usually need to change your tests as well. Sometimes you'll have bugs in your test code. That means you have to be really committed. The benefits are enormous, even for small projects, but they are not free.

Be Disciplined

You must be disciplined. Be consistent. Make sure the tests always pass. Don't let the tests be broken because you "know" the code is OK.

Automate

To help you be disciplined, you should automate your unit tests. The tests should run automatically at significant points like pre-commit or pre-deployment. Ideally, your source control management system rejects code that didn't pass all its tests.

Untested Code Is Broken by Definition

If you didn't test it, you can't say it works. This means you should consider it broken. If it's critical code, don't deploy it to production.

Background

What Is a Unit?

A unit for the purpose of unit testing is a file/module containing a set of related functions or a class. If you have a file with multiple classes, you should write a unit test for each class.

To TDD or Not to TDD

Test-driven development is a practice where you write the tests before you write the code. There are several benefits to this approach, but I recommend avoiding it if you have the discipline to write proper tests later. 

The reason is that I design with code. I write code, look at it, rewrite it, look at it again and rewrite it again very quickly. Writing tests first limits me and slows me down. 

Once I'm done with the initial design, I'll write the tests immediately, before integrating with the rest of the system. That said, TDD is a great way to introduce yourself to unit tests, and it ensures all your code will have tests.

The Unittest Module

The unittest module comes with Python's standard library. It provides a class called TestCase, which you can derive your class from. Then you can override a setUp() method to prepare a test fixture before each test and/or a setUpClass() class method to prepare a test fixture once for all the tests (not reset between individual tests). There are corresponding tearDown() and tearDownClass() methods you can override as well.

Here are the relevant parts from our SelfDrivingCarTest class. I use only the setUp() method. I create a fresh SelfDrivingCar instance and store it in self.car so it's available to every test.
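As a rough illustration of the fixture part only, here is a sketch that assumes a hypothetical stand-in SelfDrivingCar with a single speed attribute. The two test methods are there just to show that setUp() hands each test its own fresh instance.

```python
import unittest


class SelfDrivingCar:
    """Hypothetical stand-in; only the speed attribute is assumed."""

    def __init__(self):
        self.speed = 0


class SelfDrivingCarTest(unittest.TestCase):
    def setUp(self):
        # Called before every test method, so state never leaks between tests.
        self.car = SelfDrivingCar()

    def test_mutating_the_fixture(self):
        self.car.speed = 30
        self.assertEqual(30, self.car.speed)

    def test_new_car_is_stationary(self):
        # Passes regardless of test order, because setUp() rebuilt the car.
        self.assertEqual(0, self.car.speed)
```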

The next step is to write specific test methods to verify that the code under test (the SelfDrivingCar class in this case) is doing what it's supposed to do. The structure of a test method is pretty standard:

  • Prepare the environment (optional).
  • Prepare expected result.
  • Call the code under test.
  • Assert that the actual result matches the expected result.

Note that the result doesn't have to be the output of a method. It can be a state change of a class, a side effect like adding a new row in a database, writing a file or sending an email.

For example, the stop() method of the SelfDrivingCar class doesn't return anything, but it changes the internal state by setting the speed to 0. The assertEqual() method provided by the TestCase base class is used here to verify that calling stop() worked as expected.

There are actually two tests here. The first makes sure that if the car's speed is 5 and stop() is called, the speed becomes 0. The second ensures that nothing goes wrong if stop() is called again when the car is already stopped.
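To make the four-step structure concrete, here is a sketch of test_stop() annotated step by step. The SelfDrivingCar class is again a hypothetical stand-in, and the expected_speed variable is introduced purely for illustration.

```python
import unittest


class SelfDrivingCar:
    """Hypothetical stand-in with just enough behavior for the test."""

    def __init__(self):
        self.speed = 0

    def stop(self):
        self.speed = 0


class SelfDrivingCarTest(unittest.TestCase):
    def setUp(self):
        self.car = SelfDrivingCar()

    def test_stop(self):
        # 1. Prepare the environment: the car is moving.
        self.car.speed = 5
        # 2. Prepare the expected result.
        expected_speed = 0
        # 3. Call the code under test (it returns nothing).
        self.car.stop()
        # 4. Assert on the state change rather than on a return value.
        self.assertEqual(expected_speed, self.car.speed)

        # Second check: stopping an already stopped car must be harmless.
        self.car.stop()
        self.assertEqual(expected_speed, self.car.speed)
```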

Later, I'll introduce several more tests for additional functionality.

The Doctest Module

The doctest module is pretty interesting. It lets you put interactive code samples in your docstrings and verify the results, including raised exceptions.

I don't use or recommend doctest for large-scale systems. Proper unit testing takes a lot of work. The test code is typically much larger than the code under test. Docstrings are just not the right medium for writing comprehensive tests. Doctests are cool, though. Here is what a factorial function with doctests looks like:
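The sketch below is one possible version, not necessarily the original listing. The error message and the choice of examples are assumptions made for illustration.

```python
def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0
    """
    if n < 0:
        raise ValueError("n must be >= 0")
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result
```

If this lives in a module, python -m doctest followed by the file name runs the examples and stays silent when they all pass.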

As you can see, the docstring is much bigger than the function code. It doesn't promote readability.

Running Tests

OK. You wrote your unit tests. For a large system, you'll have tens/hundreds/thousands of modules and classes across possibly multiple directories. How do you run all these tests?

The unittest module provides various facilities to group tests and run them programmatically. Check out Loading and Running Tests. But the easiest way is test discovery, which was added in Python 2.7 (and 3.2). Before 2.7, you could use nose to discover and run tests. Nose has a few other advantages, like running test functions without having to create a class for your test cases. But for the purpose of this article, let's stick with unittest.
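For the programmatic route, a minimal sketch could look like the following. ArithmeticTest and run_test_case() are hypothetical names used only for this example; TestLoader, TextTestRunner and the result object are standard unittest facilities.

```python
import io
import unittest


class ArithmeticTest(unittest.TestCase):
    """Trivial placeholder tests, just to have something to load."""

    def test_addition(self):
        self.assertEqual(2, 1 + 1)

    def test_multiplication(self):
        self.assertEqual(6, 2 * 3)


def run_test_case(test_case_class):
    """Load every test method of one TestCase class and run it."""
    suite = unittest.TestLoader().loadTestsFromTestCase(test_case_class)
    stream = io.StringIO()  # capture the report instead of printing it
    runner = unittest.TextTestRunner(stream=stream, verbosity=2)
    result = runner.run(suite)
    return result, stream.getvalue()
```

Calling run_test_case(ArithmeticTest) returns the result object (with attributes such as testsRun and failures) together with the text report.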

To discover and run your unittest-based tests, simply type on the command-line:

python -m unittest discover

unittest will scan the current directory and its sub-directories for files matching test*.py, run any tests it finds, and print a report along with the total runtime. If you want to see which tests it is running, add the -v flag:

python -m unittest discover -v

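There are several flags that control the operation:

  • -v, --verbose: verbose output.
  • -s, --start-directory: the directory to start discovery from (defaults to the current directory).
  • -p, --pattern: the pattern used to match test files (defaults to test*.py).
  • -t, --top-level-directory: the top-level directory of the project (defaults to the start directory).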

Test Coverage

Test coverage is an often neglected field. Coverage means how much of your code is actually tested by your tests. For example, if you have a function with an if-else statement and you test only the if branch, then you don't know whether the else branch works or not. In the following code example, the function add() checks the type of its arguments. If both are integers, it just adds them. 

If both are strings, it tries to convert them to integers and adds them. Otherwise it raises an exception. The test_add() function tests the add() function with arguments that are both integers and with arguments that are floats and verifies the correct behavior in each case. But the test coverage is incomplete. The case of string arguments wasn't tested. As a result, the test passes successfully, but the typo in the branch where the arguments are both strings wasn't discovered (see the 'intg' there?).
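The listing itself is missing here, so the following is a reconstruction from the description above, not the original code. The exception type and the specific test values are assumptions.

```python
import unittest


def add(a, b):
    """Add two integers, or two strings that represent integers."""
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    elif isinstance(a, str) and isinstance(b, str):
        # Deliberate typo: 'intg' is undefined, so this branch is broken.
        return int(a) + intg(b)
    else:
        raise Exception("Invalid arguments")


class AddTest(unittest.TestCase):
    def test_add(self):
        # Integer branch.
        self.assertEqual(5, add(2, 3))
        self.assertEqual(15, add(-6, 21))
        # Float arguments fall through to the else branch.
        self.assertRaises(Exception, add, 4.0, 5.0)
        # Nothing exercises the string branch, so the typo goes unnoticed.
```

Calling add("1", "2") directly would raise a NameError, which is exactly the kind of bug that complete branch coverage would have caught.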

Here is the output:

Hands-On Unit Tests

Writing industrial-strength unit tests is not easy or simple. There are several things to consider and trade-offs to be made.

Design for Testability

If your code is what is informally called spaghetti code or a big ball of mud, where different levels of abstraction are mixed together and every piece of code depends on every other piece of code, you'll have a hard time testing it. Also, whenever you change something, you'll have to update a bunch of tests too.

The good news is that general-purpose proper software design is exactly what you need for testability. In particular, well-factored modular code, where each component has clear responsibility and interacts with other components via well-defined interfaces, will make writing good unit tests a pleasure.

For example, our SelfDrivingCar class is responsible for high-level operation of the car: go, stop, navigate. It has a calculate_distance_to_object_in_front() method that hasn't been implemented yet. This functionality should probably be implemented by a totally separate sub-system. It may include reading data from various sensors, interacting with other self-driving cars, and running a whole machine vision stack to analyze pictures from multiple cameras.

Let's see how this works in practice. The SelfDrivingCar will accept an argument called object_detector that has a method called calculate_distance_to_object_in_front(), and it will delegate this functionality to that object. Now there is no need to unit test the distance calculation as part of SelfDrivingCar, because the object_detector is responsible for it (and should be tested for it). You still want to unit test that SelfDrivingCar uses the object_detector properly.
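One way to sketch this delegation and its test uses unittest.mock from the standard library. The constructor signature and the value 12.5 are assumptions for illustration; only the object_detector argument and the method name come from the text.

```python
import unittest
from unittest import mock


class SelfDrivingCar:
    """Hypothetical stand-in that delegates detection to a collaborator."""

    def __init__(self, object_detector):
        self.object_detector = object_detector
        self.speed = 0

    def calculate_distance_to_object_in_front(self):
        # No sensor logic here: the detector owns that responsibility.
        return self.object_detector.calculate_distance_to_object_in_front()


class SelfDrivingCarTest(unittest.TestCase):
    def test_delegates_to_object_detector(self):
        detector = mock.Mock()
        detector.calculate_distance_to_object_in_front.return_value = 12.5
        car = SelfDrivingCar(detector)

        distance = car.calculate_distance_to_object_in_front()

        # The car returns whatever the detector reports...
        self.assertEqual(12.5, distance)
        # ...and asks the detector exactly once, with no arguments.
        detector.calculate_distance_to_object_in_front.assert_called_once_with()
```

The test says nothing about how distances are computed; it only pins down how the car uses its collaborator.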

Cost/Benefit

The amount of effort you put into testing should be correlated to the cost of failure, how stable the code is, and how easy it is to fix if problems are detected down the line.

For example, our self-driving car class is super critical. If the stop() method doesn't work properly, our self-driving car might kill people, destroy property, and derail the whole self-driving cars market. If you develop a self-driving car, I suspect your unit tests for the stop() method will be a little more rigorous than mine. 

On the other hand, if a single button in your web application on a page that's buried three levels below your main page flickers a little when someone clicks it, you may fix it, but probably will not add a dedicated unit test for this case. The economics just don't justify it. 

Testing Mindset

Testing mindset is important. One principle I use is that every piece of code has at least two users: the other code that's using it and the test that's testing it. This simple rule helps a lot with design and dependencies. If you remember that you have to write a test for your code, you will not add a lot of dependencies that are difficult to reconstruct during testing.

For example, suppose your code needs to compute something. In order to do that, it needs to load some data from a database, read a configuration file, and dynamically consult some REST API for up-to-date information. This all may be required for various reasons, but putting all that into a single function will make it pretty difficult to unit test. It's still possible with mocking, but it's much better to structure your code properly.
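As a sketch of what structuring the code properly might mean, the computation can be separated from the code that gathers its inputs. All names here (compute_target_speed and the three loader callables) are hypothetical and chosen only to fit the running example.

```python
def compute_target_speed(speed_limit, max_speed_setting, traffic_factor):
    """Pure computation: no database, file, or network access."""
    return min(speed_limit, max_speed_setting) * traffic_factor


def compute_target_speed_from_sources(
    load_speed_limit, read_max_speed_setting, fetch_traffic_factor
):
    """Thin wrapper: gathers inputs from injected callables, then delegates.

    In production the callables would hit the database, the configuration
    file and the REST API; in a test they can be simple lambdas.
    """
    return compute_target_speed(
        load_speed_limit(), read_max_speed_setting(), fetch_traffic_factor()
    )
```

With this split, the interesting logic is tested by passing plain values, and the wrapper is tested with fakes such as compute_target_speed_from_sources(lambda: 60, lambda: 50, lambda: 0.5).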

Pure Functions

The easiest code to test is pure functions. Pure functions are functions that access only the values of their parameters, have no side effects, and return the same result whenever called with the same arguments. They don't change your program's state, don't access the file system or the network. Their benefits are too many to count here. 

Why are they easy to test? Because there is no need to set a special environment to test. You just pass arguments and test the result. You also know that as long as the code under test doesn't change, your test doesn't have to change. 
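For instance, a test for a pure function needs no fixture at all. The braking_distance() function below is a hypothetical example (it uses the textbook formula v² / 2a) and is not part of the SelfDrivingCar class described in this article.

```python
import unittest


def braking_distance(speed, deceleration):
    """Distance needed to stop from `speed` at constant `deceleration`.

    Depends only on its arguments and has no side effects.
    """
    if deceleration <= 0:
        raise ValueError("deceleration must be positive")
    return speed ** 2 / (2 * deceleration)


class BrakingDistanceTest(unittest.TestCase):
    def test_braking_distance(self):
        # No setUp(), no files, no mocks: arguments in, result out.
        self.assertAlmostEqual(40.0, braking_distance(20.0, 5.0))
        self.assertAlmostEqual(0.0, braking_distance(0.0, 5.0))

    def test_rejects_non_positive_deceleration(self):
        with self.assertRaises(ValueError):
            braking_distance(20.0, 0.0)
```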

Compare it to a function that reads an XML configuration file. Your test will have to create an XML file and pass its filename to the code under test. No big deal. But suppose someone decided that XML is abominable and all configuration files must be in JSON. They go about their business and convert all configuration files to JSON. They run all the tests including your tests and they all pass! 

Why? Because the code didn't change. It still expects an XML configuration file, and your test still constructs an XML file for it. But in production, your code will get a JSON file, which it will fail to parse.

Testing Error Handling

Error handling is another thing that's critical to test. It is also part of design. Who is responsible for the correctness of input? Every function and method should be clear about it. If it's the function's responsibility, it should verify its input, but if it's the caller's responsibility then the function can just go about its business and assume the input is correct. The overall correctness of the system will be ensured by having tests for the caller to verify that it only passes correct input to your function.

Typically, you want to verify the input on the public interface to your code because you don't necessarily know who's going to call your code. Let's look at the drive() method of the self-driving car. This method expects a 'destination' parameter. The 'destination' parameter will be used later in navigation, but the drive method does nothing to verify it is correct. 

Let's assume that the destination is supposed to be a tuple of latitude and longitude. There are all kinds of tests that can be done to verify it is valid (e.g. is the destination in the middle of the sea). For our purposes, let's just ensure that it is a tuple of floats in the range -90.0 to 90.0 for latitude and -180.0 to 180.0 for longitude.

Here is the updated SelfDrivingCar class. I trivially implemented some of the unimplemented methods because the drive() method calls some of these methods directly or indirectly.

To test error handling, I will pass invalid arguments and verify that they are properly rejected. You can do this with the assertRaises() method of unittest.TestCase. The assertion passes if the code under test raises the expected exception and fails otherwise.

Let's see it in action. The test_drive() method passes latitude and longitude outside the valid range and expects the drive() method to raise an exception.

The test fails, because the drive() method doesn't check its arguments for validity and doesn't raise an exception. You get a nice report with full information about what failed, where and why.

To fix it, let's update the drive() method to actually check the range of its arguments:
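One possible version of the validated method is sketched below on a hypothetical stand-in class. It assumes the conventional ranges of -90.0 to 90.0 for latitude and -180.0 to 180.0 for longitude, and the use of ValueError is likewise an assumption.

```python
import unittest


class SelfDrivingCar:
    """Hypothetical stand-in: drive() now validates its argument."""

    def __init__(self):
        self.speed = 0
        self.destination = None

    def drive(self, destination):
        if not (isinstance(destination, tuple) and len(destination) == 2):
            raise ValueError("destination must be a (latitude, longitude) tuple")
        latitude, longitude = destination
        if not (isinstance(latitude, float) and isinstance(longitude, float)):
            raise ValueError("latitude and longitude must be floats")
        if not -90.0 <= latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= longitude <= 180.0:
            raise ValueError("longitude out of range")
        self.destination = destination


class SelfDrivingCarTest(unittest.TestCase):
    def setUp(self):
        self.car = SelfDrivingCar()

    def test_drive(self):
        # Valid input is stored.
        self.car.drive((55.0, 66.0))
        self.assertEqual((55.0, 66.0), self.car.destination)
        # Out-of-range input is rejected.
        with self.assertRaises(ValueError):
            self.car.drive((91.0, 0.0))
        with self.assertRaises(ValueError):
            self.car.drive((0.0, 181.0))
        # A rejected call must not overwrite the previous destination.
        self.assertEqual((55.0, 66.0), self.car.destination)
```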

Now, all the tests pass.

Testing Private Methods

Should you test every function and method? In particular, should you test private methods called only by your code? The typically unsatisfying answer is: "It depends". 

I'll try to be useful here and tell you what it depends on. You know exactly who calls your private method—it's your own code. If your tests for the public methods that call your private method are comprehensive then you already test your private methods exhaustively. But if a private method is very complicated, you may want to test it independently. Use your judgment.

How to Organize Your Unit Tests

In a large system, it's not always clear how to organize your tests. Should you have one big file with all the tests for a package, or one test file for each class? Should the tests be in the same file as the code under test, or in the same directory?

Here is the system I use. Tests should be totally separate from the code under test (hence I don't use doctest). Ideally, your code should be in a package. The tests for each package should be in a sibling directory of your package. In the tests directory, there should be one file for each module of your package, named test_<module name>.

For example, if you have three modules in your package: module_1.py, module_2.py and module_3.py, you should have three test files: test_module_1.py, test_module_2.py and test_module_3.py under the tests directory. 

This convention has several advantages. It makes it clear, just by browsing directories, that you didn't forget to test some module completely. It also helps to organize the tests in reasonably sized chunks. Assuming that your modules are reasonably sized, the test code for each module will be in its own file, which may be a little bigger than the module under test, but still something that fits comfortably in one file.

Conclusion

Unit tests are the foundation of solid code. In this tutorial, I explored some principles and guidelines for unit testing and explained the reasoning behind several best practices. The bigger the system you're building, the more important unit tests become. But unit tests are not enough. Other types of tests are needed too for large-scale systems: integration tests, performance tests, load tests, penetration tests, acceptance tests, etc. 
