You should test your code

There are many reasons why you should test your code:

  • Writing a test helps you define what your code should do and helps you enforce single responsibility of functions.
  • Getting into a habit of writing tests for corner cases helps you develop more robust code.
  • If your tests have good names and reasonable test cases, they serve as documentation of your code for your future self or other collaborators.
  • It is a good way of documenting the assumptions you make. It is often useful to write a test that fails in the event of an assumption being broken, such as functionality you have not implemented yet.
  • Many IDEs have good built-in support for running tests, and debugging a test is often my main way of either debugging my own code to find errors, or stepping through unfamiliar code to see how it is supposed to work and fail.
  • Good test coverage is essential when refactoring code.

Put your tests where pytest can find them

pytest finds your tests automatically according to (what I have just learned is) standard test discovery by:

  • recursively looking through directories
  • searching for files named test_*.py or *_test.py
  • in those files: collecting functions prefixed with test outside classes
  • in those files: collecting test-prefixed methods inside classes prefixed with Test (classes without an __init__ method)

An example of a directory structure could look like this:

my_code/
    app.py
    utils.py
tests/
    test_app.py
    test_utils.py
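
For example, test discovery inside tests/test_app.py would pick up both of the following (a minimal sketch; the test names and bodies are just placeholders):

# hypothetical contents of tests/test_app.py
def test_addition():
    assert 1 + 1 == 2


class TestApp:
    def test_something(self):
        assert True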

It is good practice to organize your tests separately from the rest of your code, for example in a folder named tests as above. There are several reasons for this: default module discovery can then ignore your tests, your tests may require additional packages to run, and if you are writing a library or application, the tests do not need to be shipped as part of it.

Run your tests

I normally use PyCharm and find the built-in functionality for running and debugging tests from the interface quite nice, but it is always cool to learn more CLI tricks.

Run all tests

Running all tests found from the current directory is quite simple:

!python -m pytest
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 6 items                                                              

test_examples.py .F                                                      [ 33%]
test_fixturefunctions.py .                                               [ 50%]
test_mark_examples.py ...                                                [100%]

=================================== FAILURES ===================================
_____________________________ test_failing_example _____________________________

    def test_failing_example():
        print("Hello")
>       assert False
E       assert False

test_examples.py:9: AssertionError
----------------------------- Captured stdout call -----------------------------
Hello
=========================== short test summary info ============================
FAILED test_examples.py::test_failing_example - assert False
========================= 1 failed, 5 passed in 0.83s ==========================

Note: We can run the tests with either python -m pytest or just pytest. Running through python will add the current directory to sys.path, which is often desirable, so I'll stick with that.

Here, pytest discovered three files with tests: test_examples.py, containing two tests, one passing and one failing; test_fixturefunctions.py, containing one passing test; and test_mark_examples.py, containing three passing tests. The tests in test_examples.py look like this:

# contents of test_examples.py
def test_example():
    print("Hi")
    assert True

def test_failing_example():
    print("Hello")
    assert False

Customize test output

By default, output from a test is not shown unless the test fails. We can use the --capture=no option, or -s for short, to show the output anyway:

!python -m pytest test_examples.py::test_example --capture=no
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 1 item                                                               

test_examples.py Hi
.

============================== 1 passed in 0.01s ===============================

The traceback formatting for failing tests is set with the --tb option. There are several styles to choose from, such as --tb=line to limit the output from each failing test to a single line:

!python -m pytest test_examples.py --tb=line
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 2 items                                                              

test_examples.py .F                                                      [100%]

=================================== FAILURES ===================================
/Users/Gunnhild/code/notes/_notebooks/test_examples.py:9: assert False
=========================== short test summary info ============================
FAILED test_examples.py::test_failing_example - assert False
========================= 1 failed, 1 passed in 0.06s ==========================

Specify modules, files or single tests

Above we used a trick: we can run the tests in a single file, or a single test, from the command line as well, using the syntax

pytest test_module/test_file_name.py::test_function_name
!python -m pytest test_examples.py::test_example
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks
collected 1 item                                                               

test_examples.py .                                                       [100%]

============================== 1 passed in 0.01s ===============================

Group tests using marks

We can use marks to run groups of tests easily with the -m option:

python -m pytest -m mark_name

pytest has a range of built-in marks, such as skip and xfail, and we can also register our own. A common example is a slow mark, used to group tests so that you can run the quick tests and check for failures there first, before running the slow tests. We register our own marks in pytest.ini to let pytest know we are marking on purpose; otherwise pytest will emit a warning. For example, we could mark tests for different purposes:

# contents of test_mark_examples.py
import pytest

@pytest.mark.this
def test_example():
    print("Hello")
    assert True

@pytest.mark.this
@pytest.mark.that
def test_several_marks():
    print("Nothing")
    assert True

def test_unmarked():
    print("Hello")
    assert 1

Our pytest.ini should then look like

# content of pytest.ini
[pytest]
markers =
    this: example of marker.
    that: another example of marker.

and then we can run groups accordingly:

!python -m pytest -m this
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 6 items / 4 deselected / 2 selected                                  

test_mark_examples.py ..                                                 [100%]

======================= 2 passed, 4 deselected in 0.75s ========================
!python -m pytest -m "this and not that"
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 6 items / 5 deselected / 1 selected                                  

test_mark_examples.py .                                                  [100%]

======================= 1 passed, 5 deselected in 0.59s ========================
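
Note that this and that are custom marks, registered in pytest.ini. The built-in marks, such as skip, skipif and xfail, need no registration. A minimal sketch of how they might be used (the reasons and the version condition here are just placeholders):

import sys

import pytest


@pytest.mark.skip(reason="demonstration only")
def test_always_skipped():
    assert False


@pytest.mark.skipif(sys.version_info < (3, 8), reason="requires Python 3.8 or newer")
def test_requires_recent_python():
    assert True


@pytest.mark.xfail(reason="known bug that is not fixed yet")
def test_expected_to_fail():
    assert False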

Use fixtures to initialize your test

In software testing, a fixture can be used to ensure that tests are repeatable: the same code with the same inputs in the same environment will reproduce the same results. We can use fixtures for

  • setting up mocks of external services such as APIs, so your tests won't depend on the reliability of external applications, and you can test all the response cases you need to
  • setting up and sharing test data between tests
  • setting up the environment that the test will run in

In this example, we have a function that saves an input dataframe to a specified path as a csv file.

from pathlib import Path
import pandas as pd 


def save(df: pd.DataFrame, save_path: Path):
    if not df.empty:
        df.to_csv(save_path, index=False)  # index=False so the file reads back identical
    else:
        print("Nothing to save. ")

To test this function, we might want to save a dataframe and check that we get the same result back when we read the csv file. For this test we use the built-in fixture tmp_path, which provides a temporary directory unique to each test invocation, so the test doesn't clutter the repository or any other shared folders we might care about. It also ensures that if the tests are run in a different environment, such as on another developer's computer or in a continuous integration pipeline, the directory will exist when needed and eventually be cleaned up. We use any fixture in a test by adding the fixture name as an input argument to the test function:

import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal


def test_save(tmp_path):
    # Given
    save_path = tmp_path / "df.csv"
    df_expected = pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([5,3]))
    save(df_expected, save_path)

    # When 
    df_actual = pd.read_csv(save_path, index_col=False)

    # Then 
    assert_frame_equal(df_expected, df_actual)
!python -m pytest test_save_example.py::test_save
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 1 item                                                               

test_save_example.py .                                                   [100%]

============================== 1 passed in 0.96s ===============================

Create test data in fixtures

Right now we create the test data in the test. An alternative is to create the dataframe in a fixture. The advantages are that there is less code to read in the test, and that the fixture can be reused by different tests: if we have several functions acting on the same data, we avoid duplicating the setup code. We create fixtures by using the decorator @pytest.fixture:

import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal
from pathlib import Path
import pytest


@pytest.fixture()
def test_dataframe():
    df = pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([5,3]))
    return df


def test_save_fixturized(tmp_path, test_dataframe):
    # Given
    save_path = tmp_path / "df.csv"
    save(test_dataframe, save_path)

    # When 
    df_actual = pd.read_csv(save_path, index_col=False)

    # Then 
    assert_frame_equal(test_dataframe, df_actual)
!python -m pytest test_save_example.py::test_save_fixturized
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 1 item                                                               

test_save_example.py .                                                   [100%]

============================== 1 passed in 0.84s ===============================

Note that when we use the fixture as an argument, we automatically get its return value instead of the function itself, i.e. we do not need to assign the return value of the fixture function to a variable holding the dataframe, as in df_expected = test_dataframe().

Use monkeypatch to set environment variables

We often use environment variables to configure our functions, for example where they should write their results, or login credentials for databases and services. Keeping this configuration in environment variables is recommended in order to run the same code with different configurations in different environments: locally when developing, in a test environment and in a production environment. To test such functions, we can use monkeypatching. Let's say we read environment variables in our function:

import os


def read_config():
    password = os.environ["DB_PASSWORD"]
    user = os.environ["DB_USER"]
    return {
        "password": password, 
        "user": user,
    }

We can then use the monkeypatch fixture in our test, to set environment variables to toy values for the test execution:

def test_read_config(monkeypatch):
    monkeypatch.setenv("DB_PASSWORD", "password123")
    monkeypatch.setenv("DB_USER", "username")
    conf = read_config()
    assert set(conf.keys()) == {"password", "user"}
    assert conf["password"] == "password123"
    assert conf["user"] == "username"
!python -m pytest test_monkeypatching.py::test_read_config
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 1 item                                                               

test_monkeypatching.py .                                                 [100%]

============================== 1 passed in 0.02s ===============================

We could extract the mocking into fixtures to share the setup between tests:

import pytest


@pytest.fixture()
def monkeypatch_config(monkeypatch):
    monkeypatch.setenv("DB_PASSWORD", "password123")
    monkeypatch.setenv("DB_USER", "username")


def test_read_config_using_fixture(monkeypatch_config):
    conf = read_config()
    assert set(conf.keys()) == {"password", "user"}
    assert conf["password"] == "password123"
    assert conf["user"] == "username"
!python -m pytest test_monkeypatching.py::test_read_config_using_fixture
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 1 item                                                               

test_monkeypatching.py .                                                 [100%]

============================== 1 passed in 0.02s ===============================

Use mocks to test external dependencies

When we have external dependencies, such as APIs or databases, we want our tests to be independent of the status of those dependencies. For instance, we want to test that our code can handle both the case where the API is up and running normally and the case where the API is down. However, we can't control whether the API is up or down when we run our tests, so we use mocks to imitate the responses from our dependencies.

Mocking is a field big enough for its own post at some point, but what I keep coming back to is the RealPython article Understanding the Python Mock Object Library.
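
To give a flavour of what this can look like, here is a minimal sketch using unittest.mock from the standard library. Both fetch_status and its use of requests.get are hypothetical stand-ins for a real API client, and the sketch assumes the requests package is installed:

from unittest.mock import patch

import requests


def fetch_status(url: str) -> int:
    # Hypothetical function with an external dependency: it calls an API over the network.
    response = requests.get(url)
    return response.status_code


def test_fetch_status_when_api_is_down():
    # Replace requests.get with a mock so the test never touches the network,
    # and simulate the API being unavailable.
    with patch("requests.get") as mock_get:
        mock_get.return_value.status_code = 503
        assert fetch_status("https://example.com/health") == 503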

Use parametrization to cover multiple cases

There are at least two ways of rerunning tests for different test cases in order to ensure all execution paths are tested, and both involve parametrizing: we can parametrize the test itself, or we can parametrize a fixture that the test uses.

When we parametrize, pytest will automatically run the test for all the different cases we specify.

In my experience, we should parametrize tests to cover the different cases that arise from different input data to the function under test, i.e. the function-specific cases, whereas we should parametrize fixtures when we want to test different objects. If the fixtures are mocking external dependencies or our own complex objects, it may be a good idea to parametrize the fixtures to ensure we cover the different setups.

A code smell indicating that we should parametrize a fixture is duplicated setup code: creating the same test cases for different functions, or setting up the same variations of test data across multiple tests. A nice side effect of parametrizing your fixtures is that every new test that uses the fixture will automatically be run for all the different cases.

Parametrizing fixtures to cover multiple test cases

Let's go back to the example of saving a dataframe.

Note: In this test, I have parametrized an input parameter to the function, but above I argued that input arguments are better suited for test parametrization than fixture parametrization. A better example would perhaps be if the data were an attribute of a class and we wished to create a mock of that class to test. It may also make sense to extract input parameters to fixtures when their creation is complex. In any case, the example serves to show some of the fixture functionality we can use.

Where we left off, our test only covered one execution path: the first branch of the if statement, i.e. if the input dataframe is non-empty. If we want to test the other branch, we can parametrize the fixture to return different dataframes. When a test relies on a parametrized fixture, it will be rerun for all parametrizations of the fixture.

import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal
import pytest 

@pytest.fixture(params=[True, False], ids=["non-empty", "empty"])
def dataframes(request):
    if request.param:
        return pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([5,3]))
    else:
        return pd.DataFrame()


def test_save_parametrized_fixture(tmp_path, dataframes):
    # Given
    save_path = tmp_path / "df.csv"
    save(dataframes, save_path)

    if dataframes.empty:
        # When 
        files_in_dir = [x for x in tmp_path.iterdir() if x.is_file()]
        # Then
        assert not files_in_dir

    else:
        # When
        df_actual = pd.read_csv(save_path, index_col=False)
        # Then 
        assert_frame_equal(dataframes, df_actual)
!python -m pytest test_save_example.py::test_save_parametrized_fixture
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 2 items                                                              

test_save_example.py ..                                                  [100%]

============================== 2 passed in 0.60s ===============================

This executes the test twice automatically. We use the params keyword to parametrize our fixture, and the ids keyword to provide human-readable names for the different parametrizations. Inside the fixture we use the built-in request fixture to access the current parameter through its param attribute.

params takes a list as input, so if we need several arguments in our fixture function, we can use for example a list of tuples or a list of dicts:

@pytest.fixture(params=[(True, 5), (False, )], ids=["non-empty", "empty"])
def df_fixture_with_tuples(request):
    if request.param[0]:
        n = request.param[1]
        return pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([n,3]))
    else:
        return pd.DataFrame()


@pytest.fixture(
    params=[
        {"non_empty": True, "length": 5}, 
        {"non_empty": False, "length": None}
    ], 
    ids=["non-empty", "empty"]
)
def df_fixture_with_dict(request):
    if request.param["non_empty"]:
        n = request.param["length"]
        return pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([n,3]))
    else:
        return pd.DataFrame()

Let's run a failing test with this toy test function, to see our ids in action:

def test_demo_fail_output(dataframes):
    if dataframes.empty:
        assert False
    else: 
        assert True
!python -m pytest --tb=line test_save_example.py::test_demo_fail_output
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 2 items                                                              

test_save_example.py .F                                                  [100%]

=================================== FAILURES ===================================
/Users/Gunnhild/code/notes/_notebooks/test_save_example.py:74: assert False
=========================== short test summary info ============================
FAILED test_save_example.py::test_demo_fail_output[empty] - assert False
========================= 1 failed, 1 passed in 0.76s ==========================

The id of the failing test, empty, is printed in the list of failed tests. If you use PyCharm, you will find that it prints a pretty summary of the ids of parametrized tests, whether parametrized through fixtures or through the test itself, by building up a tree of the tests that are run, organised by module, file and function. I'm sure many other IDEs have similar functionality.

Parametrize tests to cover multiple test cases

To cover the different execution paths, we can also parametrize the test itself, which looks a little different. Let's return to our save example, but extend the functionality: we now want to pass an argument for the number of rows to save, and we add a validator to check that this argument is valid:

from pathlib import Path
from typing import Optional

import pandas as pd


def save(df: pd.DataFrame, save_path: Path, num_rows: Optional[int] = None):
    if not df.empty: 
        if num_rows: 
            num_rows = validate_num_rows(num_rows)
            df = df[0: num_rows]
        df.to_csv(save_path, index=False)
    else:
        print("Nothing to save. ")

def validate_num_rows(num_rows) -> int:
    if not int(num_rows) == num_rows:
        raise ValueError(f"num_rows must be int, got {num_rows}")
    if num_rows < 1:
        raise ValueError(f"num_rows must be >= 1, got {num_rows}")
    return int(num_rows)

We parametrize our test to cover both the case when a num_rows argument is not supplied, and when it is supplied:

import pandas as pd
import pytest
from pandas.testing import assert_frame_equal


@pytest.mark.parametrize(argnames="number_of_rows", argvalues=[None, 3])
def test_save_fixturized(tmp_path, test_dataframe, number_of_rows):
    save_path = tmp_path / "df.csv"
    save(test_dataframe, save_path, num_rows=number_of_rows)

    df_actual = pd.read_csv(save_path)
    if number_of_rows:
        df_expected = test_dataframe[0:number_of_rows]
    else: 
        df_expected = test_dataframe

    assert_frame_equal(df_expected, df_actual)

Parametrize is a mark where the first argument, argnames, is a string with the argument names separated by commas, and the second, argvalues, is a list with the argument values for the different test cases. If we have several arguments, argvalues must be a list of tuples, and the length of each tuple must match the number of argnames. We use the parametrized values in the test by adding them as input arguments to the test function; these names must match argnames.

!python -m pytest test_parametrize.py::test_save_fixturized
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 2 items                                                              

test_parametrize.py ..                                                   [100%]

============================== 2 passed in 0.79s ===============================

Testing that exceptions are raised

To make assertions about expected exceptions, we use pytest.raises. We will use the function validate_num_rows as an example, as it raises errors in some cases and not in others. This is also a good opportunity to document, for our future selves, some assumptions about what this function does. Since there are several different cases, we parametrize the test function to cover all branches of the code and demonstrate the functionality.

import pytest


@pytest.mark.parametrize(
    "input_num_rows, expected_output, expected_error", 
    [
        (3, 3, None),
        (2.0, 2, None),
        ("text", None, ValueError),
        ("3.4", None, ValueError),
        (-1, None, ValueError),
    ], 
    ids=[
        "integer",
        "float_that_can_be_converted_to_integer",
        "string_fails",
        "float_fails",
        "negative_number_fails"
    ]
)
def test_validate_num_rows(input_num_rows, expected_output, expected_error):
    if expected_error:
        with pytest.raises(expected_error):
            validate_num_rows(input_num_rows)
    else:
        actual_output = validate_num_rows(input_num_rows)
        assert expected_output == actual_output

Now we have an example with multiple argument names, where the argument values are given as tuples, as mentioned above.

To test both for success and failure, we use the argument expected_error, which we set to None for the test cases that should pass, and to the error we expect for the test cases that should raise. We then use pytest.raises to call the function and validate that the expected error was raised whenever expected_error is not None.

!python -m pytest test_parametrize.py::test_validate_num_rows
============================= test session starts ==============================
platform darwin -- Python 3.8.1, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /Users/Gunnhild/code/notes/_notebooks, inifile: pytest.ini
collected 5 items                                                              

test_parametrize.py .....                                                [100%]

============================== 5 passed in 0.56s ===============================

Another option for conditionally asserting that exceptions are raised is shown in the documentation, and uses a context manager that simply yields for the non-failing cases. It seems a little complicated to me, but if you're used to this construction you can save some lines of code in your tests.
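
For reference, a minimal sketch of that pattern applied to validate_num_rows could look like this, where nullcontext stands in for "no exception expected" and validate_num_rows is assumed to be defined in, or imported into, the test file:

from contextlib import nullcontext

import pytest


@pytest.mark.parametrize(
    "input_num_rows, expected_output, expectation",
    [
        (3, 3, nullcontext()),
        ("text", None, pytest.raises(ValueError)),
    ],
    ids=["integer", "string_fails"],
)
def test_validate_num_rows_with_context(input_num_rows, expected_output, expectation):
    # The context manager either expects an exception (pytest.raises) or does nothing (nullcontext)
    with expectation:
        assert validate_num_rows(input_num_rows) == expected_output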

Further reading

Mocking is an obvious next step when writing tests; my favorite source is the above-mentioned RealPython article on Understanding the Python Mock Object Library.

Code coverage is a concept that goes hand in hand with testing and is a good starting point for deciding what to test. pytest-cov is an easy-to-use coverage plugin for pytest.
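
Assuming pytest-cov is installed, a coverage run over the example layout above could look something like

python -m pytest --cov=my_code --cov-report=term-missing

which prints a per-file coverage summary and lists the line numbers that were never executed.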

The ecosystem of plugins for pytest is huge, and there are many I would like to try, for example pytest-mock for mocking and pytest-vcr for HTTP requests. This tutorial covers not only the pytest-vcr library, but also basic concepts in testing and code quality, as well as the author's strategy for reading up on testing in Python.