Testing with pytest
pytest is a widely used Python testing framework that is flexible enough for both small and large test suites.
- You should test your code
- Put your tests where pytest can find them
- Run your tests
- Use fixtures to initialize your test
- Use mocks to test external dependencies
- Use parametrization to cover multiple cases
- Testing that exceptions are raised
- Further reading
You should test your code
There are many reasons why you should test your code:
- Writing a test helps you define what your code should do and helps you enforce single responsibility of functions.
- Getting into a habit of writing tests for corner cases helps you develop more robust code.
- If your tests have good names and reasonable test cases, they serve as documentation of your code for your future self and other collaborators.
- It is a good way of documenting assumptions you make. It is often useful to write a test that fails in the event of an assumption being broken, such as functionality you have not implemented yet.
- Many IDEs have good built-in support for running tests. Debugging a test is often my main way of either stepping through my own code to find errors, or stepping through code that is unfamiliar to me to see how the code is supposed to work and fail.
- Good test coverage is essential when refactoring code.
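As a small illustration of documenting an assumption with a test, the sketch below uses a hypothetical function area that only supports squares so far. The function, its arguments and the test names are made up for this example. The second test starts failing as soon as circle support is added, which reminds us to revisit the documented assumption.

```python
import pytest


def area(shape: str, size: float) -> float:
    """Hypothetical function: only squares are implemented so far."""
    if shape == "square":
        return size * size
    raise NotImplementedError(f"area is not implemented for {shape!r}")


def test_area_of_square():
    assert area("square", 2.0) == 4.0


def test_area_of_circle_is_not_implemented_yet():
    # Documents the assumption that circles are unsupported. This test
    # fails once circle support is added, prompting us to update it.
    with pytest.raises(NotImplementedError):
        area("circle", 1.0)
```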
Put your tests where pytest can find them
pytest finds your tests automatically according to (what I have just learned is) standard test discovery by:
- recursively looking through directories
- searching for files named test_*.py or *_test.py
- in those files: collecting functions prefixed with test outside classes
- in those files: collecting methods prefixed with test inside classes prefixed with Test
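To make the naming rules concrete, here is a minimal sketch of a test module. The file name test_naming.py and all function and class names are hypothetical; the comments state which objects pytest's default discovery would collect.

```python
# contents of a hypothetical test_naming.py


def add(a, b):
    # Not collected: the name does not start with "test".
    return a + b


def test_add():
    # Collected: a function prefixed with "test" outside a class.
    assert add(1, 2) == 3


class TestAdd:
    # Collected: the class name starts with "Test".

    def test_add_negative(self):
        # Collected: a method prefixed with "test" inside a Test class.
        assert add(-1, -2) == -3

    def check_nothing(self):
        # Not collected: the method name does not start with "test".
        return None
```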
An example of a directory structure could look like this:
my_code/
    app.py
    utils.py
tests/
    test_app.py
    test_utils.py
It is good practice to organize your tests separately from the rest of your code, for example in a folder named tests as above. There are many reasons: for example, default module discovery may ignore your tests, your tests may require additional packages to run, and if you are writing a library or application, the tests should not need to be included in your library or application.
!python -m pytest
Note: We can run the tests by running either python -m pytest or just pytest. Running through python will add the current directory to sys.path, which is often desirable, so I'll stick with that.
Here, pytest discovered three files with tests: test_examples.py, containing two tests, one which passes and one which fails; test_fixturefunctions.py, containing one passing test; and test_mark_examples.py, containing three passing tests. The tests in test_examples.py look like this:
# contents of test_examples.py
def test_example():
    print("Hi")
    assert True

def test_failing_example():
    print("Hello")
    assert False
Customize test output
The default mode is that output from a test is not shown unless the test fails. We can use the --capture=no option, or -s for short, to print output anyway:
!python -m pytest test_examples.py::test_example --capture=no
The traceback formatting for failing tests is set by the --tb option. There are many choices, such as --tb=line to limit output from failing tests to one line:
!python -m pytest test_examples.py --tb=line
!python -m pytest test_examples.py::test_example
Group tests using marks
We can use marks to run groups of tests easily with the -m option:
python -m pytest -m mark_name
pytest has a range of built-in marks, such as skip, skipif and xfail, and lets us define custom marks, such as a slow mark. A slow mark can be used to group tests so that you can run the quick tests and check for failures there first, before running the slow tests. We register our custom marks in pytest.ini to let pytest know we are marking on purpose; otherwise pytest will emit a warning.
For example, we could mark tests for different purposes:
# contents of test_mark_examples.py
import pytest

@pytest.mark.this
def test_example():
    print("Hello")
    assert True

@pytest.mark.this
@pytest.mark.that
def test_several_marks():
    print("Nothing")
    assert True

def test_unmarked():
    print("Hello")
    assert 1
Our pytest.ini should then look like this:
# content of pytest.ini
[pytest]
markers =
    this: example of marker.
    that: another example of marker.
and then we can run groups accordingly:
!python -m pytest -m this
!python -m pytest -m "this and not that"
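Besides custom marks, pytest ships some built-in marks that need no registration, for example skip, skipif and xfail. The sketch below combines two of them with a custom slow mark. The test names and the slow mark are assumptions for illustration, and slow would have to be registered in pytest.ini in the same way as this and that.

```python
import sys

import pytest


@pytest.mark.slow  # custom mark (assumed name), must be registered in pytest.ini
def test_large_sum():
    assert sum(range(1_000_000)) == 499_999_500_000


@pytest.mark.skipif(sys.platform == "win32", reason="example of a platform condition")
def test_posix_separator():
    assert "/".join(["tests", "test_app.py"]) == "tests/test_app.py"


@pytest.mark.xfail(reason="documents a known limitation of float arithmetic")
def test_float_sum_is_exact():
    # Expected to fail: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
    assert 0.1 + 0.2 == 0.3
```

Running python -m pytest -m "not slow" would then deselect test_large_sum and run the remaining tests first.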
Use fixtures to initialize your test
In software testing, a fixture can be used to ensure that tests are repeatable: the same code with the same inputs in the same environment will reproduce the same results. We can use fixtures for
- setting up mocks of external services such as APIs, so your tests won't depend on the reliability of external applications, and you can test all response cases you need to
- setting up and sharing test data between tests
- setting up the environment that the test will run in
In this example, we have a function that saves an input dataframe to a specified path as a csv file.
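from pathlib import Path
import pandas as pd

def save(df: pd.DataFrame, save_path: Path):
    if not df.empty:
        # index=False keeps the row index out of the file, so the csv
        # round-trips to the same columns when read back
        df.to_csv(save_path, index=False)
    else:
        print("Nothing to save. ")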
To test this function, we might want to save a dataframe, and check that we get the same result back when we read the csv file. For this test we use the built-in fixture tmp_path. The tmp_path fixture creates a temporary directory that is unique to each test invocation and that doesn't clutter the repository or any other shared folders we might care about. This ensures that if the tests are run in a different environment, such as on another developer's computer or in a continuous integration pipeline, the folders will exist when needed and be deleted eventually. We use any fixture in a test by using the fixture name as an input argument to the test function:
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

def test_save(tmp_path):
    # Given
    save_path = tmp_path / "df.csv"
    df_expected = pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([5,3]))
    save(df_expected, save_path)
    # When
    df_actual = pd.read_csv(save_path, index_col=False)
    # Then
    assert_frame_equal(df_expected, df_actual)
!python -m pytest test_save_example.py::test_save
Create test data in fixtures
Right now we create the test data in the test. An alternative is to create the dataframe in a fixture. The advantage is that there is less code to read in the test, and the fixture can be reused by different tests: if we have several functions acting on the data, we avoid duplicating code. We create fixtures by using the decorator @pytest.fixture:
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal
from pathlib import Path
import pytest

@pytest.fixture()
def test_dataframe():
    df = pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([5,3]))
    return df

def test_save_fixturized(tmp_path, test_dataframe):
    # Given
    save_path = tmp_path / "df.csv"
    save(test_dataframe, save_path)
    # When
    df_actual = pd.read_csv(save_path, index_col=False)
    # Then
    assert_frame_equal(test_dataframe, df_actual)
!python -m pytest test_save_example.py::test_save_fixturized
Note that when a test requests the fixture by name, pytest calls the fixture function and passes in its return value, not the function itself. We therefore do not need to call the fixture ourselves and assign the result to a variable, as in df_expected = test_dataframe().
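The same mechanism can be sketched without pandas. In the example below, the fixture name numbers, the helper make_numbers and both tests are hypothetical. The point is only that each test receives the fixture's return value as an argument, and that a function-scoped fixture is re-created for every test.

```python
import pytest


def make_numbers():
    # Plain helper so the data can also be built outside of pytest.
    return [3, 2, 1]


@pytest.fixture()
def numbers():
    # pytest calls this once per requesting test and passes in the result.
    return make_numbers()


def test_sum(numbers):
    assert sum(numbers) == 6


def test_sorting_does_not_leak_between_tests(numbers):
    # Mutating the list here does not affect test_sum, because the
    # function-scoped fixture is re-created for every test.
    numbers.sort()
    assert numbers == [1, 2, 3]
```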
Use monkeypatch to set environment variables
We often use environment variables to configure our functions, such as where they should write their results, or login credentials for databases and services. Keeping these configs in environment variables is recommended in order to run the same code with different configurations in different environments: locally when developing, in a test environment and in a production environment. To test these functions, we can use monkeypatching. Let's say we read environment variables in our function:
import os

def read_config():
    password = os.environ["DB_PASSWORD"]
    user = os.environ["DB_USER"]
    return {
        "password": password,
        "user": user,
    }
We can then use the monkeypatch fixture in our test, to set environment variables to toy values for the test execution:
def test_read_config(monkeypatch):
    monkeypatch.setenv("DB_PASSWORD", "password123")
    monkeypatch.setenv("DB_USER", "username")
    conf = read_config()
    assert set(conf.keys()) == {"password", "user"}
    assert conf["password"] == "password123"
    assert conf["user"] == "username"
!python -m pytest test_monkeypatching.py::test_read_config
We could extract the mocking into fixtures to share the setup between tests:
import pytest

@pytest.fixture()
def monkeypatch_config(monkeypatch):
    monkeypatch.setenv("DB_PASSWORD", "password123")
    monkeypatch.setenv("DB_USER", "username")

def test_read_config_using_fixture(monkeypatch_config):
    conf = read_config()
    assert set(conf.keys()) == {"password", "user"}
    assert conf["password"] == "password123"
    assert conf["user"] == "username"
!python -m pytest test_monkeypatching.py::test_read_config_using_fixture
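monkeypatch can also remove variables, which lets us test the failure path. The sketch below repeats read_config so that it is self-contained, and assumes that a missing DB_USER should surface as a KeyError; the test name is made up for illustration.

```python
import os

import pytest


def read_config():
    # Same behaviour as read_config above, repeated for self-containment.
    return {
        "password": os.environ["DB_PASSWORD"],
        "user": os.environ["DB_USER"],
    }


def test_read_config_missing_user(monkeypatch):
    monkeypatch.setenv("DB_PASSWORD", "password123")
    # raising=False: do not complain if DB_USER is not set in the first place.
    monkeypatch.delenv("DB_USER", raising=False)
    with pytest.raises(KeyError):
        read_config()
```

Both changes are undone automatically after the test, so the real environment is left untouched.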
Use mocks to test external dependencies
When we have external dependencies, such as an API or databases, we want our tests to be independent of the status of our dependencies. For instance, we want to test that our code can handle both when the API is up and running normally, and when the API is down. However, we can't control whether the API is up or down when we run our tests, so we use mocks to imitate the responses from our dependencies.
Mocking is a field big enough for its own post at some point, but what I keep coming back to is a RealPython article on Understanding the Python Mock Object Library.
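As a taste of what this looks like, here is a minimal sketch using unittest.mock from the standard library. The function get_status and the client object with its fetch method are hypothetical stand-ins for code that calls an external API. The mock lets us imitate both a healthy and an unreachable service.

```python
from unittest.mock import Mock


def get_status(client) -> str:
    """Hypothetical function under test: asks an API client for its status."""
    try:
        return client.fetch("/status")["status"]
    except ConnectionError:
        return "unavailable"


def test_get_status_when_api_is_up():
    client = Mock()
    client.fetch.return_value = {"status": "ok"}
    assert get_status(client) == "ok"
    client.fetch.assert_called_once_with("/status")


def test_get_status_when_api_is_down():
    client = Mock()
    # side_effect makes the mocked call raise instead of returning a value.
    client.fetch.side_effect = ConnectionError("API is down")
    assert get_status(client) == "unavailable"
```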
Use parametrization to cover multiple cases
There are at least two ways of rerunning tests for different test cases in order to ensure all execution paths are tested, and both involve parametrizing: parametrizing fixtures and parametrizing the tests themselves.
When we parametrize, pytest will automatically run the tests for all the different cases we specify.
In my experience, we should parametrize tests to ensure that we cover all the different cases that arise from having different input data to the function under test, i.e. the function-specific stuff, whereas we should parametrize fixtures when we want to test different objects. If the fixtures are mocking external dependencies or our own complex objects, it may be a good idea to parametrize fixtures to ensure we cover different setups.
A code smell indicating that we should parametrize a fixture is duplicated code for creating different tests for different functions, or for setting up different test cases in the same test, across multiple tests. A nice side effect of parametrizing your fixtures is that all new tests that use the same fixture will automatically be run for the different cases.
Parametrizing fixtures to cover multiple test cases
Let's go back to the save test example of saving a dataframe.
Where we left off, our test only covered one execution path: the first branch of the if statement, i.e. if the input dataframe is non-empty. If we want to test the other branch, we can parametrize the fixture to return different dataframes. When a test relies on a parametrized fixture, it will be rerun for all parametrizations of the fixture.
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal
import pytest

@pytest.fixture(params=[True, False], ids=["non-empty", "empty"])
def dataframes(request):
    if request.param:
        return pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([5,3]))
    else:
        return pd.DataFrame()

def test_save_parametrized_fixture(tmp_path, dataframes):
    # Given
    save_path = tmp_path / "df.csv"
    save(dataframes, save_path)
    if dataframes.empty:
        # When
        files_in_dir = [x for x in tmp_path.iterdir() if x.is_file()]
        # Then
        assert not files_in_dir
    else:
        # When
        df_actual = pd.read_csv(save_path, index_col=False)
        # Then
        assert_frame_equal(dataframes, df_actual)
!python -m pytest test_save_example.py::test_save_parametrized_fixture
This executes the test twice automatically. We use the params keyword to parametrize our fixture, and the ids keyword to provide human-readable names for our different parametrizations. We use the built-in request fixture in our fixture to access the current parameter through its param attribute.
params takes a list as input, so if we need several arguments to our fixture function, we can use for example a list of tuples or a list of dicts:
@pytest.fixture(params=[(True, 5), (False, )], ids=["non-empty", "empty"])
def df_fixture_with_tuples(request):
    if request.param[0]:
        n = request.param[1]
        return pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([n,3]))
    else:
        return pd.DataFrame()

@pytest.fixture(
    params=[
        {"non_empty": True, "length": 5},
        {"non_empty": False, "length": None}
    ],
    ids=["non-empty", "empty"]
)
def df_fixture_with_dict(request):
    if request.param["non_empty"]:
        n = request.param["length"]
        return pd.DataFrame(columns=["a", "b", "c"], data=[3, 2, 1]*np.ones([n,3]))
    else:
        return pd.DataFrame()
Let's run a failing test to see our id in action, with this toy test function:
def test_demo_fail_output(dataframes):
    if dataframes.empty:
        assert False
    else:
        assert True
!python -m pytest --tb=line test_save_example.py::test_demo_fail_output
The id of the failing test, empty, is printed in the list of failed tests. If you use PyCharm, you will find that it prints a pretty summary of the ids of parametrized tests, whether parametrized through fixtures or through the test itself, by building up a tree of the tests that are run, organised by module, script, and function. I'm sure many other IDEs have similar functionality.
Parametrize tests to cover multiple test cases
To cover the different execution paths, we can also parametrize the test itself, which looks a little different. Let's return to our save example, but add to the functionality. Let's say we want to pass an argument for the number of rows to save, and add a validator to check that the number of rows is a valid argument:
from pathlib import Path
from typing import Optional

import pandas as pd

def save(df: pd.DataFrame, save_path: Path, num_rows: Optional[int] = None):
    if not df.empty:
        if num_rows:
            num_rows = validate_num_rows(num_rows)
            df = df[0: num_rows]
        df.to_csv(save_path, index=False)
    else:
        print("Nothing to save. ")

def validate_num_rows(num_rows) -> int:
    if not int(num_rows) == num_rows:
        raise ValueError(f"num_rows must be int, got {num_rows}")
    if num_rows < 1:
        raise ValueError(f"num_rows must be >= 1, got {num_rows}")
    return int(num_rows)
We parametrize our test to cover both the case when a num_rows argument is not supplied, and when it is supplied:
import pandas as pd
from pandas.testing import assert_frame_equal
import pytest

@pytest.mark.parametrize(argnames="number_of_rows", argvalues=[None, 3])
def test_save_fixturized(tmp_path, test_dataframe, number_of_rows):
    save_path = tmp_path / "df.csv"
    save(test_dataframe, save_path, num_rows=number_of_rows)
    df_actual = pd.read_csv(save_path)
    if number_of_rows:
        df_expected = test_dataframe[0:number_of_rows]
    else:
        df_expected = test_dataframe
    assert_frame_equal(df_expected, df_actual)
Parametrize is a mark where the first argument, argnames, is a string with the argument names separated by commas, and the second, argvalues, is a list with the argument values for the different test cases. If we have several arguments, argvalues must be a list of tuples, and the length of each tuple must match the number of argnames. We use the parametrized values in the test by setting them as input arguments to the test. These names must match argnames.
!python -m pytest test_parametrize.py::test_save_fixturized
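A minimal sketch of several argument names, independent of the save example: the function under test, add, and the case ids are made up for illustration. Each tuple in argvalues holds one value per name in argnames.

```python
import pytest


def add(a, b):
    # Hypothetical function under test.
    return a + b


@pytest.mark.parametrize(
    "a, b, expected",
    [
        (1, 2, 3),
        (-1, 1, 0),
        (0.5, 0.25, 0.75),
    ],
    ids=["positive", "mixed_sign", "floats"],
)
def test_add(a, b, expected):
    assert add(a, b) == expected
```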
Testing that exceptions are raised
To make assertions about expected exceptions, we use pytest.raises. We will use the function validate_num_rows as an example, as it raises errors in some cases, and not in others. This is also a good opportunity to document some assumptions for our future self about what this test does. Since there are many different cases, we will parametrize the test function to cover all branches of the code and demonstrate functionality.
import pytest

@pytest.mark.parametrize(
    "input_num_rows, expected_output, expected_error",
    [
        (3, 3, None),
        (2.0, 2, None),
        ("text", None, ValueError),
        ("3.4", None, ValueError),
        (-1, None, ValueError),
    ],
    ids=[
        "integer",
        "float_that_can_be_converted_to_integer",
        "string_fails",
        "float_fails",
        "negative_number_fails"
    ]
)
def test_validate_num_rows(input_num_rows, expected_output, expected_error):
    if expected_error:
        with pytest.raises(expected_error):
            validate_num_rows(input_num_rows)
    else:
        actual_output = validate_num_rows(input_num_rows)
        assert expected_output == actual_output
Now we have an example of having multiple argument names and argument values with tuples, as mentioned above.
To test for failure and success, we use the argument expected_error, which we set to None for the cases that should not raise and to the error we expect for the cases that should raise. Then we use pytest.raises to call the function and validate that the expected error was raised if expected_error is not None.
!python -m pytest test_parametrize.py::test_validate_num_rows
Another option for conditional raising of exceptions is shown in the documentation, and uses a context manager that simply yields for non-failing cases. It seems a little complicated to me, but if you're used to this construction you can save some lines of code in your tests.
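A sketch of that alternative, assuming Python 3.7+ where contextlib.nullcontext is available. The alias does_not_raise follows the pytest documentation, the copy of validate_num_rows is repeated so the block is self-contained, and the test name is made up.

```python
from contextlib import nullcontext as does_not_raise

import pytest


def validate_num_rows(num_rows) -> int:
    # Copy of the validator from the save example.
    if not int(num_rows) == num_rows:
        raise ValueError(f"num_rows must be int, got {num_rows}")
    if num_rows < 1:
        raise ValueError(f"num_rows must be >= 1, got {num_rows}")
    return int(num_rows)


@pytest.mark.parametrize(
    "input_num_rows, expectation",
    [
        (3, does_not_raise()),
        (2.0, does_not_raise()),
        ("text", pytest.raises(ValueError)),
        (-1, pytest.raises(ValueError)),
    ],
    ids=["integer", "integer_valued_float", "string_fails", "negative_fails"],
)
def test_validate_num_rows_with_expectation(input_num_rows, expectation):
    # The same "with" block handles both the passing and the raising cases.
    with expectation:
        assert validate_num_rows(input_num_rows) == int(input_num_rows)
```

The trade-off is that the if/else branch disappears from the test body, at the cost of putting context managers into the parameter list.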
Further reading
Mocking is an obvious next step when writing tests; my favorite source is the above-mentioned RealPython article on Understanding the Python Mock Object Library.
Code coverage is a concept that goes hand in hand with testing and is a good starting point for what to test. pytest-cov is an easy coverage plugin for pytest.
The ecosystem of plugins to pytest is huge, and there are many I would like to try, for example pytest-mock for mocking and pytest-vcr for HTTP requests. This tutorial covers not only the pytest-vcr library, but also basic concepts in testing and code quality, as well as the author's strategy on how to read up on testing in Python.