Blog | Tristan Kernan - Investigating the Pytest Django Teardown Race Condition

Investigating the Pytest Django Teardown Race Condition

Background

In my previous post about pytest, playwright and django, I described an annoying non-deterministic failure on test teardown. I believe the issue is a race condition that occurs when the test database is flushed while network requests are still pending. The reference issue agrees with the theory but does not detail a root cause, nor are the proposed solutions insightful or practical.

I let this issue simmer; the band-aid of explicitly waiting for network requests to finish at the end of each test mostly worked. It was far from satisfactory, however, being a guard that developers would have to mindfully add to every test. Over time, as the application's network requests changed, previously passing tests started failing again. In short, my team reached the tipping point at which a more thorough solution was necessary to continue investing in playwright tests.

Investigation

In past investigations, I placed a degree of trust in the library code. After all, doubting the code that I depend on can lead to some significant trust issues. This time around though, I was determined to answer the question that bugged me: why are there unfinished requests at test teardown? To find the answer, I had to look into the source for pytest-django, django, and the python standard library.

pytest-django provides the live_server fixture, which exposes an http server backed by the django application on a local port. This fixture delegates to the LiveServer class, which mirrors the LiveServerTestCase class provided by django, re-implemented to work as a pytest fixture. The LiveServer class starts a LiveServerThread (also provided by django), which is responsible for actually running the http server.

Ok, still with me? The LiveServerThread is terminated at the end of the live_server fixture. In theory, then, pending requests should be forcefully terminated. Ah, but that would only be the case if LiveServerThread was responsible for processing requests itself - in fact, it isn't. The default server_class is django's ThreadedWSGIServer, which processes each incoming request in its own thread - meaning that the server is capable of concurrent request processing. The server class is powered by the python standard library, leveraging socketserver.ThreadingMixIn for the per-thread behavior.

Notably, ThreadedWSGIServer sets daemon_threads = True, so the request handling threads are daemons. The socketserver docs describe what that flag controls:

When inheriting from ThreadingMixIn for threaded connection behavior, you should explicitly declare how you want your threads to behave on an abrupt shutdown. The ThreadingMixIn class defines an attribute daemon_threads, which indicates whether or not the server should wait for thread termination. You should set the flag explicitly if you would like threads to behave autonomously; the default is False, meaning that Python will not exit until all threads created by ThreadingMixIn have exited.

This means that, while the LiveServerThread will exit cleanly, any pending request threads will not be joined / waited upon to finish. Mystery solved? If any of those requests are interacting with the database, there's the possibility for a deadlock between the request thread and the main thread attempting to flush the test database.
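The effect of daemon_threads can be reproduced with the standard library alone, without Django or pytest. The following is a minimal sketch, not Django's code: SlowHandler and run_and_close are hypothetical names, and the 0.5 second sleep stands in for a request that is still doing work (say, a database query) when the server is shut down. It assumes Python 3.7 or later, where ThreadingMixIn's block_on_close defaults to True.

```python
import socket
import socketserver
import threading
import time


class SlowHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler standing in for a slow, in-flight request."""

    def handle(self):
        self.server.started.set()   # the request thread is now running
        time.sleep(0.5)             # simulate work, e.g. a database query
        self.server.finished.set()  # set once the request completes


def run_and_close(daemon_threads):
    """Start a threaded server, begin one slow request, then shut down.

    Returns True if the request had finished by the time
    server_close() returned, False if it was still in flight.
    """

    class Server(socketserver.ThreadingMixIn, socketserver.TCPServer):
        pass

    Server.daemon_threads = daemon_threads

    server = Server(("127.0.0.1", 0), SlowHandler)  # port 0: any free port
    server.started = threading.Event()
    server.finished = threading.Event()

    serve_thread = threading.Thread(
        target=server.serve_forever, kwargs={"poll_interval": 0.01}, daemon=True
    )
    serve_thread.start()

    # Open a connection and wait until its handler thread is running.
    client = socket.create_connection(server.server_address)
    server.started.wait(timeout=5)

    server.shutdown()      # stop the accept loop
    server.server_close()  # joins request threads only if they are non-daemon
    finished = server.finished.is_set()

    client.close()
    return finished


if __name__ == "__main__":
    print("daemon_threads=True  ->", run_and_close(True))
    print("daemon_threads=False ->", run_and_close(False))
```

With daemon threads, server_close() returns while the handler is still sleeping, so the first call should report False; with non-daemon threads, server_close() blocks until the handler has finished, so the second should report True. That is the same gap that lets a request thread touch the database after the server has nominally shut down.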

A possible solution

A relatively straightforward, if not pretty, solution is to override the configuration of ThreadedWSGIServer in a live_server fixture override:

import pytest
from django.core.servers.basehttp import ThreadedWSGIServer

@pytest.fixture
def live_server(live_server):
    # wait on requests to finish before shutting down the server;
    # this mutates the class attribute, so it affects every instance
    ThreadedWSGIServer.daemon_threads = False

    yield live_server

This change will result in the server waiting for request thread completion before itself shutting down. As it's isolated to the test suite, it's not the most abominable monkeypatch imaginable, but I'd prefer a cleaner solution whereby pytest-django would support customization of the underlying LiveServerThread, to modify the server_class (I believe this is better supported by django's unit test framework).
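For comparison, here is roughly what the same change could look like with django's own LiveServerTestCase, using subclassing instead of mutating a class attribute. This is a sketch only: it assumes a django version in which LiveServerThread exposes the server_class attribute, and JoiningWSGIServer, JoiningLiveServerThread and BrowserTestCase are hypothetical names that do not appear in django or pytest-django.

```python
from django.core.servers.basehttp import ThreadedWSGIServer
from django.test import LiveServerTestCase
from django.test.testcases import LiveServerThread


class JoiningWSGIServer(ThreadedWSGIServer):
    # Hypothetical subclass: request threads are non-daemon, so
    # server_close() waits for them to finish.
    daemon_threads = False


class JoiningLiveServerThread(LiveServerThread):
    # Assumes LiveServerThread reads its server from `server_class`.
    server_class = JoiningWSGIServer


class BrowserTestCase(LiveServerTestCase):
    # Hypothetical base class for browser tests in a unittest-style suite.
    server_thread_class = JoiningLiveServerThread
```

The appeal of this shape is that ThreadedWSGIServer itself is left untouched, so nothing outside the test case sees the changed flag.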

Unfinished business

This investigation cleanly answers my original question: why are there unfinished network requests at test teardown? The requests are handled in separate, daemon threads that are not stopped / waited on when the live_server fixture cleans up.

In the pytest-django source though, I noticed another behavior: the live_server fixture is session scoped by default. This means the fixture provides the same LiveServer instance for every test in a given run. Therefore the server won't stop until all tests have run. Put another way, those request threads won't be joined at the end of each test, even with daemon_threads set to False, so there's still the possibility for the race condition.

To solve this, my untested¹ theory is to make the live_server fixture function scoped, which means that the LiveServer class will stop after each test case, joining all request threads.

This could be achieved by overriding it in my project conftest, copying the relevant contents from the pytest-django source:

import pytest
from django.core.servers.basehttp import ThreadedWSGIServer
from pytest_django import live_server_helper

@pytest.fixture(scope="function")
def live_server(request: pytest.FixtureRequest):
    """Run a live Django server in the background during tests.
    Copied from pytest-django source. Simplified based on project usage.
    """
    # wait on requests to finish before shutting down the server
    ThreadedWSGIServer.daemon_threads = False

    server = live_server_helper.LiveServer("localhost")

    yield server

    # runs at the end of each test: stops the server thread, which
    # now joins any pending request threads
    server.stop()

Put together, this approach should resolve the underlying race condition: at the end of each test, the http server will be stopped, waiting for pending requests to finish. When the database flush runs, there will be no concurrent access.

¹ I am on vacation currently!