Doctests & Metaclasses & Multiprocessing, Oh My!

Doctests & Metaclasses & Multiprocessing, Oh My!

I've been programming in Python for more than a decade, but it's only now that I have had to bring out the big guns: writing a metaclass. Sure, I've used them before (constantly, in fact: Django's entire data model is built on top of the BaseModel metaclass), but I've never needed to get down and dirty and actually write one.

This article is inspired by Ned Batchelder's series of 'techniques in the real world' blog posts. His article on using Python 3.10's match/case statement to handle a JSON packet made me think which dusty corners I had visited recently, and reminded me of my struggles to get my doctests to run under Django.


Unlike some of my colleagues, I like doctests, but only when taken in moderation: If I'm going to put a small example into my function's docstring (and I am), then of course I want to test that I haven't made a typo, and docstring test are perfect for this. Take the following function:

def currency(amount: Decimal|float, *, symbol: str = '$') -> str:
    Rounds and formats currency value for display.

        >>> currency(185995)
        >>> currency(35, symbol='€')

            Value of currency. 
            Optionally change the default dollar symbol.

        Formatted string

I think it's quicker to understand the function's purpose from those two basic examples, than to read the whole docstring (and the function signature). The doctest module will happily run are function twice and check the output for us. Everybody wins! Except when we don't...

Failing Faster in Parallel

That function is part of a real-world text utility module I wrote for a big Django project. It has 21 functions tested to 100% branch-coverage by 123 tests, 19 of which are doctests. The whole project is well tested - currently 1,502 test cases in total. My workstation has a bunch of CPU cores, why not to use them to run those tests in parallel and get back to coding faster?

This is where it all fell apart for me. Django's test runner has had built-in support for running tests in parallel for years (since version 1.9, released 2015). It works well, but not for doctests. Creating a load_test() function as documented in the Python docs gave the dreaded TypeError: cannot pickle 'module' object.

Creating TestCases for doctests

As you recall, a metaclass is Python's way of allowing us to step and change the way a class is built. We can use that power to create a plain TestCase class (that Django already knows how to parallelise) then populate it with test functions, one for each doctest. We'll call the metaclass DocTestLoader. Let's see it in action on that text utility module:

from unittest import TestCase
from animal3.utils.testing import DocTestLoader
from ..utils import text

class DocTests(TestCase, metaclass=DocTestLoader, test_module=text):

It's that easy! If you were to examine the DocTests class you would see a whole bunch of test functions that the metaclass made for us, including one to test the docs for our example function: test_doctest_currency()

Like any good abstraction, using the metaclass is easy - although admittedly challenging to write.

The Metaclass

This article is more a war-story than a deep tutorial on metaclass creation, so I'm going to stay fairly high-level. The resources I found most useful where: Chapter 9 Metaprogramming in David Beazley's Python Cookbook
for an overview, followed by the official docs for the nitty-gritty details.

As per the official docs, when Python executes a class these steps a taken:

  1. Resolve inheritance to determine type
  2. Determine which metaclass to use, usually type()
  3. Prepare the namespace
  4. Execute the class body
  5. Create the class object

We want to insert new functions into the class during creation, so step three looks like a good place to start. Preparing the class's namespace is handled by the metaclass's __prepare__() method.

Let's take a look at the completed metaclass:

class DocTestLoader(type):
    def __new__(*args: Any, **kwargs: Any) -> type:
        """Prevent `test_module` argument from going further."""
        return type.__new__(*args, **kwargs)

    def __prepare__(
        meta: Type[type],
        class_name: str,
        bases: Tuple[type, ...],
        **kwargs: Any,
    ) -> Mapping[str, object]:
        """Create test methods and add them to the class."""
        prepared = type.__prepare__(class_name, bases)
        test_module: ModuleType = kwargs.pop('test_module')
        suite = doctest.DocTestSuite(
        for test in suite:
            name = meta.create_test_name(test)              
            prepared[name] = meta.create_test_method(test)  
        return prepared

    def create_test_method(cls, test: doctest.DocTestCase) -> Callable:
        def test_method(self) -> None:                    
            return test.runTest()
        return test_method

    def create_test_name(cls, test: doctest.DocTestCase) -> str:
        name = repr(test).partition(' ')[0]
        name = f"test_doctest_{name}"
        return name

Not so bad. Like when using most new techniques, the hard part was figuring out where to put our code - how our piece fit into the larger puzzle.

The code is simple, at least when looking back at it. In the __prepare__() method we first let type, the parent metaclass create the class's namespace for us. Then we examine the test module given, create a doctest test suite, then use that to create a bunch of plain old test functions which we add into our class namespace using names of our own choosing.


My doctests are all being loaded and run, and I can now run my tests in parallel. All is good with the world and I can take it easy for a while.

$ ./ test
Ran 1505 tests in 1.678s

$ ./ test --parallel 16
Ran 1505 tests in 0.296s

Updated 27 Apr 2024, first published 14 Apr 2024