November 1, 2006

Python Tip of the Day: Unit Tests and Loading the Module to Test

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 5:10 pm

Unit tests are very useful. Lately I find myself writing tests for even small scripts of a few hundred lines. Even in a script that size there’s always that one function which does the main work on the most basic block, and you want to make sure it properly handles whatever kind of input is passed in. Doctests are great for this: you get a method documented and tested without having to write any test infrastructure. (See also: my blog article on easy doctesting.)

unittest: when doctest isn’t enough

Sometimes, for more complex things, you do need more infrastructure than doctest gives you. unittest is right there to help in that case. With more power often comes more pain, and unittest has the issue that with “standard” directory structures it’s hard to load the module you actually want to test.

Typically unittest-based tests are kept in their own files, separate from the actual working code. That’s probably a good thing: it keeps the script file from becoming too large, and it keeps each file focused on one purpose. So the test script imports the actual script and exercises its functions to make sure they work.

“Everybody” seems to like this directory structure for tests and code:

toplevelfolder/
	a.py
	b.py
	Tests/
		test_a.py
		test_b.py

I like this structure too – it keeps the tests away from the source, and I don’t get test files cluttering up my source directory. I could just let the file system sort these things out, but I kinda like it this way.

So now we have a problem: test_a.py tests a.py, meaning that test_a has to import a. Hmm. Pythonistas are already familiar with this issue, but it needs some explanation for Everybody Else. Then we can get on to the solutions.

The Background: The Problem of Importing ‘a’

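Python searches for modules along a list of directories accessible via sys.path. When you run Tests/test_a.py directly, the script’s own directory (Tests/) goes on that list but toplevelfolder/ does not, so a plain import a fails. My answer to this has always been to write a dozen lines of boilerplate code in the test_ Python files to: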

  1. Get the parent directory of the script
  2. Get that directory’s parent directory
  3. Append that to sys.path
  4. Import module to test
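
For Everybody Else, those four steps might look something like the sketch below. The helper name add_parent_to_path is made up for illustration, and the sketch assumes the layout shown earlier, with the test file sitting one folder below the code it tests.

```python
import os
import sys


def add_parent_to_path(script_path):
    """Put the folder above the script's own folder on sys.path.

    Hypothetical helper; assumes toplevelfolder/Tests/test_a.py style layout.
    """
    tests_dir = os.path.dirname(os.path.abspath(script_path))  # step 1
    top_dir = os.path.dirname(tests_dir)                       # step 2
    if top_dir not in sys.path:
        sys.path.append(top_dir)                               # step 3
    return top_dir


# Inside test_a.py this would be used as:
#     add_parent_to_path(__file__)
#     import a                                                 # step 4
```

It is short, but it is exactly the kind of thing that drifts once it has been pasted into a dozen test files.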

Of course, every time you copy and paste code you risk subtle changes in the mix: did I copy it from an older place or a more recent one, did I introduce something stupid, what if I need to update my methods? All these questions suck, so the best idea is to avoid duplicating boilerplate code as much as possible.

The Solution in Python 2.5

Python 2.5 brings two solutions to the table. One of these solutions looks really promising, but doesn’t actually work. The other solution, which certainly isn’t the first one that jumps to mind when you’re looking at this problem, is the one that actually does work.

One Possible Solution: Relative Imports

Python 2.5, recently released, has a feature I was very excited about: Relative Imports (PEP 328). Except that, used from a test script run directly, it just gives an error.

In our test script we have:
from .. import a

When we try it out:


$ cd toplevelfolder/Tests/
$ python test_a.py
Traceback (most recent call last):
  File "test_a.py", line 15, in <module>
    from .. import a
ValueError: Attempted relative import in non-package

Hmm. Funny, it seems like it should work. ’Cept apparently it doesn’t work like that in Python 2.5 (we’ll explain why later). Nuts, that would have solved my problem. On to the next solution.

The Actual Solution: Executing Modules As Scripts

As excited as I was about PEP 328’s relative imports and how they would make package management easier, I was equally apathetic about PEP 338: Executing Modules As Scripts.

Reading this PEP turns up the explanation for the error in our first approach:

Import Statements and the Main Module

The release of 2.5b1 showed a surprising (although obvious in retrospect) interaction between this PEP [338] and PEP 328 – explicit relative imports don’t work from a main module. This is due to the fact that relative imports rely on __name__ to determine the current module’s position in the package hierarchy. In a main module, the value of __name__ is always ‘__main__’, so explicit relative imports will always fail (as they only work for a module inside a package).

Apparently this issue will be revisited for Python 2.6:

The question of whether or not relative imports should be supported when a main module is executed with -m is something that will be revisited for Python 2.6.

Python 2.6 isn’t here yet, so we have to live in an imperfect world. But this Python 2.5 imperfect world is better than the world I lived in before, with dozens of lines of boilerplate code lying around everywhere.

Setting up to use PEP 338

  1. Your test structure should look like:

    toplevelfolder/
    	a.py
    	b.py
    	Tests/
    		__init__.py
    		test_a.py
    		test_b.py
    

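  2. In your test_a.py file, just import a and write your tests. End the file with the usual if __name__ == "__main__": unittest.main() so that running the module actually runs the tests.
  3. Run the test as a module (-m takes a dotted module name, not a file name, so there is no .py on the end):
    $ cd toplevelfolder/
    $ python -m Tests.test_a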
    

Yes! This works. Great. It does its work very efficiently and with a minimal amount of typing, although it would be nice to have something that gathers up all the tests for you. We also find a decent use case for PEP 338.
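
To see the whole setup in one place, here is a rough, self-contained sketch that builds the layout in a temporary folder and runs the test with -m. The contents of a.py and test_a.py (the double function and the DoubleTest class) are invented for illustration, as is the run_layout_demo helper; only the folder layout and the -m invocation come from the steps above.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical module under test.
A_SOURCE = textwrap.dedent('''\
    def double(x):
        return 2 * x
''')

# Hypothetical test file: a plain "import a", no path juggling.
TEST_A_SOURCE = textwrap.dedent('''\
    import unittest

    import a


    class DoubleTest(unittest.TestCase):
        def test_double(self):
            self.assertEqual(a.double(21), 42)


    if __name__ == "__main__":
        unittest.main()
''')


def write_file(path, text):
    with open(path, "w") as handle:
        handle.write(text)


def run_layout_demo():
    """Build the layout, run `python -m Tests.test_a`, return the exit status."""
    top = tempfile.mkdtemp(prefix="toplevelfolder_")
    try:
        tests_dir = os.path.join(top, "Tests")
        os.mkdir(tests_dir)
        write_file(os.path.join(top, "a.py"), A_SOURCE)
        write_file(os.path.join(tests_dir, "__init__.py"), "")
        write_file(os.path.join(tests_dir, "test_a.py"), TEST_A_SOURCE)
        # cwd=top plays the role of `cd toplevelfolder/`.
        return subprocess.call([sys.executable, "-m", "Tests.test_a"], cwd=top)
    finally:
        shutil.rmtree(top)


if __name__ == "__main__":
    print("exit status:", run_layout_demo())
```

The import works because, with -m, the folder you start Python from is on sys.path, so a.py is found without any help from the test file.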

Another Solution (for shell scripts or earlier pythons)

In the process of writing this entry I remembered Python’s -c switch, which lets you throw together small bits of Python right on the command line and execute them. Silly me. We can use this method to run tests too. More typing, but hey, it works, especially for installs that haven’t updated their Python to something modern (the version of Python that ships with Mac OS X 10.4 isn’t exactly modern; hopefully 10.5 will ship with Python 2.5).

Setup for this technique is just the same as the PEP 338 solution: import a in your test, cd toplevelfolder, and run Python. The command you feed to Python is really the only different thing here.

$ python -c "import unittest; suite = unittest.defaultTestLoader.loadTestsFromName('Tests.test_a'); unittest.TextTestRunner().run(suite)"
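
Spelled out as a script, the loader approach has two separate steps: loading by name only collects the tests, and a runner is what actually executes them. The sketch below is a stand-alone illustration; the module name fake_test_a and the AdditionTest class are stand-ins for Tests.test_a so that it runs without any files on disk.

```python
import sys
import types
import unittest


# Hypothetical stand-in for the contents of Tests/test_a.py.
class AdditionTest(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)


# Register a throwaway module so the loader can find it by name.
fake_module = types.ModuleType("fake_test_a")
fake_module.AdditionTest = AdditionTest
sys.modules["fake_test_a"] = fake_module

# Step 1: loading by dotted name only builds a suite; nothing has run yet.
suite = unittest.defaultTestLoader.loadTestsFromName("fake_test_a")
print("tests collected:", suite.countTestCases())

# Step 2: a runner executes the suite and reports the results.
result = unittest.TextTestRunner().run(suite)
print("all passed:", result.wasSuccessful())
```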

This has a distinct advantage over all the other methods: we can easily write a shell script to iterate over files in a folder and repeatedly call this pattern, executing all the tests in a folder. You know that shirt “Go away or I shall replace you with a very small shell script”? Well, guess what.


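#!/bin/sh
# Run every test_*.py module found in each package folder given as an argument.
for dir in "$@"; do
    for file in "$dir"/test_*.py; do
        dname=`dirname "$file"`
        # Strip the .py suffix: the loader wants a module name, not a file name.
        fname=`basename "$file" .py`
        python -c "import unittest; suite = unittest.defaultTestLoader.loadTestsFromName('$dname.$fname'); unittest.TextTestRunner().run(suite)"
    done
done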

Save this as dotests.sh (say, in your toplevelfolder, or somewhere on your path, whatever) and now:

$ cd toplevelfolder/
$ sh dotests.sh Tests

All the test_*.py files in the Tests folder will be run, and the world will be happy.
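
If you’d rather not involve the shell at all, the same idea can be sketched in Python. This is one possible port using the loader approach rather than -m; the file name dotests.py and the helper names module_names_in and run_folder are invented here. It assumes you run it from toplevelfolder and that each folder you pass is an importable package.

```python
import glob
import os
import sys
import unittest


def module_names_in(folder):
    """Return dotted names such as 'Tests.test_a' for folder/test_*.py."""
    package = os.path.basename(os.path.normpath(folder))
    names = []
    for path in sorted(glob.glob(os.path.join(folder, "test_*.py"))):
        module = os.path.splitext(os.path.basename(path))[0]
        names.append(package + "." + module)
    return names


def run_folder(folder):
    """Load every test module in folder into one suite and run it."""
    names = module_names_in(folder)
    suite = unittest.defaultTestLoader.loadTestsFromNames(names)
    return unittest.TextTestRunner().run(suite)


if __name__ == "__main__":
    # Hypothetical usage, from toplevelfolder: python dotests.py Tests
    for folder in sys.argv[1:]:
        run_folder(folder)
```

Unlike the shell version, this loads all the modules in a folder into a single suite, so you get one combined report instead of one per file.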

Conclusion

Seems like we have solutions! No More Boilerplate! Simple imports in our tests. Automation possibilities (consider it a reader’s exercise to port that shell script to use the PEP 338 method).

Now, go forth and test!