Processes vs. Threads for Integration Testing

John Jacobsen
Friday, April 19, 2013

Python has a reputation for being awkward when it comes to concurrency. The language supports threads, but performance is known to be dismal in some cases. In particular, as David Beazley discovered, when I/O-bound and CPU-bound threads run in the same program, the global interpreter lock (GIL) performs very badly (worse than if you ran the tasks one after the other on a single thread/CPU). Popular alternatives are asynchronous or event-driven programming techniques (Twisted, asyncore) and heavy use of the multiprocessing module.

These wrinkles aside, there are situations where threads in Python are still very useful. One of these is integration testing of distributed systems.

Imagine you have a distributed system with some 30 (or 3, or 300) software components running on some cluster. These programs might interact with a database, with a user via command line or a GUI (Web or otherwise), and with each other via a messaging or RPC protocol (XML-RPC, ZeroMQ, …).

While low-level (functional, unit) tests will perhaps make up the bulk of your test suite, integration tests are important for making sure all the programs talk to each other as they should. As with your other tests, you want to automate these, and they should run as fast as possible to shorten the feedback cycle during development.

A straightforward way to test these components is to run them all as separate programs (locally or distributed), each in its own process. However, you’re likely to get much better performance from your many short-running tests if you run the components as local threads in a single process. The components running in these threads would, of course, still talk to each other via RPC, ZeroMQ, etc., same as if they were processes. But for short tests the setup and teardown for threads is much faster. The most trivial example (assigning a value to a variable) shows the difference dramatically:

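# in ipython:

import subprocess, sys, threading

# Start a fresh interpreter that does one assignment, and wait for it to exit.
# The code is passed as a plain argument; with no shell involved, extra quotes
# around it would turn it into a string literal instead of an assignment.
doproc = lambda: subprocess.Popen([sys.executable, "-c", "a=1"],
                                  stdout=subprocess.PIPE).communicate()

# Do the same assignment in a new thread, and wait for it to finish.
def dothread():
    def run():
        a = 1
    th = threading.Thread(target=run)
    th.start()
    th.join()

%time junk = [doproc() for _ in range(500)]

# CPU times: user 0.18 s, sys: 1.81 s, total: 1.98 s
# Wall time: 14.30 s

%time junk = [dothread() for _ in range(500)]

# CPU times: user 0.09 s, sys: 0.05 s, total: 0.14 s
# Wall time: 0.16 s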

That’s a factor of roughly 90 in wall time (14.30 s versus 0.16 s), something you will most definitely notice in your test suite.

Another advantage of this approach is that, when you write your code so that it can be run as threads, you can put as many or as few of these threads in actual processes (programs) as you’d like. In other words, the coupling of processes to components is loosened. Your automated tests will also force you to implement clean shutdown semantics for each thread (otherwise your test program will likely not terminate without manual interruption).
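
A minimal sketch of what such a component might look like, assuming a simple polling loop. The Component class, its stop() method, and the start_in_thread and run_as_program helpers are hypothetical names chosen for illustration, not code from any real system. The point is only that the same run() loop can be hosted by a thread (for tests) or by a standalone program, and that an explicit stop signal is what lets join() return.

```python
import signal
import threading
import time


class Component:
    """Hypothetical component: a work loop plus an explicit stop signal."""

    def __init__(self, name, interval=0.01):
        self.name = name
        self.interval = interval
        self.iterations = 0
        self._stop_requested = threading.Event()

    def run(self):
        # Work until asked to stop; wait() doubles as an interruptible sleep.
        while not self._stop_requested.is_set():
            self.iterations += 1
            self._stop_requested.wait(self.interval)

    def stop(self):
        self._stop_requested.set()


def start_in_thread(component):
    """Test mode: run the component's loop in a local thread."""
    th = threading.Thread(target=component.run, name=component.name)
    th.start()
    return th


def run_as_program(name):
    """Production mode: the same loop as the body of a standalone process."""
    component = Component(name)
    # Translate SIGTERM into the same clean shutdown the tests rely on.
    signal.signal(signal.SIGTERM, lambda signum, frame: component.stop())
    component.run()


if __name__ == "__main__":
    # Several components share one process here; each could equally be
    # launched on its own via run_as_program().
    components = [Component("component-%d" % i) for i in range(3)]
    threads = [start_in_thread(c) for c in components]
    time.sleep(0.1)
    for c in components:
        c.stop()
    for th in threads:
        th.join()  # returns only because run() honors stop()
    print([c.iterations for c in components])
```

If stop() were missing or ignored, the join() calls would block forever, which is exactly the hang described above.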

Finally, it’s much easier to interrogate the state of a component when it’s running as a thread than it is to query a subprocess (e.g. via RPC). This greatly simplifies the assertions you have to make in your integration tests, since you don’t have to send a message of some sort via RPC or a message queue; you can just read variables.
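
As a rough illustration, here is a sketch using only the standard library's XML-RPC modules. The Counter component, its report() method, and the test function are made up for this example. The component still receives its input over real RPC on a local port, but the test checks the result by reading the object's attribute directly instead of making another RPC call.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class Counter:
    """Hypothetical component: records events reported to it over XML-RPC."""

    def __init__(self):
        self.events = []
        # Port 0 asks the OS for any free port; logRequests=False keeps
        # the test output quiet.
        self.server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
        self.server.register_function(self.report)
        self.port = self.server.server_address[1]

    def report(self, event):
        self.events.append(event)
        return len(self.events)

    def run(self):
        self.server.serve_forever()

    def stop(self):
        self.server.shutdown()      # unblocks serve_forever()
        self.server.server_close()  # releases the listening socket


def test_counter_records_events():
    counter = Counter()
    th = threading.Thread(target=counter.run)
    th.start()
    try:
        # Talk to the component the same way another process would.
        url = "http://127.0.0.1:%d" % counter.port
        client = xmlrpc.client.ServerProxy(url)
        client.report("run started")
        client.report("run stopped")
    finally:
        counter.stop()
        th.join(timeout=5)
    assert not th.is_alive()
    # No extra RPC round trip: read the component's state directly.
    assert counter.events == ["run started", "run stopped"]


if __name__ == "__main__":
    test_counter_records_events()
    print("ok")
```

Had Counter been running in a subprocess, the final assertion would need an additional RPC method just to expose the events list to the test.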

I found, for the automated tests in IceCube Live, that making components that could be instantiated in threads (for testing) or in processes (for production) greatly sped up my test suite and simplified the actual tests quite a bit. I should note that, prior to release, there is still a final integration test done on a mirror test system which simulates actual data collection and makes sure IceCube Live can play along with other systems. The mirroring, however, is not exact, since the actual detector elements we deployed at South Pole are expensive and rely on a billion tons of crystal-clear ice to work as intended.

In the next post we will explore the use of context managers, which are helpful for organizing setup and teardown of complex test cases involving multiple components.
