With a basic understanding of what XUnit is doing, we need to decide where to split the work across multiple cores. Take a look at the sequence diagram from the last article (here); we have a choice to make. It is generally better to parallelize outer loops than inner loops, because each parallel task then carries more work relative to its scheduling overhead. That pushes the design toward making the Class the unit of concurrency. This means that if I put 100 tests inside a single class, they will run sequentially, just as though we had no fancy concurrency code. In the rest of this article we’ll look at the modifications needed to use Payneallel.ForEach with the unit tests.
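To make the trade-off concrete, here is a minimal sketch of class-level concurrency. It uses the TPL's `Parallel.ForEach` as a stand-in for `Payneallel.ForEach`, and the `FakeTestClass` type, the `ClassLevelConcurrency` helper, and the simulated tests are all hypothetical names invented for illustration. The outer loop over classes runs in parallel, while the tests inside each class still run one after another.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a test class: a name plus a list of test delegates.
public class FakeTestClass
{
    public string Name = "";
    public List<Action> Tests = new List<Action>();
}

public static class ClassLevelConcurrency
{
    // Runs every class in parallel; tests within one class run sequentially.
    // Returns the number of tests executed per class name.
    public static ConcurrentDictionary<string, int> RunAll(IEnumerable<FakeTestClass> classes)
    {
        var executed = new ConcurrentDictionary<string, int>();
        Parallel.ForEach(classes, testClass =>        // outer loop: parallel
        {
            int count = 0;
            foreach (Action test in testClass.Tests)  // inner loop: sequential
            {
                test();
                count++;
            }
            executed[testClass.Name] = count;
        });
        return executed;
    }

    public static void Main()
    {
        var classes = new List<FakeTestClass>();
        for (int c = 0; c < 4; c++)
        {
            var tc = new FakeTestClass { Name = "Class" + c };
            for (int t = 0; t < 3; t++)
                tc.Tests.Add(() => Thread.Sleep(50)); // simulated test work
            classes.Add(tc);
        }

        foreach (var pair in RunAll(classes))
            Console.WriteLine(pair.Key + ": " + pair.Value + " tests");
    }
}
```

With this shape, a class holding 100 tests is still one work item, so it cannot finish faster than those 100 tests take in sequence.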
We start our modifications in the XUnit GUI, which is refreshingly straightforward. The first thing to do is make it easy to choose concurrent execution. The XUnit GUI now looks like this, following the sequential execution of the control group:
The “Run concurrently” checkbox is my addition. When you click the Run button:
void OnClick_Run(object sender, EventArgs e)
{
    _totalCount = 0;
    _testCount = GetTestCount();
    buttonGo.Enabled = false;
    // Run the tests on a separate thread so the GUI thread stays responsive.
    ThreadStart ts = new ThreadStart(RunAsync);
    Thread t = new Thread(ts);
    t.Name = "xUnitAsyncThread";
    t.Start();
}
Our xUnit ExecutorWrapper instance is named “wrapper”. To avoid tying up the GUI thread, we run XUnit on a new thread, which will in turn create many other threads using Payneallel. By default, Payneallel blocks the calling thread until all operations are done, and we cannot block the GUI thread and still allow it to update itself as test results become available. The RunAsync method is simple:
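The threading arrangement described above can be sketched without any GUI code. This is not the RunAsync listing itself; it is a hypothetical console illustration that uses the TPL's `Parallel.ForEach` in place of Payneallel, and the names `BackgroundRunner` and `Start` are invented for the example. The point is that the blocking parallel loop lives on its own thread, so the caller (the GUI thread in the article) stays free while results arrive through a callback.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundRunner
{
    // Starts a worker thread that runs the blocking parallel loop and
    // returns immediately, so the calling thread is not blocked by it.
    public static Thread Start(IEnumerable<string> items, Action<string> onResult)
    {
        Thread worker = new Thread(() =>
        {
            // Blocks this worker thread, not the caller, until all items finish.
            Parallel.ForEach(items, item => onResult("done: " + item));
        });
        worker.Name = "xUnitAsyncThread";
        worker.IsBackground = true;
        worker.Start();
        return worker;
    }

    public static void Main()
    {
        object gate = new object();
        Thread worker = Start(new[] { "ClassA", "ClassB", "ClassC" }, message =>
        {
            // Callbacks arrive on worker threads; a real GUI would marshal
            // them back to the UI thread (for example with Control.Invoke).
            lock (gate) { Console.WriteLine(message); }
        });

        Console.WriteLine("caller is free while tests run");
        worker.Join(); // only so this console demo waits before exiting
    }
}
```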
My next modification is to the ExecutorWrapper class. I tried to make my changes to XUnit additive only, adding functionality by adding methods rather than modifying things that already work for sequential execution.
public void BeginRunAssembly(Action<XmlNode> callback)
{
    XmlNodeCallbackWrapper wrapper = new XmlNodeCallbackWrapper(callback);
    CreateObject("XUnit.Sdk.Executor+RunAssemblyParallel", executor, wrapper);
}
I see no reason not to keep running the test in a separate AppDomain. We have added another inner class to Executor, the RunAssemblyParallel class.
Through experimentation I found that this would be the appropriate place to introduce parallel execution, at the Class level as I said previously. This class is almost a copy of the RunAssembly class included with XUnit:
public class RunAssemblyParallel : MarshalByRefObject
{
    public RunAssemblyParallel(Executor executor, object _handler)
    {
        // ...
    }
    protected void DoParallel(Executor executor, object _handler)
    {
        ICallbackEventHandler handler = _handler as ICallbackEventHandler;
        AssemblyResult results = new AssemblyResult(new Uri(executor.assembly.CodeBase).LocalPath);
        // Work item for one test class; this is what runs concurrently.
        Action<Type> doOne = delegate(Type type)
        {
            ITestClassCommand testClassCommand = TestClassCommandFactory.Make(type);
            if (testClassCommand != null)
            {
                ClassResult classResult = TestClassCommandRunner.Execute(testClassCommand,
                    result => OnTestResult(result, handler));
            }
        };
        Type[] exportedTypes = executor.assembly.GetExportedTypes();
        int count = exportedTypes.Length;
        //Parallel Test execution
        Stopwatch sw = new Stopwatch();
        sw.Start(); // time the whole parallel run by wall clock
        Payneallel.ForEach<Type>(exportedTypes, doOne, true);
        sw.Stop();
        Console.WriteLine("Time elapsed: " + sw.Elapsed);
        results.ExecutionTime = sw.Elapsed.TotalSeconds;
    }
}
Like the TPL, Payneallel likes an Action<T> to execute. In the vanilla XUnit version of this code, there is no Stopwatch, and there is a regular foreach() block instead of Payneallel.ForEach. The stopwatch is important because I can no longer trust XUnit to time the execution! For a long time I ran and re-ran my tests, and the parallel code was always reported as slower than the sequential version. Then I had a “pwop” moment and found the following line of code:
ExecutionTime += child.ExecutionTime;
Whoops! We can’t just add the execution time of the children (from TimedCommand) when some of the commands are running at the same time.
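The effect is easy to reproduce in isolation. The sketch below is a hypothetical demonstration, not XUnit code: it uses the TPL's `Parallel.For` and `Thread.Sleep` to simulate tests, and the `TimingDemo` and `Measure` names are invented. It records both the sum of the per-test times (what the line above computes) and the wall-clock time of the whole run.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class TimingDemo
{
    // Runs `count` simulated tests of `millisEach` ms each in a parallel loop.
    // Returns (sum of per-test times, wall-clock time), both in seconds.
    public static Tuple<double, double> Measure(int count, int millisEach)
    {
        long summedTicks = 0;
        Stopwatch wall = Stopwatch.StartNew();

        Parallel.For(0, count, i =>
        {
            Stopwatch child = Stopwatch.StartNew();
            Thread.Sleep(millisEach);   // simulated test body
            child.Stop();
            // Thread-safe equivalent of "ExecutionTime += child.ExecutionTime".
            Interlocked.Add(ref summedTicks, child.Elapsed.Ticks);
        });

        wall.Stop();
        return Tuple.Create(TimeSpan.FromTicks(summedTicks).TotalSeconds,
                            wall.Elapsed.TotalSeconds);
    }

    public static void Main()
    {
        Tuple<double, double> t = Measure(4, 200);
        Console.WriteLine("Sum of child times: " + t.Item1 + " s");
        Console.WriteLine("Wall-clock time:    " + t.Item2 + " s");
    }
}
```

Whenever the simulated tests actually overlap, the summed figure stays near the sequential total while the wall-clock figure drops, which is why the summed number made the parallel run look no faster.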
With the timing issue solved, I was successfully executing unit tests concurrently and saving a lot of time doing so. Here is the same set of unit tests run using my new Concurrent xUnit hack.
I’ll take 27 seconds over 51 seconds any day. I have not done any optimization work yet, nor have I constructed a test case where the tests run nearly 4x faster on a four-processor machine, but I expect to be able to get there. As I mentioned before, the Class is the unit of concurrency in this experiment, so the amount of time saved will depend heavily on how the test cases are structured. A better approach would be to first collect a list of all the individual methods marked with [Fact] and apply the parallel semantics to that list instead.
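A rough sketch of that method-level idea follows. It is an assumption-laden illustration rather than a patch to XUnit: `DemoFactAttribute` is a hypothetical marker standing in for [Fact], the two test classes are invented, and the TPL's `Parallel.ForEach` again stands in for Payneallel. Reflection flattens every marked method into one list, and the parallel loop runs over that list instead of over classes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical marker attribute standing in for xUnit's [Fact].
[AttributeUsage(AttributeTargets.Method)]
public class DemoFactAttribute : Attribute { }

public class MathTests
{
    [DemoFact] public void Adds() { if (Math.Abs(-2) != 2) throw new Exception("bad abs"); }
    [DemoFact] public void Maxes() { if (Math.Max(2, 3) != 3) throw new Exception("bad max"); }
    public void NotATest() { }
}

public class StringTests
{
    [DemoFact] public void Lengths() { if ("ab".Length != 2) throw new Exception("bad length"); }
}

public static class MethodLevelConcurrency
{
    // Flattens all marked methods across the given types into one list,
    // then runs that list in parallel. Returns the sorted names of passing tests.
    public static List<string> RunAll(IEnumerable<Type> types)
    {
        List<MethodInfo> tests = types
            .SelectMany(t => t.GetMethods(BindingFlags.Public | BindingFlags.Instance))
            .Where(m => m.IsDefined(typeof(DemoFactAttribute), false))
            .ToList();

        var passed = new ConcurrentBag<string>();
        Parallel.ForEach(tests, method =>
        {
            // A fresh instance per test, so tests do not share instance state.
            object instance = Activator.CreateInstance(method.DeclaringType);
            method.Invoke(instance, null);
            passed.Add(method.DeclaringType.Name + "." + method.Name);
        });
        return passed.OrderBy(name => name).ToList();
    }

    public static void Main()
    {
        foreach (string name in RunAll(new[] { typeof(MathTests), typeof(StringTests) }))
            Console.WriteLine("passed: " + name);
    }
}
```

With this shape, a class containing 100 tests contributes 100 work items rather than one, so the speedup no longer depends on how tests are grouped into classes. Any per-class setup or shared state would need separate handling, which this sketch ignores.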
I have a side project, code that I inherited, that is woefully under-tested. I write unit tests for the code I touch as I refactor it. The unit tests will involve a lot of database access, calculations, and Presenter mocking. I can’t disclose what this codebase is just yet, but I am in the process of testing TestDriven → XUnit → NCover. I depend heavily on NCover, and I really can’t imagine manually trying to determine what I’ve got test coverage on anymore. If this test is successful, I will eventually be able to report on how this concurrent unit testing works on 100,000 lines of code 99% covered by thousands of unit tests. This should be a sufficient test case to prove this idea is sound.
As the years go by and we still don’t have 5 GHz machines, designing frameworks with concurrency in mind will become increasingly important.