The overhead of async/await in .NET 4.5

The support for asynchronous operations in .NET 4.5 has made it much easier to create easily-intelligible asynchronous methods that avoid blocking. However, async/await isn't cost-free in terms of CPU overhead. How best to judge when to use it? Chris Hurley explains.

Async/await is great for avoiding blocking while potentially time-consuming work is performed in a .NET application, but there are overheads associated with running an async method: the current execution context has to be captured, there may be a thread transition, and a state machine is built through which your code runs. The cost of this is comparatively negligible when the asynchronous work takes a long time, but it’s worth keeping in mind.

Support for the async and await contextual keywords is one of the most convenient new features in .NET 4.5. It’s always been possible to write asynchronous code, of course, but async/await lets you write it in a relatively straightforward manner that neatly expresses the intention of the code, without separate continuation methods. As long as you have a Task (or anything else that implements the awaitable pattern) to await, the compiler automatically generates the code that waits for it to complete and then continues execution once the work is done, all without blocking the calling thread unnecessarily.
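To illustrate the awaitable pattern: the compiler accepts any expression whose type exposes a suitable GetAwaiter method, including one supplied as an extension method. The following is a minimal sketch only; TimeSpanAwaitable, AwaitableDemo and WaitThenAnswerAsync are hypothetical names invented for this illustration and are not part of the framework.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical helper: makes "await someTimeSpan" legal by supplying GetAwaiter.
public static class TimeSpanAwaitable
{
    // The compiler only needs GetAwaiter() to return something that offers
    // IsCompleted, OnCompleted and GetResult; TaskAwaiter already does.
    public static TaskAwaiter GetAwaiter(this TimeSpan delay)
    {
        return Task.Delay(delay).GetAwaiter();
    }
}

public static class AwaitableDemo
{
    // Awaits a TimeSpan directly, then returns a value.
    public static async Task<int> WaitThenAnswerAsync()
    {
        await TimeSpan.FromMilliseconds(20);
        return 42;
    }

    public static void Main()
    {
        // Blocking on Result is acceptable in a console demo with no UI thread.
        Console.WriteLine(WaitThenAnswerAsync().Result);
    }
}
```

In practice you will nearly always await a Task; the point is simply that the keyword is driven by a pattern rather than by one specific type.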

In order to provide a responsive and smooth interface, particularly on touch and gesture devices, it is important to avoid blocking the UI thread. This was a central focus for Microsoft during the development of the WinRT API, and they ensured that any APIs that may take longer than 50ms to execute would only be available in an asynchronous form.

Of course, you could use async/await regardless of the amount of time that the method call is likely to take. However, the ease with which it’s possible to make an operation asynchronous in your code hides the work that’s being done behind the scenes. As soon as the compiler sees the async keyword next to a method, it replaces your method with the async state machine. If you write a simple method that looks like:

… then the compiler generates the following (obtained by setting .NET Reflector to .NET 4.0 mode, so it doesn’t attempt to understand the async implementation):

Calling the method now requires creating a state machine and building a Task to contain the work that goes on within it: none of the code in the original method is referenced here. Setting it all up the first time is a relatively complex operation (Figure 1):
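As a rough way of seeing this rewriting for yourself without a decompiler, you can ask reflection which state machine type the compiler attached to an async method. The sketch below assumes a method shaped like the one discussed in this article; the bodies of InitialWork, AsyncWork and FinalWork are placeholders invented for illustration, and the Example class name is hypothetical.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class Example
{
    // Placeholder stages of work (assumed, not taken from the article's listing).
    static void InitialWork() { }
    static void AsyncWork() { Thread.Sleep(1); }
    static void FinalWork() { }

    public static async Task AsyncMethod()
    {
        InitialWork();
        await Task.Run(() => AsyncWork());
        FinalWork();
    }

    // Returns the compiler-generated state machine type behind AsyncMethod,
    // or null if the compiler did not attach AsyncStateMachineAttribute.
    public static Type GetStateMachineType()
    {
        MethodInfo method = typeof(Example).GetMethod("AsyncMethod");
        var attribute = method.GetCustomAttribute<AsyncStateMachineAttribute>();
        return attribute == null ? null : attribute.StateMachineType;
    }

    public static void Main()
    {
        AsyncMethod().Wait();
        Type stateMachine = GetStateMachineType();
        // The generated name is compiler-specific, so it is printed rather than assumed.
        Console.WriteLine(stateMachine.FullName);
        Console.WriteLine(typeof(IAsyncStateMachine).IsAssignableFrom(stateMachine));
    }
}
```

The generated type implements IAsyncStateMachine, whose MoveNext method is where the body of the original method ends up.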

Figure 1: Framework methods required to initialize an example async method

Although this async method is relatively simple, ANTS Performance Profiler shows that over 900 framework methods are run to initialize it and to do its work the first time it is called.

Figure 2: The framework methods include those called by Task.Run and several System.Runtime.CompilerServices methods

The largest proportion of these methods is made up of those involved in starting a new Task in which to do the asynchronous work, due to the call to Task.Run (Figure 2). This is not inherently due to the use of async/await, but it should be noted that moving the asynchronous work onto another thread in some way like this is required if the original thread is to be unblocked: otherwise, the work is done synchronously, despite the use of the async/await keywords. Even if the method never hits an await statement or starts a new Task, there is still overhead, as building the async method involves getting the execution context and synchronization context and therefore examining the stack. Fortunately, the context is cached, and so the overhead on subsequent calls is much lower.
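The point that async alone does not move work off the calling thread can be checked with a small sketch. Here the only await is on a task that has already completed, so the whole body runs on the caller's thread and the returned task is finished by the time the call returns. SyncCompletionDemo and NoRealAsyncWork are hypothetical names used only for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SyncCompletionDemo
{
    // Records the thread that ran the body of the async method.
    public static int BodyThreadId;

    // Marked async, but the only await is on an already-completed task,
    // so the body runs to the end on the caller's thread before returning.
    public static async Task<int> NoRealAsyncWork()
    {
        int value = await Task.FromResult(21);
        BodyThreadId = Thread.CurrentThread.ManagedThreadId;
        return value * 2;
    }

    public static void Main()
    {
        int callerThreadId = Thread.CurrentThread.ManagedThreadId;
        Task<int> task = NoRealAsyncWork();
        Console.WriteLine("Completed on return: " + task.IsCompleted);
        Console.WriteLine("Same thread: " + (BodyThreadId == callerThreadId));
        Console.WriteLine("Result: " + task.Result);
    }
}
```

The state machine and Task are still created for this call, which is the overhead described above, even though nothing asynchronous happens.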

The synchronization context is necessary to ensure that the continuation code after await statements is called in the same context as the original code. This is important if, for example, the method was originally called from the UI thread and will update the UI when the asynchronous task is complete, but is not always necessary or desirable. Calling Task.ConfigureAwait(false) prevents the restoration of the synchronization context, and should be used when it is not required to return to the original context.
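A minimal sketch of the difference, assuming a console program with no UI: CountingContext is a hypothetical SynchronizationContext written for this example that counts how many continuations are posted back to it. With a plain await the continuation is posted to the captured context; with ConfigureAwait(false) it is not.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context that counts the continuations posted to it.
public sealed class CountingContext : SynchronizationContext
{
    public int PostCount;

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref PostCount);
        base.Post(d, state); // the base class queues the callback to the thread pool
    }
}

public static class ConfigureAwaitDemo
{
    static async Task ResumeOnContext()
    {
        await Task.Delay(20);
    }

    static async Task SkipContext()
    {
        await Task.Delay(20).ConfigureAwait(false);
    }

    // Runs one of the methods under a CountingContext and reports how many
    // continuations were posted back to that context.
    public static int CountPosts(bool captureContext)
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            Task task = captureContext ? ResumeOnContext() : SkipContext();
            task.Wait(); // safe here: this context never needs the blocked thread
            return context.PostCount;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static void Main()
    {
        Console.WriteLine("Posts with plain await: " + CountPosts(true));
        Console.WriteLine("Posts with ConfigureAwait(false): " + CountPosts(false));
    }
}
```

Blocking with Wait() is only safe in this sketch because the context hands continuations to the thread pool; on a real UI context, blocking the UI thread while a continuation needs it is a classic way to deadlock.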

So, given that the compiler has replaced the original contents of the AsyncMethod() method, where did it move it to? It’s ended up in the MoveNext() method of the state machine. For example, after doing some initial set-up, it runs InitialWork():

Disabling Async Mode in ANTS Performance Profiler 8 exposes this implementation detail:

Figure 3: Disabling Async Mode in ANTS Performance Profiler 8 shows the internal MoveNext methods and the switch to the thread pool

In this example, the initial work was done on the originating thread, but switched over to a thread-pool thread in order to do the async work (see Figure 3). Execution then returns to the state machine, which moves on and executes the final part of the method. The more await statements there are, the more movements through the state machine are required.
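To make the movement through the state machine more concrete, here is a heavily simplified, hand-written approximation. It is not the compiler's actual output (the real generated code uses AsyncTaskMethodBuilder and has more bookkeeping); the class name, the Log list and the MoveNextCalls counter are all assumptions added for illustration. It shows the general shape: each await becomes a point where MoveNext either carries on immediately or registers itself as the continuation and returns.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public sealed class HandWrittenStateMachine
{
    private int _state = -1; // -1: not started, 0: resuming after the await
    private TaskAwaiter _awaiter;
    private readonly TaskCompletionSource<bool> _done = new TaskCompletionSource<bool>();

    public readonly List<string> Log = new List<string>();
    public int MoveNextCalls;

    public Task Completion { get { return _done.Task; } }

    public void MoveNext()
    {
        MoveNextCalls++;
        try
        {
            switch (_state)
            {
                case -1:
                    Log.Add("InitialWork");
                    _awaiter = Task.Run(() => Thread.Sleep(20)).GetAwaiter();
                    if (!_awaiter.IsCompleted)
                    {
                        // Not finished yet: remember where to resume, ask to be
                        // called back, and return so the caller is unblocked.
                        _state = 0;
                        _awaiter.OnCompleted(MoveNext);
                        return;
                    }
                    goto case 0;
                case 0:
                    _awaiter.GetResult(); // rethrows if the awaited work failed
                    Log.Add("FinalWork");
                    _done.SetResult(true);
                    return;
            }
        }
        catch (Exception ex)
        {
            _done.SetException(ex);
        }
    }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        var machine = new HandWrittenStateMachine();
        machine.MoveNext(); // what the rewritten method stub does: start the machine
        machine.Completion.Wait();
        Console.WriteLine(string.Join(" -> ", machine.Log));
        Console.WriteLine("MoveNext calls: " + machine.MoveNextCalls);
    }
}
```

Each additional await that does not complete immediately would add another case to the switch and another call to MoveNext.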

So what is the overhead of all this initialization, and how much persists on subsequent calls? Here, I’ve set up a simple WPF application to initiate some synchronous and asynchronous calls in response to button clicks. The potentially-asynchronous method, DoAsyncWork(), returns in 1ms. The first call to this method takes just over 1ms when called synchronously (Figure 4):

Figure 4: ANTS Performance Profiler 8 results for initial synchronous run of an example method

However, when going through the async state machine, the total time for the task to complete is over 70ms. Indeed, it takes 45ms just to get to the await statement, at which point the calling thread is unblocked (Figure 5):

Figure 5: Async Mode results for the first async run of an example method, where the Total time column shows the total time required for the async method to complete

There’s a lot of initialization happening here, and fortunately the overhead is much lower on subsequent runs, as we’ll see in the next example. When these methods are called 1000 times in a loop, the synchronous calls complete in barely any more time than the 1000ms the work itself would take (Figure 6):

Figure 6: Results after running an example method synchronously in a loop 1000 times

When the same function is called through an async method in a loop, however, the total time increases because of the additional overhead. In this particular example, which involves both async and scheduling tasks to the thread pool, the increase is around 150ms over the 1000ms that the work itself takes, measured after running the method once to exclude JIT and thread pool initialization overhead (Figure 7). That’s an increase of around 15%.
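If you want to try a comparison of this kind without a profiler, a rough sketch with Stopwatch might look like the following. It is not the WPF application used for the figures in this article: the OverheadBenchmark class and its methods are hypothetical, Thread.Sleep(1) merely stands in for a 1ms unit of work, and the actual sleep length depends on the operating system's timer resolution, so the numbers will differ from those quoted here.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class OverheadBenchmark
{
    // Stand-in for the short unit of work.
    static void DoWork() { Thread.Sleep(1); }

    // The same work, dispatched to the thread pool and awaited.
    static async Task DoWorkAsync()
    {
        await Task.Run(() => DoWork());
    }

    public static TimeSpan TimeSync(int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) DoWork();
        return sw.Elapsed;
    }

    public static async Task<TimeSpan> TimeAsync(int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) await DoWorkAsync();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Warm-up run so that JIT and thread pool start-up are not measured.
        TimeSync(1);
        TimeAsync(1).Wait();

        Console.WriteLine("sync:  " + TimeSync(1000).TotalMilliseconds + " ms");
        Console.WriteLine("async: " + TimeAsync(1000).Result.TotalMilliseconds + " ms");
    }
}
```

Treat the output as indicative only; a profiler gives a far better breakdown of where the extra time goes.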

Figure 7: Results after running the example method asynchronously in a loop 1000 times

The continued overhead of async/await and the dispatching of individual tasks to the thread pool is actually quite small given the amount of work that’s being done, and there’s certainly no reason not to use it for methods that are potentially slow, especially given the benefits of running such code asynchronously. However, the overhead isn’t zero, so if you’re looking to maximize the performance of frequently-called code you may want to avoid async/await for very short methods, especially those called in a loop. Instead, wrap the async code around potentially slow methods or larger units of work, where the added overhead is negligible.

Conclusion

Avoid using async/await for very short methods or having await statements in tight loops (run the whole loop asynchronously instead). Microsoft recommends that any method that might take longer than 50ms to return should run asynchronously, so you may wish to use this figure to determine whether it’s worth using the async/await pattern.
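As a sketch of what running the whole loop asynchronously can look like, the two methods below compute the same result. The first awaits a separate task on every iteration; the second dispatches the loop once as a single task. LoopPlacement and Square are hypothetical names, and Square is just a placeholder for a very short method.

```csharp
using System;
using System.Threading.Tasks;

public static class LoopPlacement
{
    // Placeholder for a very short unit of work.
    static int Square(int x) { return x * x; }

    // Pays the await and Task.Run cost on every iteration.
    public static async Task<long> AwaitInsideLoop(int count)
    {
        long sum = 0;
        for (int i = 0; i < count; i++)
        {
            int n = i; // copy so the lambda does not capture the loop variable
            sum += await Task.Run(() => Square(n));
        }
        return sum;
    }

    // Pays that cost once: the loop itself runs inside a single task.
    public static Task<long> LoopInsideTask(int count)
    {
        return Task.Run(() =>
        {
            long sum = 0;
            for (int i = 0; i < count; i++) sum += Square(i);
            return sum;
        });
    }

    public static void Main()
    {
        Console.WriteLine(AwaitInsideLoop(100).Result);
        Console.WriteLine(LoopInsideTask(100).Result);
    }
}
```

Both forms keep the calling thread free when awaited; the second simply sets up one task and one continuation instead of one per iteration.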

 

  • Matthew Schaad

    There Is Not Always a Thread Transition
    There is not always a thread transition with async methods. As Eric Lippert puts it, the whole point of async is to *avoid* thread transitions wherever possible.

    http://blogs.msdn.com/b/ericlippert/archive/2010/10/29/asynchronous-programming-in-c-5-0-part-two-whence-await.aspx

  • Anonymous

    It’s a bad assumption that all software is synchronous
    Forcing development overhead by defaulting to asynchronous processing is perhaps the biggest mistake Microsoft and other companies have made.

    I have plenty of real world examples of a design that requires synchronous processing where any need for asynchronous is easily a minority part of the design.

    Also, 98% of our clients don’t want or need POS (point of sale) mobile applications (no WinRT).

    I’ve used Async/await in my Silverlight 5 application (yes .NET 4.0) with the appropriate extensions applied. Its usage is hokey at best, not well thought out, and clearly a Band-Aid … for example methods can NOT have any ByRef parameters. The list of limitations makes its usage difficult to implement, requires major code changes, and for the most part is to be avoided.

    But again, synchronous programming should be the "Default" and asynchronous left up to the design/development team if they feel it’s necessary. The ONLY reason this isn’t so (reversed), is because Microsoft don’t want their OS to appear slow (the UI side) … so once again it’s all about what Microsoft want, and not what developers want.

    Bottom line, don’t use Async/Await if you can avoid it … sadly, convoluted code is going to have to rule the day until Microsoft start producing tools we want rather than tools to make them "look good".

  • Jon Harrop

    Async in F#
    How does this compare with async in F#?

  • Morten Hartlev Lindhart

    The simple example you provide is not simple. It creates (or takes from the threadpool) a thread, by using the “Task.Run” method. This is very fine if you want to do parallel computing, but the scope here is async/await – so here it’s a misuse, since it causes overhead.

    I did a simple example myself with an asynchronous vs. a synchronous method that takes about 15 milliseconds (calling a webservice) – both were run about 200 times (about 3 seconds). The 200 asynchronous calls combined were about 1-3 milliseconds slower (so statistically it was the same speed).

    When I then allowed the async methods to take advantage of the waiting period and start other calls, the async calls were a factor of 3 faster (1 second vs. 3 seconds).