With the new async syntax in Python 3.5, defining asynchronous functions has become a lot simpler. This article walks through an example of the new feature.

The example code here fetches the homepages of a set of popular websites and displays the first 10 characters of every HTTP response. It uses the awesome aiohttp library. Please make sure that your machine has Python 3.5 and aiohttp installed; the synchronous comparison also needs the requests library.

Synchronous approach

First, let’s take a look at how this would have been solved in a naive, synchronous fashion. Let’s define our set of URLs that we want to retrieve.

sites = [
    'https://www.google.com',
    'https://www.yahoo.com',
    'https://www.bbc.co.uk',
    'https://en.wikipedia.org',
    'https://de.wikipedia.org',
    'https://news.ycombinator.com',
    'https://www.tagesschau.de',
]
FORMAT_STRING = "{site.url:<30.30}: {site.text:.10} in {site.elapsed}s"

We also need some logic to retrieve the pages.

from requests import get

def get_site_snippet(site):
    return FORMAT_STRING.format(site=get(site))

def main():
    for site in sites:
        print(get_site_snippet(site))

main()

Once we run our example, we see the following output:

https://www.google.com/       : <!doctype  in 0:00:00.798940s
https://www.yahoo.com/        : <?xml vers in 0:00:00.883083s
http://www.bbc.co.uk/         : <!DOCTYPE  in 0:00:01.479646s
https://en.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.172367s
https://de.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.166793s
https://news.ycombinator.com/ : <html op=" in 0:00:01.464526s
https://www.tagesschau.de/    : <!DOCTYPE  in 0:00:00.991291s

This looks good. But knowing how this kind of code works, we realize that a lot of time is spent waiting for external resources. HTTP servers can take a long time to respond, and depending on a multitude of factors, results are almost never delivered instantly.

Since one HTTP request has to finish before the next one can start, our program loses a considerable amount of time waiting on an external resource. Such a program is said to be I/O bound.

By interleaving the execution of several function calls, you can save a considerable amount of time. In this example, this means that while one function call is busy waiting for an HTTP response, another function call can already take the next site name and start the next HTTP request.
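The saving can be illustrated without touching the network. The following is a minimal sketch, not the article's example: asyncio.sleep stands in for a slow HTTP request, and the names fake_request, run_sequentially, run_interleaved and timed as well as the delays are made up for illustration. The async syntax it relies on is explained in the next section.

```python
import asyncio
from time import perf_counter

# Hypothetical latencies in seconds, standing in for slow HTTP servers
DELAYS = [0.3, 0.2, 0.1]

async def fake_request(delay):
    # asyncio.sleep yields control, much like waiting on a socket would
    await asyncio.sleep(delay)
    return delay

async def run_sequentially():
    # Each wait has to finish before the next one starts
    results = []
    for delay in DELAYS:
        results.append(await fake_request(delay))
    return results

async def run_interleaved():
    # All waits are in flight at the same time
    return await asyncio.gather(*(fake_request(delay) for delay in DELAYS))

def timed(coroutine):
    # Run one coroutine on a fresh event loop and measure the wall-clock time
    loop = asyncio.new_event_loop()
    try:
        start = perf_counter()
        result = loop.run_until_complete(coroutine)
        return result, perf_counter() - start
    finally:
        loop.close()

sequential_result, sequential_time = timed(run_sequentially())
interleaved_result, interleaved_time = timed(run_interleaved())
print("sequential:  {:.2f}s".format(sequential_time))
print("interleaved: {:.2f}s".format(interleaved_time))
```

On an otherwise idle machine, the sequential run should take roughly the sum of the delays, and the interleaved run roughly the longest single delay.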

Async all the things

We run the same task again, but with a twist. Instead of synchronously executing the get_site_snippet function, we are going to asynchronously get all website results and join the results in the end. Let’s take a look at how to achieve that.

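from datetime import datetime
from types import SimpleNamespace
from aiohttp import get  # module-level helper in older aiohttp releases

async def async_get_site_snippet(site):
    start = datetime.now()
    response = await get(site)
    content = await response.read()
    # FORMAT_STRING expects url, text and elapsed attributes on its argument
    info = SimpleNamespace(url=str(response.url), text=content.decode(errors='replace'),
                           elapsed=datetime.now() - start)
    return FORMAT_STRING.format(site=info)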

A keen observer immediately notices the use of async and await. I won’t go into the details of how CPython handles these internally. For our purposes, it means that the functions we are calling do not return the desired result immediately. Instead, calling an async function returns a promise to eventually calculate a result. To be more precise, calling async_get_site_snippet('https://www.google.com') returns a coroutine object. The same applies to the two awaited calls: get(site) and response.read() both return such a promise, and await suspends our function until the result is available.
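To make the point about coroutine objects concrete, here is a small network-free sketch. The function record_call and the list calls are hypothetical names used only for this illustration; the event loop used at the end is covered in the following steps.

```python
import asyncio

calls = []

async def record_call():
    # This body only runs once something drives the coroutine
    calls.append("ran")
    return "done"

coroutine = record_call()
print(asyncio.iscoroutine(coroutine))  # True
print(calls)                           # [] : the body has not started yet

# Driving the coroutine on an event loop finally executes the body
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(coroutine)
finally:
    loop.close()
print(calls, result)                   # ['ran'] done
```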

Now, by itself the coroutine object does nothing. Getting to the result of every coroutine call involves some extra code that we are going to add now.

First, a list of tasks needs to be created, containing all the coroutine objects that should be run.

from asyncio import get_event_loop, wait

def async_main():
    tasks = [async_get_site_snippet(site) for site in sites]

Then we get hold of an event loop, a BaseEventLoop object, which runs our tasks until all of them have completed. To execute the coroutines concurrently, the list of tasks needs to be wrapped with asyncio.wait. The event loop can then run the wrapped tasks until all results are available.

    loop = get_event_loop()
    # wait() returns (done, pending); without a timeout, pending is empty and can be discarded
    result, _ = loop.run_until_complete(wait(tasks))
    loop.close()

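loop.run_until_complete returns two sets: the finished tasks and the pending ones. We keep the finished tasks in result and read each task’s return value with task.result(). Because result is a set, the snippets are not necessarily printed in the order of sites.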

    for task in result:
        print(task.result())

async_main()
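The behaviour of asyncio.wait can be tried out without the network. The sketch below makes a few assumptions that are not part of the article: fake_snippet and gather_in_order are hypothetical helpers, asyncio.sleep replaces the HTTP round trip, and the coroutines are wrapped with loop.create_task, which newer Python versions require before passing them to wait. It also shows asyncio.gather as an alternative that keeps the input order.

```python
import asyncio

async def fake_snippet(name, delay):
    # asyncio.sleep stands in for the HTTP round trip
    await asyncio.sleep(delay)
    return name

async def gather_in_order():
    # gather() returns the results as a list, in the order the coroutines were passed
    return await asyncio.gather(fake_snippet("slow", 0.2), fake_snippet("fast", 0.05))

loop = asyncio.new_event_loop()
try:
    # Wrap each coroutine in a Task before handing it to wait()
    tasks = [
        loop.create_task(fake_snippet("slow", 0.2)),
        loop.create_task(fake_snippet("fast", 0.05)),
    ]
    done, pending = loop.run_until_complete(asyncio.wait(tasks))
    # done is a set, so it is sorted here to get a stable order for printing
    wait_results = sorted(task.result() for task in done)
    gather_results = loop.run_until_complete(gather_in_order())
finally:
    loop.close()

print(wait_results)    # ['fast', 'slow']
print(len(pending))    # 0
print(gather_results)  # ['slow', 'fast']
```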

Pretty neat. The benefits become obvious immediately when we compare execution times:

from time import time

print("Running synchronous example")
start = time()
main()
duration = time() - start

print("Running asynchronous example")
async_start = time()
async_main()
async_duration = time() - async_start

print("Synchronous example took {} seconds".format(duration))
print("Asynchronous example took {} seconds".format(async_duration))

This outputs the following on my trusty laptop:

Running synchronous example
https://www.google.com/       : <!doctype  in 0:00:00.798940s
https://www.yahoo.com/        : <?xml vers in 0:00:00.883083s
http://www.bbc.co.uk/         : <!DOCTYPE  in 0:00:01.479646s
https://en.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.172367s
https://de.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.166793s
https://news.ycombinator.com/ : <html op=" in 0:00:01.464526s
https://www.tagesschau.de/    : <!DOCTYPE  in 0:00:00.991291s
Running asynchronous example
https://www.google.com/       : <!doctype  in 0:00:00.618827s
http://www.bbc.co.uk/         : <!DOCTYPE  in 0:00:00.501347s
https://en.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.169479s
https://www.yahoo.com/        : <?xml vers in 0:00:00.762460s
https://news.ycombinator.com/ : <html op=" in 0:00:00.711696s
https://www.tagesschau.de/    : <!DOCTYPE  in 0:00:00.645607s
https://de.wikipedia.org/wiki/: <!DOCTYPE  in 0:00:00.167020s
Synchronous example took 12.025413990020752 seconds
Asynchronous example took 6.950876951217651 seconds

While the results vary depending on network and server load, it is clear that we can shave off several seconds of execution time by not letting HTTP requests block each other. In this run, the total dropped from about 12.0 to about 7.0 seconds, a reduction of roughly 40 percent.
