Last Updated on November 14, 2023
We can query the HTTP status of websites using asyncio by opening a stream, sending an HTTP request, and then reading the response.
Once developed, we can use this mechanism to query the status of many websites concurrently and even report the results dynamically.
In this tutorial, you will discover how to develop a concurrent client program to check the HTTP status of multiple web pages.
After completing this tutorial, you will know:
- How to manually manage the life-cycle of opening, sending, receiving, and closing an HTTP connection in asyncio.
- How to sequentially and then concurrently query the status of multiple websites.
- How to concurrently query and dynamically report responses from servers as they are received with asyncio.
Let’s get started.
How to Check HTTP Status with Asyncio
The asyncio module provides support for opening socket connections and reading and writing data via streams.
We can use this capability to check the status of web pages.
This involves four steps:
- 1. Open a connection
- 2. Write a request
- 3. Read a response
- 4. Close the connection
Let’s take a closer look at each part in turn.
Open HTTP Connection
A connection can be opened in asyncio using the asyncio.open_connection() function.
Establish a network connection and return a pair of (reader, writer) objects.
— Asyncio Streams
Among its many arguments, the function takes the hostname as a string and the port number as an integer.
This is a coroutine that must be awaited and returns a StreamReader and a StreamWriter for reading and writing with the socket.
This can be used to open an HTTP connection on port 80.
For example:
```python
...
# open a socket connection
reader, writer = await asyncio.open_connection('www.google.com', 80)
```
We can also open an SSL connection using the ssl=True argument. This can be used to open an HTTPS connection on port 443.
For example:
```python
...
# open a socket connection with SSL
reader, writer = await asyncio.open_connection('www.google.com', 443, ssl=True)
```
Write HTTP Request
Once open, we can write a query to the StreamWriter to make an HTTP request.
StreamWriter: Represents a writer object that provides APIs to write data to the IO stream.
— Asyncio Streams
An HTTP version 1.1 request is plain text. For example, a request for the file path ‘/’ may look as follows:
```
GET / HTTP/1.1
Host: www.google.com
```
Importantly, there must be a carriage return and a line feed (\r\n) at the end of each line, and an empty line at the end.
As Python strings this may look as follows:
```python
'GET / HTTP/1.1\r\n'
'Host: www.google.com\r\n'
'\r\n'
```
You can learn more about HTTP v1.1 request messages here:
This string must be encoded as bytes before being written to the StreamWriter.
This can be achieved using the encode() method on the string itself.
The default ‘utf-8’ encoding may be sufficient.
For example:
```python
...
# encode string as bytes
byte_data = string.encode()
```
You can see a listing of encodings here:
The bytes can then be written to the socket using the write() method of the StreamWriter.
For example:
```python
...
# write query to socket
writer.write(byte_data)
```
After writing the request, it is a good idea to wait for the byte data to be sent and for the socket to be ready.
This can be achieved by the drain() method.
Wait until it is appropriate to resume writing to the stream.
— Asyncio Streams
This is a coroutine that must be awaited.
For example:
```python
...
# wait for the socket to be ready
await writer.drain()
```
Read HTTP Response
Once the HTTP request has been made, we can read the response.
This can be achieved via the StreamReader for the socket.
Represents a reader object that provides APIs to read data from the IO stream.
— Asyncio Streams
The response can be read using the read() method which will read a chunk of bytes, or the readline() method which will read one line of bytes.
We might prefer the readline() method because the HTTP protocol is text-based and the response header is sent one line at a time.
The readline() method is a coroutine and must be awaited.
For example:
```python
...
# read one line of response
line_bytes = await reader.readline()
```
HTTP 1.1 responses are composed of two parts: a header, then the body, separated by an empty line.
The header has information about whether the request was successful and what type of file will be sent, and the body contains the content of the file, such as an HTML webpage.
The first line of the HTTP header contains the HTTP status for the requested page on the server.
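For example, the start of a response to a successful request might look as follows (an illustrative example; the exact headers vary by server):

```
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8

<!doctype html>
...
```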
You can learn more about HTTP v1.1 responses here:
Each line must be decoded from bytes into a string.
This can be achieved using the decode() method on the byte data. Again, the default ‘utf-8’ encoding may be sufficient.
For example:
```python
...
# decode bytes into a string
line_data = line_bytes.decode()
```
Close HTTP Connection
We can close the socket connection by closing the StreamWriter.
This can be achieved by calling the close() method.
For example:
```python
...
# close the connection
writer.close()
```
This does not block and may not close the socket immediately.
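If we want to wait until the connection is fully closed, we can also await the wait_closed() coroutine method on the StreamWriter (available since Python 3.7). For example:

```python
...
# close the connection
writer.close()
# wait until the connection is fully closed
await writer.wait_closed()
```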
Now that we know how to make HTTP requests and read responses using asyncio, let’s look at some worked examples of checking web page statuses.
Example of Checking HTTP Status Sequentially
We can develop an example to check the HTTP status for multiple websites using asyncio.
In this example, we will first develop a coroutine that will check the status of a given URL. We will then call this coroutine once for each of the top 10 websites.
Firstly, we can define a coroutine that will take a URL string and return the HTTP status.
```python
# get the HTTP/S status of a webpage
async def get_status(url):
    # ...
```
The URL must be parsed into its constituent components.
We require the hostname and file path when making the HTTP request. We also need to know the URL scheme (HTTP or HTTPS) in order to determine whether or not SSL is required.
This can be achieved using the urllib.parse.urlsplit() function that takes a URL string and returns a named tuple of all the URL elements.
```python
...
# split the url into components
url_parsed = urlsplit(url)
```
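For example, splitting a URL gives us the scheme, hostname, and path that we need (a quick illustration using one of the URLs checked below):

```python
from urllib.parse import urlsplit

# split an example url into its components
url_parsed = urlsplit('https://www.google.com/')
# report the components used when making the request
print(url_parsed.scheme)    # https
print(url_parsed.hostname)  # www.google.com
print(url_parsed.path)      # /
```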
We can then open the HTTP connection based on the URL scheme and use the URL hostname.
```python
...
# open the connection
if url_parsed.scheme == 'https':
    reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
else:
    reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
```
Next, we can create the HTTP GET request using the hostname and file path and write the encoded bytes to the socket using the StreamWriter.
```python
...
# send GET request
query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
# write query to socket
writer.write(query.encode())
# wait for the bytes to be written to the socket
await writer.drain()
```
Next, we can read the HTTP response.
We only require the first line of the response that contains the HTTP status.
```python
...
# read the single line response
response = await reader.readline()
```
The connection can then be closed.
```python
...
# close the connection
writer.close()
```
Finally, we can decode the bytes read from the server, remove trailing white space, and return the HTTP status.
```python
...
# decode and strip white space
status = response.decode().strip()
# return the response
return status
```
Tying this together, the complete get_status() coroutine is listed below.
It does not have any error handling, such as the case where the host cannot be reached or is slow to respond.
These additions would make a nice extension for the reader.
```python
# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status
```
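As a rough sketch of one such extension (a hypothetical wrapper, not part of the worked example), we could wrap the call in a try/except and impose a timeout with asyncio.wait_for() so that unreachable or slow hosts report an error instead of raising an exception:

```python
# get the HTTP/S status of a webpage, reporting failures as the status (hypothetical extension)
async def get_status_safe(url, timeout=5):
    try:
        # impose a timeout on the whole request
        return await asyncio.wait_for(get_status(url), timeout)
    except asyncio.TimeoutError:
        # the host was too slow to respond
        return 'ERROR: timed out'
    except OSError as e:
        # the connection could not be established
        return f'ERROR: {e}'
```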
Next, we can call the get_status() coroutine for multiple web pages or websites we want to check.
In this case, we will define a list of the top 10 web pages in the world.
```python
...
# list of top 10 websites to check
sites = ['https://www.google.com/',
    'https://www.youtube.com/',
    'https://www.facebook.com/',
    'https://twitter.com/',
    'https://www.instagram.com/',
    'https://www.baidu.com/',
    'https://www.wikipedia.org/',
    'https://yandex.ru/',
    'https://yahoo.com/',
    'https://www.whatsapp.com/'
    ]
```
We can then query each, in turn, using our get_status() coroutine.
In this case, we will do so sequentially in a loop, and report the status of each in turn.
```python
...
# check the status of all websites
for url in sites:
    # get the status for the url
    status = await get_status(url)
    # report the url and its status
    print(f'{url:30}:\t{status}')
```
We can do better than sequential when using asyncio, but this provides a good starting point that we can improve upon later.
Tying this together, the main() coroutine queries the status of the top 10 websites.
```python
# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # check the status of all websites
    for url in sites:
        # get the status for the url
        status = await get_status(url)
        # report the url and its status
        print(f'{url:30}:\t{status}')
```
Finally, we can create the main() coroutine and use it as the entry point to the asyncio program.
```python
...
# run the asyncio program
asyncio.run(main())
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# check the status of many webpages
import asyncio
from urllib.parse import urlsplit

# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status

# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # check the status of all websites
    for url in sites:
        # get the status for the url
        status = await get_status(url)
        # report the url and its status
        print(f'{url:30}:\t{status}')

# run the asyncio program
asyncio.run(main())
```
Running the example first creates the main() coroutine and uses it as the entry point into the program.
The main() coroutine runs, defining a list of the top 10 websites.
The list of websites is then traversed sequentially. The main() coroutine suspends and calls the get_status() coroutine to query the status of one website.
The get_status() coroutine runs, parses the URL, and opens a connection. It constructs an HTTP GET query and writes it to the host. A response is read, decoded, and returned.
The main() coroutine resumes and reports the HTTP status of the URL.
This is repeated for each URL in the list.
The program takes about 5.6 seconds to complete, or about half a second per URL on average.
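The exact timing will vary with your network connection and location. As a minimal sketch, we could measure it ourselves by wrapping the entry point with the time module:

```python
import time

# record the start time
start = time.perf_counter()
# run the asyncio program
asyncio.run(main())
# report the overall duration
print(f'Took {time.perf_counter() - start:.1f} seconds')
```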
This highlights how we can use asyncio to query the HTTP status of webpages.
Nevertheless, it does not take full advantage of asyncio to execute tasks concurrently.
```
https://www.google.com/       :    HTTP/1.1 200 OK
https://www.youtube.com/      :    HTTP/1.1 200 OK
https://www.facebook.com/     :    HTTP/1.1 302 Found
https://twitter.com/          :    HTTP/1.1 200 OK
https://www.instagram.com/    :    HTTP/1.1 200 OK
https://www.baidu.com/        :    HTTP/1.1 200 OK
https://www.wikipedia.org/    :    HTTP/1.1 200 OK
https://yandex.ru/            :    HTTP/1.1 302 Moved temporarily
https://yahoo.com/            :    HTTP/1.1 301 Moved Permanently
https://www.whatsapp.com/     :    HTTP/1.1 302 Found
```
Next, let’s look at how we might update the example to execute the coroutines concurrently.
Example of Checking Website Status Concurrently
A benefit of asyncio is that we can execute many coroutines concurrently.
There are a number of ways that this can be achieved.
We will explore two examples in this section.
- 1. Issue independent tasks and wait for them to complete.
- 2. Gather the coroutines.
Let’s dive in.
Example Using asyncio.wait()
We can update the above example to query the status of URLs concurrently using asyncio.
One approach is to issue each get_status() coroutine as an independent task. We can then wait for all of the issued tasks to complete. Once complete, we can report the status of each URL.
Firstly, we can schedule coroutines as independent tasks using the asyncio.create_task() function that returns an asyncio.Task object.
We need to associate the URL with each task. This can be achieved using a dict where the asyncio.Task objects are taken as keys and the URL string is taken as the value. We can then look up the URL for a task later when reporting results.
```python
...
# create all task requests
tasks_to_url = {asyncio.create_task(get_status(url)): url for url in sites}
```
You can learn more about how to issue asyncio tasks in the tutorial:
This allows all tasks to be executed concurrently.
Next, we can wait for all issued tasks to complete.
This can be achieved using the asyncio.wait() function. It takes a collection of awaitables, suspends the caller, and by default returns once all tasks are complete.
```python
...
# wait for all tasks to complete
_ = await asyncio.wait(tasks_to_url)
```
This allows the scheduled tasks to run.
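Note that asyncio.wait() returns a tuple of two sets of tasks, those that are done and those still pending. We ignore the return value here because, by default, the call only returns once all tasks are done, but we could capture it if needed, for example:

```python
...
# wait for all tasks to complete and capture the done and pending sets
done, pending = await asyncio.wait(tasks_to_url)
```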
You can learn more about waiting for tasks in the tutorial:
Once all tasks are complete, the status of each URL can be reported.
We can traverse the dict of tasks, retrieve the URL for each, retrieve the status from the task and report the result.
```python
...
# report all results
for task in tasks_to_url:
    # get the url
    url = tasks_to_url[task]
    # get the status
    status = task.result()
    # report status
    print(f'{url:30}:\t{status}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# check the status of many webpages
import asyncio
from urllib.parse import urlsplit

# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status

# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # create all task requests
    tasks_to_url = {asyncio.create_task(get_status(url)): url for url in sites}
    # wait for all tasks to complete
    _ = await asyncio.wait(tasks_to_url)
    # report all results
    for task in tasks_to_url:
        # get the url
        url = tasks_to_url[task]
        # get the status
        status = task.result()
        # report status
        print(f'{url:30}:\t{status}')

# run the asyncio program
asyncio.run(main())
```
Running the example executes the main() coroutine as before.
In this case, a get_status() coroutine is created and scheduled as a task for each URL in the list.
The dict is created, mapping each asyncio.Task object to its URL string.
The main() coroutine then suspends and waits for all issued tasks to complete.
The tasks complete as fast as they are able. They suspend when awaiting the connection to open and when reading a response from the server. This allows cooperative multitasking between the tasks.
Once all tasks are complete, the main() coroutine resumes and results are reported.
This highlights how we can query website status concurrently using asyncio by scheduling tasks and waiting for them to be completed.
It is faster than the sequential version above, completing in about 1.3 seconds on my system.
```
https://www.google.com/       :    HTTP/1.1 200 OK
https://www.youtube.com/      :    HTTP/1.1 200 OK
https://www.facebook.com/     :    HTTP/1.1 302 Found
https://twitter.com/          :    HTTP/1.1 200 OK
https://www.instagram.com/    :    HTTP/1.1 200 OK
https://www.baidu.com/        :    HTTP/1.1 200 OK
https://www.wikipedia.org/    :    HTTP/1.1 200 OK
https://yandex.ru/            :    HTTP/1.1 302 Moved temporarily
https://yahoo.com/            :    HTTP/1.1 301 Moved Permanently
https://www.whatsapp.com/     :    HTTP/1.1 302 Found
```
Next, let’s look at another way to execute the coroutines concurrently.
Example Using asyncio.gather()
We can query the status of websites concurrently in asyncio using the asyncio.gather() function.
This function takes one or more coroutines, suspends the caller until the provided coroutines are complete, and returns their results as a list in the same order. We can then traverse the list of URLs and the list of return values together and report the results.
This may be a simpler approach than the above.
First, we can create a list of coroutines.
```python
...
# create all coroutine requests
coros = [get_status(url) for url in sites]
```
Next, we can execute the coroutines and get the iterable of results using asyncio.gather().
Note that we cannot provide the list of coroutines directly, but instead must unpack the list into separate expressions that are provided as positional arguments to the function.
```python
...
# execute all coroutines and wait
results = await asyncio.gather(*coros)
```
This will execute all of the coroutines concurrently and retrieve their results.
You can learn more about the asyncio.gather() function in the tutorial:
We can then traverse the list of URLs and the returned statuses together and report each in turn.
```python
...
# process all results
for url, status in zip(sites, results):
    # report status
    print(f'{url:30}:\t{status}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# check the status of many webpages
import asyncio
from urllib.parse import urlsplit

# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status

# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # create all coroutine requests
    coros = [get_status(url) for url in sites]
    # execute all coroutines and wait
    results = await asyncio.gather(*coros)
    # process all results
    for url, status in zip(sites, results):
        # report status
        print(f'{url:30}:\t{status}')

# run the asyncio program
asyncio.run(main())
```
Running the example executes the main() coroutine as before.
In this case, a list of coroutines is created in a list comprehension.
The asyncio.gather() function is then called, passing the coroutines and suspending the main() coroutine until they are all complete.
The coroutines execute, querying each website concurrently and returning their status.
The main() coroutine resumes and receives an iterable of status values. This iterable along with the list of URLs is then traversed using the zip() built-in function and the statuses are reported.
This highlights a simpler approach to executing the coroutines concurrently and reporting the results after all tasks are completed.
It is also faster than the sequential version above, completing in about 1.4 seconds on my system.
```
https://www.google.com/       :    HTTP/1.1 200 OK
https://www.youtube.com/      :    HTTP/1.1 200 OK
https://www.facebook.com/     :    HTTP/1.1 302 Found
https://twitter.com/          :    HTTP/1.1 200 OK
https://www.instagram.com/    :    HTTP/1.1 200 OK
https://www.baidu.com/        :    HTTP/1.1 200 OK
https://www.wikipedia.org/    :    HTTP/1.1 200 OK
https://yandex.ru/            :    HTTP/1.1 302 Moved temporarily
https://yahoo.com/            :    HTTP/1.1 301 Moved Permanently
https://www.whatsapp.com/     :    HTTP/1.1 302 Found
```
Next, let’s explore how we might query HTTP statuses concurrently and report results dynamically, as they become available.
Example of Checking Website Status Dynamically
We can execute many tasks concurrently and traverse them in the order that they are completed.
This approach can then be used to report website status dynamically, as the statuses become available.
This can be achieved using the asyncio.as_completed() function.
This function takes an iterable of awaitables as an argument, such as a list of coroutines. It wraps each awaitable in a new coroutine for execution and returns an iterable that yields these new coroutines in the order that they are completed.
Although we can use the as_completed() function to report HTTP statuses dynamically, we have no easy way to relate the yielded coroutines back to the URLs that were provided as arguments.
Instead, we can update the get_status() coroutine to return the status and the URL together.
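The only change to the get_status() coroutine is the final return statement, which now returns both values together. For example:

```python
    ...
    # return the response status and the url together
    return status, url
```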
First, we can create a list of coroutines, one for each URL using a list comprehension.
```python
...
# create all coroutine requests
coros = [get_status(url) for url in sites]
```
We can then call the asyncio.as_completed() function with the list of coroutines. This returns an iterable that yields coroutines in the order they are completed.
You can learn more about the asyncio.as_completed() function in the tutorial:
We can await each in turn to get the return value, in this case, the HTTP status and URL, which can be reported.
```python
...
# traverse tasks in completion order
for coro in asyncio.as_completed(coros):
    # get status from task
    status, url = await coro
    # report status
    print(f'{url:30}:\t{status}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# check the status of many webpages
import asyncio
from urllib.parse import urlsplit

# get the HTTP/S status of a webpage
async def get_status(url):
    # split the url into components
    url_parsed = urlsplit(url)
    # open the connection
    if url_parsed.scheme == 'https':
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 443, ssl=True)
    else:
        reader, writer = await asyncio.open_connection(url_parsed.hostname, 80)
    # send GET request
    query = f'GET {url_parsed.path} HTTP/1.1\r\nHost: {url_parsed.hostname}\r\n\r\n'
    # write query to socket
    writer.write(query.encode())
    # wait for the bytes to be written to the socket
    await writer.drain()
    # read the single line response
    response = await reader.readline()
    # close the connection
    writer.close()
    # decode and strip white space
    status = response.decode().strip()
    # return the response
    return status, url

# main coroutine
async def main():
    # list of top 10 websites to check
    sites = ['https://www.google.com/',
        'https://www.youtube.com/',
        'https://www.facebook.com/',
        'https://twitter.com/',
        'https://www.instagram.com/',
        'https://www.baidu.com/',
        'https://www.wikipedia.org/',
        'https://yandex.ru/',
        'https://yahoo.com/',
        'https://www.whatsapp.com/'
        ]
    # create all coroutine requests
    coros = [get_status(url) for url in sites]
    # traverse tasks in completion order
    for coro in asyncio.as_completed(coros):
        # get status from task
        status, url = await coro
        # report status
        print(f'{url:30}:\t{status}')

# run the asyncio program
asyncio.run(main())
```
Running the example executes the main() coroutine as before.
In this case, a list of coroutines is created.
This is then provided to the asyncio.as_completed() function. This wraps each coroutine in a new coroutine (in case a timeout is needed) and returns an iterable.
We can traverse the iterable using a normal for loop, which yields the wrapped coroutines in the order they are completed.
Once yielded, we can retrieve the return values by awaiting them and reporting the website status.
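If needed, an overall timeout can be applied across all of the coroutines via the timeout argument to asyncio.as_completed(). Any coroutine yielded after the timeout has expired will raise a TimeoutError when awaited. A minimal sketch, assuming a 5 second limit:

```python
...
# traverse tasks in completion order, with an overall timeout of 5 seconds
for coro in asyncio.as_completed(coros, timeout=5):
    try:
        # get status from task
        status, url = await coro
        # report status
        print(f'{url:30}:\t{status}')
    except asyncio.TimeoutError:
        # the overall timeout expired before this task completed
        print('Timed out waiting for a result')
```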
This highlights how we can check HTTP status concurrently with asyncio and report results dynamically as tasks are completed.
The example may be slightly faster than the above concurrent examples, completing in about 1.2 seconds on my system.
```
https://www.youtube.com/      :    HTTP/1.1 200 OK
https://www.facebook.com/     :    HTTP/1.1 302 Found
https://www.google.com/       :    HTTP/1.1 200 OK
https://www.whatsapp.com/     :    HTTP/1.1 302 Found
https://twitter.com/          :    HTTP/1.1 200 OK
https://www.instagram.com/    :    HTTP/1.1 200 OK
https://www.wikipedia.org/    :    HTTP/1.1 200 OK
https://yahoo.com/            :    HTTP/1.1 301 Moved Permanently
https://www.baidu.com/        :    HTTP/1.1 200 OK
https://yandex.ru/            :    HTTP/1.1 302 Moved temporarily
```
Further Reading
This section provides additional resources that you may find helpful.
Python Asyncio Books
- Python Asyncio Mastery, Jason Brownlee (my book!)
- Python Asyncio Jump-Start, Jason Brownlee.
- Python Asyncio Interview Questions, Jason Brownlee.
- Asyncio Module API Cheat Sheet
I also recommend the following books:
- Python Concurrency with asyncio, Matthew Fowler, 2022.
- Using Asyncio in Python, Caleb Hattingh, 2020.
- asyncio Recipes, Mohamed Mustapha Tahrioui, 2019.
APIs
- asyncio — Asynchronous I/O
- Asyncio Coroutines and Tasks
- Asyncio Streams
- Asyncio Subprocesses
- Asyncio Queues
- Asyncio Synchronization Primitives
Takeaways
You now know how to check the status of multiple web pages concurrently using asyncio in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.