Last Updated on September 12, 2022
The ThreadPoolExecutor class in Python can be used to scan multiple ports on a server at the same time.
This can dramatically speed up the process compared to attempting to connect to each port sequentially, one by one.
In this tutorial, you will discover how to develop a multithreaded port scanner in Python.
After completing this tutorial, you will know:
- How to open a socket connection to each port sequentially and how slow it can be.
- How to use the ThreadPoolExecutor to manage a pool of worker threads.
- How to update the code to connect to multiple ports at the same time and dramatically accelerate the process.
Let’s dive in.
Scan Ports One-by-One (slowly)
We can connect to other computers by opening a socket, a practice known as socket programming.
Opening a socket requires both the name or IP address of the server and a port number on which to connect.
For example, when your web browser opens a web page on python.org, it is opening a socket connection to that server on port 80, then using the HTTP protocol to request and download (GET) an HTML file.
Socket programming or network programming is a lot of fun.
A good first socket programming project is to develop a port scanner.
This is a program that reports all of the open ports on a given server.
A simple way to implement a port scanner is to loop over all the ports you want to test and attempt to make a socket connection on each. If a connection can be made, we disconnect immediately and report that the port on the server is open.
For example, we know that port 80 is open on python.org, but what other ports might be open?
Historically, having many open ports on a server was a security risk, so it is common to lock down a public facing server and close all non-essential ports to external traffic. This means scanning public servers will likely yield few open ports in the best case, or will deny future access in the worst case, if the server thinks you’re trying to break in.
As such, although developing a port scanner is a fun socket programming exercise, we must be careful in how we use it and what servers we scan.
Next, let’s look at how we can open a socket connection on a single port.
Open a Socket Connection on a Port
Python provides socket communication in the socket module.
A socket must first be configured in terms of the type of host address and type of socket we will create, then the configured socket can be connected.
You can learn more about the socket module in the Python standard library documentation.
There are many ways to specify a host address, although perhaps the most common is the IP address (IPv4) or the domain name resolved by DNS. We can configure a socket to expect this type of address via the AF_INET constant.
There are also different socket types, the most common being a TCP or stream type socket and a less reliable UDP type socket. We will attempt to open TCP sockets in this case, as they are more commonly used for services like email, web, ftp, and so on. We can configure our socket for TCP using the SOCK_STREAM constant.
We can create and configure our socket as follows:
    ...
    # create and configure the socket
    sock = socket(AF_INET, SOCK_STREAM)
We must close our socket once we are finished with it by calling the close() function; for example:
    ...
    # close the socket
    sock.close()
While working with the socket, an exception may be raised for many reasons, such as an invalid address or a failure to connect. We must ensure that the connection is closed regardless, therefore we can automatically close the socket using the context manager; for example:
    ...
    # create and configure the socket
    with socket(AF_INET, SOCK_STREAM) as sock:
        # ...
Next, we can further configure the socket before we open a connection.
Specifically, it is a good idea to set a timeout because attempting to open network connections can be slow. We want to give up connecting and raise an exception if a given number of seconds elapses and we still haven’t connected.
This can be achieved by calling the settimeout() function on the socket. In this case, we will use a somewhat aggressive timeout of 3 seconds.
    ...
    # set a timeout of a few seconds
    sock.settimeout(3)
Finally, we can attempt to make a connection to a server.
This requires a host name and a port, which we can pair together into a tuple and pass to the connect() function.
For example:
    ...
    # attempt to connect
    sock.connect((host, port))
If the connection succeeds, we could start sending data to the server and receive it back via this socket using the protocol suggested by the port number. We don’t want to communicate with the server so we will close the connection immediately.
If the connection fails, an exception will be raised indicating that the port is likely not open (or not open to us).
Therefore, we can wrap the attempt to connect in some exception handling.
    ...
    # connecting may fail
    try:
        # attempt to connect
        sock.connect((host, port))
        # a successful connection was made
    except OSError:
        # ignore the failure, the port is closed to us
        pass
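The phrase "not open to us" hides a useful distinction: a refused connection usually means nothing is listening, whereas a timeout often means a firewall silently dropped the request. The sketch below separates these cases. The probe_port() helper and its 'open', 'closed', 'filtered', and 'error' labels are hypothetical names chosen for illustration, not part of the socket module.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Hypothetical helper: classify a TCP port as 'open', 'closed', 'filtered' or 'error'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return 'open'
        except ConnectionRefusedError:
            # the host answered and actively refused: nothing is listening
            return 'closed'
        except socket.timeout:
            # no answer within the timeout, e.g. a firewall dropped the request
            return 'filtered'
        except OSError:
            # anything else, such as a DNS failure or an unreachable network
            return 'error'
```

The specific exceptions are listed before OSError because both are subclasses of it. The rest of this tutorial only needs a True/False answer, so the simpler handling is enough there.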
Tying this together, the test_port_number() function below takes a host name and a port number and returns True if a socket connection can be opened, or False otherwise.
    # returns True if a connection can be made, False otherwise
    def test_port_number(host, port):
        # create and configure the socket
        with socket(AF_INET, SOCK_STREAM) as sock:
            # set a timeout of a few seconds
            sock.settimeout(3)
            # connecting may fail
            try:
                # attempt to connect
                sock.connect((host, port))
                # a successful connection was made
                return True
            except OSError:
                # ignore the failure
                return False
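Because we must be careful about which servers we scan, it can help to try the function against our own machine first. The sketch below is one way to do that, assuming the loopback address 127.0.0.1 is available. The port_is_open() function is a stand-in with the same logic as test_port_number(), and open_local_listener() is a hypothetical helper that starts a throwaway listener on a free port.

```python
from socket import AF_INET, SOCK_STREAM, socket


def port_is_open(host, port, timeout=3):
    """Stand-in for test_port_number(): True if a TCP connection succeeds."""
    with socket(AF_INET, SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return True
        except OSError:
            return False


def open_local_listener():
    """Hypothetical helper: start a TCP listener on localhost, return (server, port)."""
    server = socket(AF_INET, SOCK_STREAM)
    # port 0 asks the operating system to pick any free port
    server.bind(('127.0.0.1', 0))
    server.listen(1)
    return server, server.getsockname()[1]


# check the port while the listener is running, then again after closing it
server, port = open_local_listener()
print(f'listener up:   {port_is_open("127.0.0.1", port)}')
server.close()
print(f'listener down: {port_is_open("127.0.0.1", port)}')
```

The first check should succeed while the listener is running, and the second should fail quickly once it is closed, since the local machine refuses the connection rather than letting it time out.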
Next, let’s look at how we can use this function we have developed to scan a range of ports.
Scan a Range of Ports on a Server
We can scan a range of ports on a given host.
Many common internet services are provided on ports between 0 and 1023, often called the well-known ports.
The viable range of ports is 0 to 65535, and you can see a list of the most common port numbers and the services that use them in the file /etc/services on POSIX systems.
Wikipedia also has a page that lists the most common port numbers.
We will limit our scanning to the range of 0 to 1023.
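The same services database can also be queried from Python. As a small sketch, the socket module's getservbyport() function looks up the service name registered for a port. The service_name() wrapper below is a hypothetical helper that returns None when the local system has no entry, which can happen on minimal installations.

```python
import socket


def service_name(port, protocol='tcp'):
    """Hypothetical helper: service name for a port, or None if none is registered."""
    try:
        return socket.getservbyport(port, protocol)
    except OSError:
        # no entry for this port and protocol in the local services database
        return None


# look up a few commonly used ports
for port in (22, 25, 80, 443):
    print(port, service_name(port))
```

The names printed depend on the services database of the machine running the code.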
To scan a range of ports, we can repeatedly call our test_port_number() function that we developed in the previous section and report any ports that permit a connection as ‘open’.
The port_scan() function below implements this, reporting any open ports that are discovered.
    # scan port numbers on a host
    def port_scan(host, ports):
        print(f'Scanning {host}...')
        # scan each port number
        for port in ports:
            if test_port_number(host, port):
                print(f'> {host}:{port} open')
Finally, we can call this function and specify the host and range of ports.
In this case, we will port scan python.org (out of love for python, not malicious intent).
    ...
    # define host and port numbers to scan
    HOST = 'python.org'
    PORTS = range(1024)
    # test the ports
    port_scan(HOST, PORTS)
We would expect that at the least port 80 would be open for HTTP connections.
Tying this together, the complete example of port scanning a host in Python is listed below.
    # SuperFastPython.com
    # scan a range of port numbers on host one by one
    from socket import AF_INET
    from socket import SOCK_STREAM
    from socket import socket

    # returns True if a connection can be made, False otherwise
    def test_port_number(host, port):
        # create and configure the socket
        with socket(AF_INET, SOCK_STREAM) as sock:
            # set a timeout of a few seconds
            sock.settimeout(3)
            # connecting may fail
            try:
                # attempt to connect
                sock.connect((host, port))
                # a successful connection was made
                return True
            except OSError:
                # ignore the failure
                return False

    # scan port numbers on a host
    def port_scan(host, ports):
        print(f'Scanning {host}...')
        # scan each port number
        for port in ports:
            if test_port_number(host, port):
                print(f'> {host}:{port} open')

    # define host and port numbers to scan
    HOST = 'python.org'
    PORTS = range(1024)
    # test the ports
    port_scan(HOST, PORTS)
Running the example attempts to make a connection for each port number between 0 and 1023 (1024 minus one) and reports all open ports.
In this case, we can see that port 80 for HTTP is open as expected, and port 443 is also open for HTTPS.
The program works fine, but it is painfully slow.
On my system, it took 235.8 seconds to complete (nearly 4 minutes).
    Scanning python.org...
    > python.org:80 open
    > python.org:443 open
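For context, 235.8 seconds across 1,024 ports works out to roughly 0.23 seconds per port on average, and a scan in which every port hit the 3 second timeout would take 1024 × 3 = 3,072 seconds. To measure a run on your own system, a small timing wrapper is one option. The timed() helper below is a hypothetical name, and the example call uses a stand-in workload so that the sketch runs without touching the network.

```python
from time import perf_counter


def timed(func, *args, **kwargs):
    """Hypothetical helper: call func and return (result, elapsed_seconds)."""
    start = perf_counter()
    result = func(*args, **kwargs)
    return result, perf_counter() - start


# stand-in workload so the sketch runs without any network access
result, seconds = timed(sum, range(1_000_000))
print(f'result={result}, took {seconds:.3f} seconds')
```

With the scanner defined, a call such as timed(port_scan, HOST, PORTS) would report the duration of a full scan.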
Next, we will look at the ThreadPoolExecutor class that can be used to create a pool of worker threads that will allow us to speed up this port scanning process.
Create a Pool of Worker Threads
We can use the ThreadPoolExecutor to speed up the process of scanning multiple ports on a host.
The ThreadPoolExecutor class is provided as part of the concurrent.futures module for easily running concurrent tasks.
The ThreadPoolExecutor provides a pool of worker threads, which is different from the ProcessPoolExecutor that provides a pool of worker processes.
Generally, ThreadPoolExecutor should be used for concurrent IO-bound tasks, like downloading URLs, and the ProcessPoolExecutor should be used for concurrent CPU-bound tasks, like calculating.
The ThreadPoolExecutor was designed to be easy and straightforward to use. It is like the “automatic mode” for Python threads.
- Create the thread pool by calling ThreadPoolExecutor().
- Submit tasks and get futures by calling submit().
- Wait and get results as tasks complete by calling as_completed().
- Shut down the thread pool by calling shutdown().
1. Create the Thread Pool
First, a ThreadPoolExecutor instance must be created. By default, in Python 3.8 and later, the pool has at most the number of CPUs plus four worker threads, capped at 32 (versions before 3.8 used five times the number of CPUs). This is good for most purposes.
    ...
    # create a thread pool with the default number of worker threads
    executor = ThreadPoolExecutor()
You can run tens to hundreds of concurrent IO-bound threads per CPU, although perhaps not thousands or tens of thousands. You can specify the number of threads to create in the pool via the max_workers argument; for example:
    ...
    # create a thread pool with 10 worker threads
    executor = ThreadPoolExecutor(max_workers=10)
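To see what max_workers means in practice, the sketch below submits more tasks than there are workers and records which thread ran each one. The worker_name() and distinct_workers() functions are hypothetical names for illustration, and the short sleep stands in for a blocking IO call.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
from time import sleep


def worker_name(_):
    """Simulate a short blocking IO call and report the thread that ran it."""
    sleep(0.01)
    return current_thread().name


def distinct_workers(n_tasks, n_workers):
    """Run n_tasks in a pool of n_workers threads; return the thread names used."""
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        return set(executor.map(worker_name, range(n_tasks)))


# 20 tasks share at most 4 worker threads
names = distinct_workers(n_tasks=20, n_workers=4)
print(len(names), sorted(names))
```

The number of distinct thread names never exceeds max_workers, because the pool reuses its threads for queued tasks instead of starting a new thread per task.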
2. Submit Tasks to the Thread Pool
Once the pool is created, we can send tasks into it to be completed using the submit() function.
This function takes the name of the function to call, followed by any arguments for it, and returns a Future object.
The Future object is a promise to return the results from the task (if any) and provides a way to determine if a specific task has been completed or not.
    ...
    # submit a task
    future = executor.submit(task, arg1, arg2, ...)
The return from a function executed by the thread pool can be accessed via the result() function on the Future object. It will wait until the result is available, if needed, or return immediately if the result is available.
For example:
    ...
    # get the result from a future
    result = future.result()
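A short sketch may make the behavior of a Future more concrete. It assumes two hypothetical tasks, slow_square() and always_fails(), and shows that result() blocks until the value is ready, that done() reports completion, and that an exception raised inside a task is stored on the Future and re-raised by result().

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep


def slow_square(x):
    """Hypothetical task that takes a moment to finish."""
    sleep(0.1)
    return x * x


def always_fails():
    """Hypothetical task that raises inside the worker thread."""
    raise ValueError('simulated task failure')


with ThreadPoolExecutor(max_workers=2) as executor:
    ok = executor.submit(slow_square, 7)
    bad = executor.submit(always_fails)
    # result() blocks until the task has finished
    value = ok.result()
    # exception() returns the stored exception instead of raising it
    error = bad.exception()
    print(value, ok.done(), repr(error))
```

Calling bad.result() here would raise the ValueError in the calling thread, which is why result() calls are often wrapped in a try-except block.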
3. Get Results as Tasks Complete
The beauty of performing tasks concurrently is that we can get results as they become available, rather than waiting for tasks to be completed in the order they were submitted.
The concurrent.futures module provides an as_completed() function that we can use to get results for tasks as they are completed, just like its name suggests.
We can call the function and provide it a list of future objects created by calling submit() and it will return future objects as they are completed in whatever order.
For example, we can use a list comprehension to submit the tasks and create the list of future objects:
    ...
    # submit all tasks into the thread pool and create a list of futures
    futures = [executor.submit(task, item) for item in items]
Then get results for tasks as they complete in a for loop:
    ...
    # iterate over all submitted tasks and get results as they are available
    for future in as_completed(futures):
        # get the result
        result = future.result()
        # do something with the result...
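Because as_completed() yields futures in completion order, we often need a way to tell which input each result belongs to. One common approach, sketched below, is to keep a dictionary that maps each Future back to its input. The slow_double() task and run_all() function are hypothetical names for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep


def slow_double(x):
    """Hypothetical task: larger inputs finish sooner, so completion order differs."""
    sleep(0.05 * (5 - x))
    return x * 2


def run_all(items):
    """Return {item: result}, collecting results in whatever order tasks finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(items)) as executor:
        # map each future back to the input that produced it
        future_to_item = {executor.submit(slow_double, item): item for item in items}
        for future in as_completed(future_to_item):
            item = future_to_item[future]
            results[item] = future.result()
    return results


print(run_all([1, 2, 3, 4]))
```

Passing the dictionary to as_completed() works because iterating a dictionary yields its keys, which are the Future objects.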
4. Shut Down the Thread Pool
Once all tasks are completed, we can close down the thread pool which will release each thread and any resources it may hold (e.g. the stack space).
    ...
    # shutdown the thread pool
    executor.shutdown()
An easier way to use the thread pool is via the context manager (the “with” keyword), which ensures it is closed automatically once we are finished with it.
    ...
    # create a thread pool
    with ThreadPoolExecutor(max_workers=10) as executor:
        # submit tasks
        futures = [executor.submit(task, item) for item in items]
        # get results as they are available
        for future in as_completed(futures):
            # get the result
            result = future.result()
            # do something with the result...
Now that we are familiar with ThreadPoolExecutor and how to use it, let’s look at how we can adapt our program for port scanning to make use of it.
Scan Ports Concurrently
The program for port scanning a server can be adapted to use the ThreadPoolExecutor with very little change.
The test_port_number() function was already called separately for each port. This can be performed in a separate thread so each port is tested concurrently.
We want to report port numbers in numerical order. This can be achieved by submitting the tasks to the thread pool using the map() function then iterating the True/False results returned for each port number.
Firstly, we can create the thread pool with one thread per port to be tested.
    ...
    # create the thread pool
    with ThreadPoolExecutor(len(ports)) as executor:
        # ...
Next, we can call map() and apply the test_port_number() function to the host and each port number. To achieve this, we must create a parallel iterable of the host name that can be passed to test_port_number() along with each port number.
    ...
    # dispatch all tasks
    results = executor.map(test_port_number, [host]*len(ports), ports)
We can then iterate the results along with the port numbers at the same time.
This can be achieved using the zip() function and passing it the iterable of port numbers and results, and it will yield one of each per iteration.
    ...
    # report results in order
    for port,is_open in zip(ports,results):
        if is_open:
            print(f'> {host}:{port} open')
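As an aside, building a list with one copy of the host per port is not the only way to supply the fixed argument. Two alternatives from the standard library are itertools.repeat() and functools.partial(), sketched below. The fake_port_check() function is a hypothetical stand-in for test_port_number() that pretends only ports 80 and 443 are open, so the sketch runs without any network access.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import repeat


def fake_port_check(host, port):
    """Stand-in for test_port_number(): pretends only 80 and 443 are open."""
    return port in (80, 443)


def scan_with_repeat(host, ports):
    """Pair every port with the same host using an endless repeat() iterator."""
    with ThreadPoolExecutor(max_workers=8) as executor:
        # map() stops when the shortest iterable (ports) is exhausted
        results = executor.map(fake_port_check, repeat(host), ports)
        return [port for port, is_open in zip(ports, results) if is_open]


def scan_with_partial(host, ports):
    """Fix the host argument up front so map() only needs the ports."""
    check = partial(fake_port_check, host)
    with ThreadPoolExecutor(max_workers=8) as executor:
        results = executor.map(check, ports)
        return [port for port, is_open in zip(ports, results) if is_open]


print(scan_with_repeat('example.test', range(1024)))
print(scan_with_partial('example.test', range(1024)))
```

Both versions assume that ports can be iterated more than once, as a range or a list can, because it is passed to map() and then again to zip().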
Tying this together, the complete example is listed below.
    # SuperFastPython.com
    # scan a range of port numbers on a host concurrently
    from socket import AF_INET
    from socket import SOCK_STREAM
    from socket import socket
    from concurrent.futures import ThreadPoolExecutor

    # returns True if a connection can be made, False otherwise
    def test_port_number(host, port):
        # create and configure the socket
        with socket(AF_INET, SOCK_STREAM) as sock:
            # set a timeout of a few seconds
            sock.settimeout(3)
            # connecting may fail
            try:
                # attempt to connect
                sock.connect((host, port))
                # a successful connection was made
                return True
            except OSError:
                # ignore the failure
                return False

    # scan port numbers on a host
    def port_scan(host, ports):
        print(f'Scanning {host}...')
        # create the thread pool
        with ThreadPoolExecutor(len(ports)) as executor:
            # dispatch all tasks
            results = executor.map(test_port_number, [host]*len(ports), ports)
            # report results in order
            for port,is_open in zip(ports,results):
                if is_open:
                    print(f'> {host}:{port} open')

    # define host and port numbers to scan
    HOST = 'python.org'
    PORTS = range(1024)
    # test the ports
    port_scan(HOST, PORTS)
Running the program attempts to open a socket connection for all ports in the range 0 to 1023 and reports ports 80 and 443 as open, as before.
In this case, the program is dramatically faster.
On my system, it completed in about 3.1 seconds, compared to the 235.8 seconds for the serial case, which is about 76 times faster.
    Scanning python.org...
    > python.org:80 open
    > python.org:443 open
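One thread per port works for 1,024 ports, but it may not scale to the full range of 65,536 ports. A possible variation, sketched below, caps the pool size and lets the workers share the ports between them. The scan_ports() function, its check parameter, and the default cap of 100 workers are all assumptions made for illustration. The check argument is any function with the same signature as test_port_number().

```python
from concurrent.futures import ThreadPoolExecutor


def scan_ports(host, ports, check, max_workers=100):
    """Hypothetical variant: return open ports, in the order given, using a capped pool."""
    ports = list(ports)
    # never create more threads than there are ports, and always at least one
    n_workers = max(1, min(max_workers, len(ports)))
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        results = executor.map(lambda port: check(host, port), ports)
        return [port for port, is_open in zip(ports, results) if is_open]


def fake_check(host, port):
    """Stand-in check that pretends only 80 and 443 are open (no network access)."""
    return port in (80, 443)


print(scan_ports('example.test', range(1024), fake_check))
```

With a smaller pool, ports that time out hold a worker for the full timeout, so the best cap is a trade-off between speed and the number of threads and sockets open at once.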
Takeaways
In this tutorial, you discovered how to develop a multithreaded port scanner in Python.
- How to open a socket connection to each port sequentially and how slow it can be.
- How to use the ThreadPoolExecutor to manage a pool of worker threads.
- How to update the code to connect to multiple ports at the same time and dramatically accelerate the process.
Do you have any questions?
Leave your question in a comment below and I will reply fast with my best advice.
Photo by Simon Connellan on Unsplash
Dave Hafiddave says
brother very nice tutorial legit but, i try to create multiple host using with open its not working any idea how to create with multiple host…?
thanks
dave
Jason Brownlee says
Try passing a list of hosts to the function, then issue all ports for all hosts to the pool, then iterate and report results.
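A minimal sketch of this suggestion might look as follows. The scan_hosts() function, its check parameter, and the cap of 100 workers are hypothetical choices for illustration. The check argument is any function with the same signature as test_port_number(), and the example uses a stand-in so that it runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def scan_hosts(hosts, ports, check, max_workers=100):
    """Hypothetical helper: return {host: [open ports]} for every host."""
    hosts = list(hosts)
    # every (host, port) pair becomes one task
    targets = list(product(hosts, ports))
    open_ports = {host: [] for host in hosts}
    if not targets:
        return open_ports
    with ThreadPoolExecutor(max_workers=min(max_workers, len(targets))) as executor:
        results = executor.map(lambda target: check(*target), targets)
        # results come back in the same order as the targets
        for (host, port), is_open in zip(targets, results):
            if is_open:
                open_ports[host].append(port)
    return open_ports


def fake_check(host, port):
    """Stand-in check: pretends port 80 is open everywhere and 22 only on host 'a'."""
    return port == 80 or (host == 'a' and port == 22)


print(scan_hosts(['a', 'b'], range(100), fake_check))
```

To read the hosts from a file, one option is to build the list first, for example with one host name per line, and pass that list as the hosts argument.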
Bob Wachunas says
your superfast port scanner is great. I mean, it’s faster than threader3000. the tutorial is nice too, great details.
Jason Brownlee says
Thank you!