Last Updated on October 29, 2022
You can map a function that takes multiple arguments to tasks in the ThreadPool via the starmap() method.
In this tutorial you will discover how to issue tasks that take multiple arguments to the ThreadPool in Python.
Let’s get started.
Problem with ThreadPool map()
The multiprocessing.pool.ThreadPool in Python provides a pool of reusable threads for executing ad hoc tasks.
A thread pool object which controls a pool of worker threads to which jobs can be submitted.
— multiprocessing — Process-based parallelism
The ThreadPool class extends the Pool class. The Pool class provides a pool of worker processes for process-based concurrency.
Although the ThreadPool class is in the multiprocessing module it offers thread-based concurrency and is best suited to IO-bound tasks, such as reading or writing from sockets or files.
A ThreadPool can be configured when it is created, which will prepare the new threads.
We can issue one-off tasks to the ThreadPool using methods such as apply() or we can apply the same function to an iterable of items using methods such as map().
Results for issued tasks can then be retrieved synchronously, or we can retrieve the result of tasks later by using asynchronous versions of the methods such as apply_async() and map_async().
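The four methods mentioned above can be seen side by side in a minimal sketch. The double() function and the pool size of four are assumptions chosen only for illustration.

```python
from multiprocessing.pool import ThreadPool


# hypothetical task function used only for illustration
def double(x):
    return x * 2


def demo():
    # a pool with four worker threads (an arbitrary size)
    with ThreadPool(processes=4) as pool:
        # one-off task, blocks until the result is available
        single = pool.apply(double, args=(21,))
        # same function applied to each item, blocks until all are done
        many = pool.map(double, [1, 2, 3])
        # asynchronous versions return an AsyncResult immediately
        pending_single = pool.apply_async(double, args=(5,))
        pending_many = pool.map_async(double, [4, 5])
        # get() blocks until the result is ready
        return single, many, pending_single.get(), pending_many.get()


if __name__ == '__main__':
    print(demo())
```

The synchronous calls return their values directly, whereas the asynchronous calls hand back an object that is queried later with get().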
The ThreadPool provides a version of the map() method where the target function is called for each item in the provided iterable concurrently, using the worker threads in the pool.
The problem with the ThreadPool map() method is that it only takes one iterable of items, allowing only a single argument to the target task function. This is different from the built-in map() function, which accepts several iterables.
Being limited to task functions that take only one argument is a severe restriction.
The ThreadPool starmap() method provides a way to work around this limitation.
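The contrast with the built-in map() function can be made concrete with a short sketch. The add() and square() functions are hypothetical helpers, not part of the original example.

```python
from multiprocessing.pool import ThreadPool


# hypothetical two-argument function
def add(a, b):
    return a + b


# hypothetical one-argument function
def square(x):
    return x * x


def builtin_map_two_iterables():
    # the built-in map() pairs up items from several iterables
    return list(map(add, [1, 2, 3], [10, 20, 30]))


def pool_map_one_iterable():
    # ThreadPool.map() accepts a single iterable, so each call gets one argument
    with ThreadPool(2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == '__main__':
    print(builtin_map_two_iterables())
    print(pool_map_one_iterable())
```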
How to Use ThreadPool starmap()
The ThreadPool provides a version of map() that permits multiple arguments to the target task function via the starmap() method.
The starmap() method takes the name of a function to apply and an iterable.
It will then convert the provided iterable into a list and issue one task for each item in the iterable.
Importantly, each item in the iterable provided to the starmap() method must itself be an iterable containing the arguments to provide to the target task function.
Like map() except that the elements of the iterable are expected to be iterables that are unpacked as arguments. Hence an iterable of [(1,2), (3, 4)] results in [func(1,2), func(3,4)].
— multiprocessing — Process-based parallelism
This allows the target task function to receive multiple arguments.
For example, we may have an iterable in which each item in the iterable is an iterable of arguments for each function call.
We might have a target task function that takes two arguments.
```python
# target task function
def task(arg1, arg2):
    # ...
    pass
```
We may then define an iterable that contains 3 items and will result in 3 calls to the target task function.
```python
...
# define an iterable
items = [(1, 2), (3, 4), (5, 6)]
```
Each item in the iterable is a tuple that contains two items, for the two arguments to the target task function.
We can issue this to the ThreadPool using the starmap() method and traverse the iterable of return values.
```python
...
# issue tasks and iterate the results from the issued tasks
for result in pool.starmap(task, items):
    # ...
    pass
```
This will result in three tasks in the ThreadPool, each calling the target task() function with two arguments:
- task(1,2)
- task(3,4)
- task(5,6)
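The three calls listed above can be checked with a small runnable sketch. The body of task() here (multiplying the two arguments) is an assumption made only so that there is a result to compare.

```python
from multiprocessing.pool import ThreadPool


# target task function with a hypothetical body
def task(arg1, arg2):
    return arg1 * arg2


def run():
    items = [(1, 2), (3, 4), (5, 6)]
    with ThreadPool() as pool:
        # each tuple is unpacked into the two arguments of task()
        pooled = pool.starmap(task, items)
    # the same calls made directly in the calling thread
    direct = [task(*args) for args in items]
    return pooled, direct


if __name__ == '__main__':
    pooled, direct = run()
    print(pooled == direct)
```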
Like the map() method, the starmap() method allows us to issue tasks in chunks to the ThreadPool. That is, we can group a fixed number of items from the input iterable and issue them as one task to be executed by a worker thread.
This can make completing a large number of tasks in a very long iterable more efficient, as arguments and return values are passed between the caller and the worker threads in batches, with less overhead per item.
This can be achieved via the “chunksize” argument to starmap().
The chunksize argument is the same as the one used by the map() method. For very long iterables using a large value for chunksize can make the job complete much faster than using the default value of 1.
— multiprocessing — Process-based parallelism
For example:
```python
...
# issue tasks in chunks and iterate the results
for result in pool.starmap(task, items, chunksize=10):
    # ...
    pass
```
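A quick way to confirm that chunksize affects only how the work is batched, and not the results, is sketched below. The task body, the 100 items, and the pool size of four are all assumptions for illustration.

```python
from multiprocessing.pool import ThreadPool


# hypothetical task function
def task(arg1, arg2):
    return arg1 + arg2


def run(chunksize):
    items = [(i, i * 10) for i in range(100)]
    with ThreadPool(4) as pool:
        # the same items are issued, only the batching differs
        return pool.starmap(task, items, chunksize=chunksize)


if __name__ == '__main__':
    print(run(1) == run(10))
```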
Next, let’s take a closer look at how the starmap() method compares to other methods on the ThreadPool.
Difference Between starmap() and map()
How does the starmap() method compare to the map() method for issuing tasks to the ThreadPool?
Both starmap() and map() may be used to issue tasks that apply a function to all items in an iterable via the ThreadPool.
The key difference between the starmap() method and the map() method is that starmap() supports a target function with more than one argument, whereas the map() method supports target functions with only one argument.
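To make the difference concrete, the sketch below issues the same work both ways. With map(), each tuple arrives as a single argument and has to be unpacked by a wrapper; with starmap(), the tuple is unpacked for us. The power() and power_from_tuple() functions are hypothetical.

```python
from multiprocessing.pool import ThreadPool


# hypothetical two-argument function
def power(base, exponent):
    return base ** exponent


# wrapper needed for map(), which passes each tuple as one argument
def power_from_tuple(args):
    base, exponent = args
    return power(base, exponent)


def compare():
    items = [(2, 3), (3, 2), (10, 0)]
    with ThreadPool(2) as pool:
        via_map = pool.map(power_from_tuple, items)
        via_starmap = pool.starmap(power, items)
    return via_map, via_starmap


if __name__ == '__main__':
    print(compare())
```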
Difference Between starmap() and starmap_async()
How does the starmap() method compare to the starmap_async() method for issuing tasks to the ThreadPool?
Both starmap() and starmap_async() may be used to issue tasks that call a function in the ThreadPool with more than one argument.
The following summarizes the key differences between these two methods:
- The starmap() method blocks, whereas the starmap_async() method does not block.
- The starmap() method returns a list of return values from the target function, whereas the starmap_async() method returns an AsyncResult.
- The starmap() method does not support callback functions, whereas the starmap_async() method can execute callback functions on return values and errors.
The starmap() method should be used for issuing target task functions to the ThreadPool where the caller can or must block until all function calls are complete.
The starmap_async() method should be used for issuing target task functions to the ThreadPool where the caller cannot or must not block while the task is executing.
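The asynchronous variant can be sketched as follows. The task body, the short sleep, and the on_done() callback are assumptions for illustration; the callback receives the full list of results once all tasks have finished.

```python
from multiprocessing.pool import ThreadPool
from time import sleep


# hypothetical task function that blocks briefly
def task(identifier, value):
    sleep(value)
    return (identifier, value)


def run():
    collected = []

    # callback invoked once with the list of all results
    def on_done(results):
        collected.append(results)

    items = [(i, 0.05) for i in range(4)]
    with ThreadPool(4) as pool:
        # returns an AsyncResult immediately, without blocking
        async_result = pool.starmap_async(task, items, callback=on_done)
        # the caller could do other work here
        # get() blocks until all tasks are complete
        results = async_result.get()
    return results, collected


if __name__ == '__main__':
    print(run())
```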
Now that we know how to use the starmap() method to execute tasks in the ThreadPool, let’s look at some worked examples.
Example of ThreadPool starmap()
We can explore how to use the starmap() method with the ThreadPool.
In this example, we can define a target task function that takes two arguments, reports the values then returns the values that were passed in. We can then call this function for each integer between 0 and 9 using the ThreadPool starmap().
This will apply the function to each integer concurrently, using the worker threads in the pool.
Firstly, we can define the target task function.
The function takes an integer identifier and floating point value. It reports the values, then blocks for a fraction of a second to simulate computational effort, then returns the values that were provided as arguments.
The task() function below implements this.
```python
# task executed in a worker thread
def task(identifier, value):
    # report a message
    print(f'Task {identifier} executing with {value}')
    # block for a moment
    sleep(value)
    # return the generated value
    return (identifier, value)
```
We can then create and configure a ThreadPool.
We will use the context manager interface to ensure the pool is shutdown automatically once we are finished with it.
```python
...
# create and configure the thread pool
with ThreadPool() as pool:
    # ...
    pass
```
We can then define an iterable that provides the arguments to the task() function. The iterable will be a list of tuples, where each tuple will contain an integer value and randomly generated floating point value between 0 and 1.
This can be prepared in a list comprehension.
```python
...
# prepare arguments
items = [(i, random()) for i in range(10)]
```
We can then call the starmap() method on the thread pool to apply our task() function to each tuple of arguments in the prepared list.
This returns a list of the results returned from the task() function, in the order of the arguments in the input iterable rather than the order in which the function calls complete. We will iterate over the results and report each in turn.
This can all be achieved in a for-loop.
```python
...
# execute tasks and report results in order
for result in pool.starmap(task, items):
    print(f'Got result: {result}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of using starmap() with the thread pool
from random import random
from time import sleep
from multiprocessing.pool import ThreadPool

# task executed in a worker thread
def task(identifier, value):
    # report a message
    print(f'Task {identifier} executing with {value}')
    # block for a moment
    sleep(value)
    # return the generated value
    return (identifier, value)

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    with ThreadPool() as pool:
        # prepare arguments
        items = [(i, random()) for i in range(10)]
        # execute tasks and report results in order
        for result in pool.starmap(task, items):
            print(f'Got result: {result}')
    # thread pool is closed automatically
```
Running the example first creates the ThreadPool with a default configuration.
It will have one worker thread for each logical CPU in your system.
The list of function arguments is prepared, then the starmap() method is called with the target function and the list of arguments.
This issues ten calls to the task() function, one for each tuple of arguments. A list is returned with the result for each function call, in the same order as the arguments.
Each call to the task function reports the arguments as a message, blocks, then returns a tuple of the arguments.
The main thread iterates over the values returned from the calls to the task() function and reports the values, matching those reported in each worker thread.
Importantly, all task() function calls are issued and executed before the list of results is returned. The caller cannot iterate over results as they are completed.
```
Task 0 executing with 0.6060584937888399
Task 1 executing with 0.6441783549779412
Task 2 executing with 0.8652443674754822
Task 3 executing with 0.6742852940641476
Task 4 executing with 0.015188976542785282
Task 5 executing with 0.32070199934525667
Task 6 executing with 0.8702579991525552
Task 7 executing with 0.909027807263158
Task 8 executing with 0.3653024671594146
Task 9 executing with 0.5517635063782863
Got result: (0, 0.6060584937888399)
Got result: (1, 0.6441783549779412)
Got result: (2, 0.8652443674754822)
Got result: (3, 0.6742852940641476)
Got result: (4, 0.015188976542785282)
Got result: (5, 0.32070199934525667)
Got result: (6, 0.8702579991525552)
Got result: (7, 0.909027807263158)
Got result: (8, 0.3653024671594146)
Got result: (9, 0.5517635063782863)
```
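The ordering of results can be examined with a small sketch that records when each task finishes. The delays are assumptions chosen so that later items are likely to finish first; the list returned by starmap() still follows the order of the input items.

```python
from multiprocessing.pool import ThreadPool
from threading import Lock
from time import sleep


def run():
    finished = []  # identifiers in the order the tasks completed
    lock = Lock()

    # hypothetical task that records its own completion
    def task(identifier, delay):
        sleep(delay)
        with lock:
            finished.append(identifier)
        return identifier

    # earlier items sleep longer than later items
    items = [(0, 0.3), (1, 0.2), (2, 0.1)]
    with ThreadPool(3) as pool:
        results = pool.starmap(task, items)
    return results, finished


if __name__ == '__main__':
    results, finished = run()
    print(f'results: {results}')
    print(f'completion order: {finished}')
```

If results are needed as soon as each task completes, a different method is required, as starmap() only returns once every task is done.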
Next, let’s look at an example where we might call starmap() for a function with no return value.
Example of starmap() with No Return Value
We can explore using the starmap() method to call a target task function that does not have a return value.
This means that we are not interested in the iterable of results returned by the call to starmap() and instead are only interested that all issued tasks get executed.
This can be achieved by updating the previous example so that the task() function does not return a value.
The updated task() function with this change is listed below.
```python
# task executed in a worker thread
def task(identifier, value):
    # report a message
    print(f'Task {identifier} executing with {value}')
    # block for a moment
    sleep(value)
```
Then, in the main thread, we can call starmap() with our task() function and the list of arguments, and not iterate the results.
```python
...
# prepare arguments
items = [(i, random()) for i in range(10)]
# issue tasks to the thread pool and wait for tasks to complete
pool.starmap(task, items)
```
Importantly, the call to starmap() on the ThreadPool will block the main thread until all issued tasks are completed.
Once completed, the call will return and the ThreadPool will be closed by the context manager.
This is a helpful pattern to issue many tasks to the ThreadPool with a single function call.
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of using starmap() in the thread pool with no return values
from random import random
from time import sleep
from multiprocessing.pool import ThreadPool

# task executed in a worker thread
def task(identifier, value):
    # report a message
    print(f'Task {identifier} executing with {value}')
    # block for a moment
    sleep(value)

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    with ThreadPool() as pool:
        # prepare arguments
        items = [(i, random()) for i in range(10)]
        # issue tasks to the thread pool and wait for tasks to complete
        pool.starmap(task, items)
    # thread pool is closed automatically
```
Running the example first creates the ThreadPool with a default configuration.
The list of arguments is prepared, then the starmap() method is called to apply the target function to each tuple in the list. This issues ten tasks into the ThreadPool. A list is returned with the result for each function call (None in every case, as the function returns nothing), but it is ignored here.
Each call to the task() function reports a message, then blocks.
The main thread blocks until the starmap() method returns.
The tasks finish, starmap() returns, then the ThreadPool is closed.
This example again highlights that the call to starmap() blocks until all issued tasks are completed.
```
Task 0 executing with 0.23574918112447885
Task 1 executing with 0.614080352361466
Task 2 executing with 0.9237210178099842
Task 3 executing with 0.0422554532353151
Task 4 executing with 0.8254243293448289
Task 5 executing with 0.9907222951047835
Task 6 executing with 0.4682067818290677
Task 7 executing with 0.7036418914152218
Task 8 executing with 0.20575386373453264
Task 9 executing with 0.16702207772665179
```
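Although the return value of starmap() is ignored in this example, it may help to see what it contains. In the sketch below, a hypothetical task() with no return statement yields one None per issued call.

```python
from multiprocessing.pool import ThreadPool


# hypothetical task with no return statement
def task(identifier, value):
    _ = identifier + value


def run():
    items = [(i, i) for i in range(5)]
    with ThreadPool() as pool:
        # blocks until every task is done, then returns one None per task
        return pool.starmap(task, items)


if __name__ == '__main__':
    print(run())
```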
Next, let’s explore the chunksize argument to the starmap() method.
Example of starmap() with chunksize
The starmap() method will apply a function to each item in an iterable.
If the iterable has a large number of items, it may be inefficient to issue a separate task to the pool for each item.
This is for two reasons:
- One task is created for each item in the iterable to be passed as an argument to the target function.
- The arguments for each task must be passed to a worker thread through the pool's internal queues, and the return value passed back to the calling thread.
A more efficient approach would be to divide the items in the iterable into chunks and issue chunks of items to each worker thread to which the target function can be applied.
This can be achieved with the “chunksize” argument to the starmap() method.
The example below updates the first example to use a chunksize of 2 when issuing tasks into the ThreadPool. With 10 function calls, this results in 5 tasks, each composed of two calls to the target task function.
```python
...
# execute tasks and report results in order
for result in pool.starmap(task, items, chunksize=2):
    print(f'Got result: {result}')
```
Tying this together, the complete example is listed below.
```python
# SuperFastPython.com
# example of using starmap() in the thread pool with chunksize
from random import random
from time import sleep
from multiprocessing.pool import ThreadPool

# task executed in a worker thread
def task(identifier, value):
    # report a message
    print(f'Task {identifier} executing with {value}')
    # block for a moment
    sleep(value)
    # return the generated value
    return (identifier, value)

# protect the entry point
if __name__ == '__main__':
    # create and configure the thread pool
    with ThreadPool() as pool:
        # prepare arguments
        items = [(i, random()) for i in range(10)]
        # execute tasks and report results in order
        for result in pool.starmap(task, items, chunksize=2):
            print(f'Got result: {result}')
    # thread pool is closed automatically
```
Running the example first creates the ThreadPool with a default number of worker threads.
The starmap() method is then called with an iterable of 10 items and a chunksize of 2.
This issues 5 units of work to the ThreadPool, each unit of work composed of 2 calls to the task() function.
Each call to the task function reports a message, blocks, then returns a tuple value.
The main thread iterates over the values returned from the calls to the task() function and reports the values, matching those reported in each worker thread.
```
Task 0 executing with 0.8049569283892813
Task 2 executing with 0.8075512472613714
Task 4 executing with 0.5600682631117035
Task 6 executing with 0.298985657024606
Task 8 executing with 0.5933199379889679
Task 7 executing with 0.7766733652568677
Task 5 executing with 0.20759174230481925
Task 9 executing with 0.06910961363177981
Task 1 executing with 0.6712381296530213
Task 3 executing with 0.8741967167146941
Got result: (0, 0.8049569283892813)
Got result: (1, 0.6712381296530213)
Got result: (2, 0.8075512472613714)
Got result: (3, 0.8741967167146941)
Got result: (4, 0.5600682631117035)
Got result: (5, 0.20759174230481925)
Got result: (6, 0.298985657024606)
Got result: (7, 0.7766733652568677)
Got result: (8, 0.5933199379889679)
Got result: (9, 0.06910961363177981)
```
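The grouping of 10 items into 5 units of work can be pictured with a small standalone sketch. The group_into_chunks() helper is hypothetical and is only an illustration of the idea; it is not the code the ThreadPool uses internally.

```python
from itertools import islice


# hypothetical helper: split an iterable into tuples of at most chunksize items
def group_into_chunks(items, chunksize):
    iterator = iter(items)
    while True:
        chunk = tuple(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def demo():
    # ten tuples of arguments, as in the example above
    items = [(i, i / 10) for i in range(10)]
    # a chunksize of 2 gives five chunks of two argument tuples each
    return items, list(group_into_chunks(items, 2))


if __name__ == '__main__':
    items, chunks = demo()
    for chunk in chunks:
        print(chunk)
```

This grouping is consistent with the output above, where tasks 0, 2, 4, 6 and 8 start first: each is the first call in its chunk, and the second call in a chunk starts only after the first one returns.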
Further Reading
This section provides additional resources that you may find helpful.
Books
- Python ThreadPool Jump-Start, Jason Brownlee (my book!)
- Threading API Interview Questions
- ThreadPool PDF Cheat Sheet
I also recommend specific chapters from the following books:
- Python Cookbook, David Beazley and Brian Jones, 2013.
- See: Chapter 12: Concurrency
- Effective Python, Brett Slatkin, 2019.
- See: Chapter 7: Concurrency and Parallelism
- Python in a Nutshell, Alex Martelli, et al., 2017.
- See: Chapter 14: Threads and Processes
Guides
- Python ThreadPool: The Complete Guide
- Python Multiprocessing Pool: The Complete Guide
- Python ThreadPoolExecutor: The Complete Guide
- Python Threading: The Complete Guide
Takeaways
You now know how to issue tasks to the ThreadPool that take multiple arguments in Python.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Photo by Gene Gallin on Unsplash