Multi-threading vs Multi-processing programming in Python

INTRODUCTION

This post attempts to explain the difference between multi-threading and multi-processing in simple terms, and uses visualization to clarify the distinction.

MULTI-THREADING PROGRAMMING

Multi-threaded programming lets a program make progress on several tasks concurrently by splitting its work across multiple threads inside a single process. In languages whose threads run in parallel, this can spread work across several CPU cores. In standard CPython, however, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so the gains come mainly from overlapping waiting time, such as network or disk I/O. Multi-threaded programming can also be challenging: threads share memory, so access to shared resources must be synchronized to avoid conflicts and ensure correct behaviour. Nonetheless, with proper implementation, multi-threading can be an effective way to improve the efficiency and responsiveness of a program.
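
To make the point about shared resources concrete, here is a minimal sketch of four threads updating one shared number. The names `counter`, `counter_lock` and `add_many` are made up for this illustration; the lock is what keeps the updates from stepping on each other.

```python
import threading

counter = 0                      # shared state touched by every thread
counter_lock = threading.Lock()  # guards every update to `counter`

def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        with counter_lock:       # only one thread may update at a time
            counter += 1

# four threads, each adding 10,000 to the same counter
threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

Without the lock, the read-modify-write in `counter += 1` is not guaranteed to be atomic, so some updates could be lost.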

MULTI-PROCESSING PROGRAMMING

Multiprocessing is a technique that uses multiple processors or cores to perform a task. It is particularly useful for tasks that need a lot of computing power or for applications that must run several tasks at the same time. In Python, each process runs its own interpreter with its own memory space, so the processes can execute in parallel on separate cores, and dividing the work among them can significantly reduce the run time of a program. However, it also requires careful planning: data has to be passed between processes explicitly, and the processes must not interfere with each other, so that the overall system remains stable and reliable.

SCENARIO

Do you remember Peter’s morning routine from our previous post? If not, click on the following link to get to know Peter better: https://semfionetworks.com/blog/sequential-vs-asynchronous-programming-in-python/

We want to run the same program in a multi-threaded style. A thread can be defined as a flow of execution. Multi-threading can give the illusion that all tasks are performed simultaneously, but the threads actually take turns: execution switches between them, and while one thread is waiting, another can run. Consider Peter at the breakfast table now: he takes a bite of breakfast, drinks a sip of coffee, checks one notification, and this process repeats.

# Peter's morning
#wake-up; then, freshen-up
import time
import threading
 
def eat_breakfast():
    time.sleep(4)
    print("A delicious omelette")
 
def drink_coffee():
    time.sleep(3)
    print("Wow, it was a nice hot black coffee")
 
def scroll_instagram():
    time.sleep(5)
    print("You're all caught up")
 
# time.perf_counter() returns a high-resolution clock value, in seconds
start = time.perf_counter()

# define one thread per task
a = threading.Thread(target=eat_breakfast)
b = threading.Thread(target=drink_coffee)
c = threading.Thread(target=scroll_instagram)

# start the threads
a.start()
b.start()
c.start()

# wait for every thread to finish before stopping the clock
a.join()
b.join()
c.join()

end = time.perf_counter()
print("it's done in:", int(end - start), "seconds")
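
As a possible alternative to creating, starting and joining each thread by hand, the same routine can be sketched with `concurrent.futures.ThreadPoolExecutor` from the standard library. The `routine` dictionary and `do_task` function are hypothetical names for this illustration, and the delays are shortened so the sketch finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, shortened delays (in seconds) for Peter's three tasks.
routine = {"breakfast": 0.4, "coffee": 0.3, "instagram": 0.5}

def do_task(name, delay):
    """Simulate a task that mostly waits, then report which task finished."""
    time.sleep(delay)
    return name

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(routine)) as pool:
    # submit() hands each task to a worker thread and returns a Future
    futures = [pool.submit(do_task, name, delay) for name, delay in routine.items()]
    # result() blocks until that particular task is done
    finished = [f.result() for f in futures]
elapsed = time.perf_counter() - start

print(finished)
print(f"done in about {elapsed:.1f} seconds")
```

Because the three waits overlap, the total should be close to the longest single delay rather than the sum of all three.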

COUNTING WITH MULTI-THREADING AND MULTI-PROCESSING

Let’s count some numbers! First, let’s define a simple function to count from zero to one billion and measure the processing time.

import time
 
def counter(number):
    # count up from zero, one step at a time, until `number` is reached
    init_amount = 0
    while init_amount < number:
        init_amount += 1

start = time.perf_counter()
counter(1000000000)  # one billion
end = time.perf_counter()
print("it's done in:", int(end - start), "Seconds")

To convert this program into multi-processing Python code, we first need to import the multiprocessing module. Next, we need to divide the counting task among different CPU cores. Instead of counting from zero to one billion in one go, we can create four separate counting tasks, each counting from zero to two hundred and fifty million, and run each task in its own process so that the operating system can schedule them on separate cores.

from multiprocessing import Process, cpu_count
import time
 
def counter(number):
    init_amount = 0
    while init_amount < number:
        init_amount += 1
 
def main():
    start = time.perf_counter()
    a = Process(target=counter, args=(250000000,))
    b = Process(target=counter, args=(250000000,))
    c = Process(target=counter, args=(250000000,))
    d = Process(target=counter, args=(250000000,))
 
 
    a.start()
    b.start()
    c.start()
    d.start()
 
    a.join()
    b.join()
    c.join()
    d.join()
 
    # stop the clock before printing so the output does not affect the measurement
    end = time.perf_counter()
    print("PROCESSING")
    print("done in: ", int(end - start), 'Seconds')
    print("Your CPU has:", cpu_count(), "logical cores")
 
 
if __name__ == '__main__':
    main()
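
The four equal shares of 250000000 are hard-coded in the program above. One way to compute such shares for any total and any number of processes is sketched below; `split_work` is a hypothetical helper, not part of the multiprocessing module.

```python
import os

def split_work(total, parts):
    """Split `total` counting steps into `parts` chunk sizes that add up to `total`."""
    base, extra = divmod(total, parts)
    # the first `extra` chunks take one additional step each
    return [base + 1 if i < extra else base for i in range(parts)]

workers = os.cpu_count() or 1   # os.cpu_count() can return None
print(split_work(1_000_000_000, 4))
print(split_work(10, 3))
print(len(split_work(1_000_000_000, workers)), "chunks for this machine")
```

Each chunk size could then be passed as the `args` of one `Process`, in place of the repeated literal.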

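Note: with multiprocessing, the code that creates the child processes must run under the main guard. On platforms where child processes are started with the spawn method (the default on Windows and macOS), each child re-imports the main module, and without the guard it would try to start child processes of its own:

if __name__ == '__main__':
    main()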
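
To check how child processes are started on a given machine, the standard library exposes the start method. This small sketch only reads the setting and starts no processes; the value printed depends on the platform and Python version.

```python
import multiprocessing

# the start method this interpreter uses by default ("fork", "spawn" or "forkserver")
method = multiprocessing.get_start_method()
# every start method supported on this platform
available = multiprocessing.get_all_start_methods()

print("start method:", method)
print("available:", available)
```

With "spawn" or "forkserver", the main module is imported again in each child, which is why the guard matters.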

I bet the difference in processing time is eye-catching. In the previous scenario, when we changed Peter’s routine from sequential to multi-threading, the total time dropped because the waiting periods overlapped. Similarly, when we move the counting from a single process to four processes, we can expect a shorter run time on a machine with at least four cores, since each process can count on its own core.

What do you think will happen if we run the counting program in a multi-threaded style? Before running the snippet below, take a minute or two to analyze each format and take a guess. Then run the program and compare your guess with the result.

from threading import Thread
import time
 
def counter(number):
    init_amount = 0
    while init_amount < number:
        init_amount += 1
 
print("Counting...")        
start = time.perf_counter()
a = Thread(target=counter, args=(250000000,))
b = Thread(target=counter, args=(250000000,))
c = Thread(target=counter, args=(250000000,))
d = Thread(target=counter, args=(250000000,))
 
 
a.start()
b.start()
c.start()
d.start()
 
a.join()
b.join()
c.join()
d.join()
 
end = time.perf_counter()
print("done in: ", int(end - start), 'Seconds')

What was your guess?

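Yes, surprisingly, for this program the multi-threaded processing time is roughly equal to the sequential time. Counting is CPU-bound, and the GIL lets only one thread execute Python bytecode at a time, so the four threads take turns at the counting instead of running in parallel. The following GIF file visualizes each style of programming for this case.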
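
For a quicker side-by-side check, the two styles can be timed in one script. This is only a sketch: `TOTAL` is a hypothetical, much smaller target than one billion, `timed`, `run_sequential` and `run_threaded` are helper names made up here, and the timings will vary by machine and Python build.

```python
import time
from threading import Thread

def counter(number):
    init_amount = 0
    while init_amount < number:
        init_amount += 1

def timed(func):
    """Run `func` once and return how long it took, in seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start

def run_sequential(total):
    counter(total)

def run_threaded(total, parts=4):
    # same total amount of counting, split evenly across `parts` threads
    threads = [Thread(target=counter, args=(total // parts,)) for _ in range(parts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

TOTAL = 2_000_000   # hypothetical, reduced so the comparison runs quickly
sequential_time = timed(lambda: run_sequential(TOTAL))
threaded_time = timed(lambda: run_threaded(TOTAL))
print(f"sequential: {sequential_time:.2f} s, threaded: {threaded_time:.2f} s")
```

On a standard CPython build the two numbers should land in the same range; a free-threaded build, which runs without the GIL, may behave differently.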

TO SUM UP

| Concurrency Type | Features | Use Criteria | Metaphor |
| --- | --- | --- | --- |
| Multiprocessing | Multiple processes, high CPU utilization. | CPU-bound | We have ten kitchens, ten chefs, ten dishes to cook. |
| Threading | Single process, multiple threads, pre-emptive multitasking, OS decides task switching. | Fast I/O-bound | We have one kitchen, ten chefs, ten dishes to cook. The kitchen is crowded when the ten chefs are present together. |
| AsyncIO | Single process, single thread, cooperative multitasking, tasks cooperatively decide switching. | Slow I/O-bound | We have one kitchen, one chef, ten dishes to cook. |
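
The "one kitchen, one chef" row can be sketched with asyncio, where a task gives up control only at an `await`. The `cook` and `kitchen` coroutines, the dish names and the delays are all hypothetical and chosen just for this illustration.

```python
import asyncio

async def cook(dish, delay, log):
    """One dish: `await` hands control back to the event loop while it 'cooks'."""
    log.append(f"start {dish}")
    await asyncio.sleep(delay)
    log.append(f"done {dish}")

async def kitchen():
    log = []
    # both dishes are in progress at once, on a single thread
    await asyncio.gather(
        cook("omelette", 0.2, log),
        cook("coffee", 0.1, log),
    )
    return log

events = asyncio.run(kitchen())
print(events)
# → ['start omelette', 'start coffee', 'done coffee', 'done omelette']
```

The single chef starts the omelette, switches to the coffee while the omelette waits, and finishes whichever dish is ready first.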

 

In Python, both multi-threading and multi-processing are used to improve performance. Multi-threading allows multiple threads to execute concurrently within the same process. This is useful when there is a lot of I/O bound work, such as waiting for user input or network communication. On the other hand, multi-processing allows multiple processes to execute in parallel on different CPU cores. This is useful when there is a lot of CPU bound work, such as performing complex calculations.

While both approaches have their advantages, multi-processing can be more efficient for CPU-bound work because of the Global Interpreter Lock (GIL) in Python, which lets only one thread execute Python bytecode at a time and so limits the performance gains of multi-threading. Additionally, multi-processing can be more resilient to errors, since a crash in one process generally does not bring down the others. However, multi-threading is lighter-weight: threads are cheaper to create and share memory directly, whereas processes need more memory and must exchange data through inter-process communication. Ultimately, the choice between multi-threading and multi-processing will depend on the specific requirements of the program and the resources available.

Post by Amin Sedighfar
