In this post, we will discuss the concept of Threads in Python.
A thread is a separate path of execution through a group of statements. For example, when you run a Python program, the PVM (Python Virtual Machine) executes its statements one after the other. It does so on a thread: the PVM relies on threads to execute statements.
This means that whenever you run a Python program, the PVM always runs it on an internal thread, even if you never create one yourself. The program below shows which thread that is.
# Python program to understand the concept of threads
# Import the threading module to work with threads
import threading
# Print the name of the current thread
print("Thread that's running currently:", threading.current_thread().name)
# Check whether the current thread is the main thread
if threading.current_thread() is threading.main_thread():
    print("In this program, the current thread is the main thread!")
else:
    print("In this program, the current thread is not the main thread!")
Output:
Thread that's running currently: MainThread
In this program, the current thread is the main thread!
In the program above, we imported the threading module to work with threads. We called its current_thread() function to get the current thread. Since the result is a Thread object, we read its name attribute to print the name (the older getName() method does the same thing but is deprecated since Python 3.10).
Using an if condition, we also checked whether the current thread is the program's main thread by comparing the results of current_thread() and main_thread(), both from the threading module. As the output shows, the current thread in this program is the main thread.
As mentioned earlier, a thread is a path of execution for a group of statements. Statements can be executed in two ways, as discussed below.
- Single Tasking
- Multitasking
Single Tasking
In single tasking, the processor is given only one task at a time and stays idle for the rest of the time. Consider the diagram shown below. If the first job takes 10 minutes, the processor works on it for 10 minutes and then stays idle until the next job arrives. If the second job takes 7 minutes, the processor works on it for 7 minutes and again stays idle until the next job arrives. The third job in the diagram is handled the same way: the processor runs it for as long as it needs and then goes idle.
Idle time is high in single tasking, so following this approach in your program is not advisable unless a real-world requirement forces it. Printing pages on a printer is one such case, because the printer can work on only one job at a time.
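As a rough illustration of single tasking, the sketch below runs three jobs strictly one after another. The job names and durations are made up for the example (the minutes from the text are scaled down to fractions of a second), and time.sleep merely stands in for real work.

```python
import time

def run_job(name, seconds):
    """Pretend to work on a job by sleeping for `seconds`."""
    time.sleep(seconds)
    return name

# Hypothetical jobs: (name, duration in seconds).
jobs = [("job-1", 0.10), ("job-2", 0.07), ("job-3", 0.05)]

start = time.perf_counter()
# Each job starts only after the previous one has finished.
finished = [run_job(name, seconds) for name, seconds in jobs]
elapsed = time.perf_counter() - start

print("Completion order:", finished)
print(f"Total time: {elapsed:.2f}s (roughly the sum of all job durations)")
```

Because nothing overlaps, the total time is about the sum of the individual durations.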
Multitasking
As the name suggests, multitasking means that the processor works on several jobs at once. That is easy to say, but how does a single processor run several jobs? Suppose there are four jobs, as shown in the diagram below, and the processor should make progress on all four within one second. It can allocate 25 percent of each second (a quarter-second time slice) to each job and run the slices one after the other.
If a job is still incomplete when its time slice ends, its intermediate state is saved in memory and the processor moves on to the next job. The same applies to every other job: its state is saved until it completes, and after each time slice the processor picks up the next job. Because the jobs are executed in a circular order, one after the other, this is called the round-robin method.
When the processor returns to the first job, it resumes exactly where it left off. In this way all the jobs appear to progress at the same time, and this is what we call multitasking. You do not notice the switching because it happens very quickly.
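The round-robin idea described above can be sketched as a small simulation. This is only a toy model, not how an operating system scheduler is actually implemented; the function name round_robin, the job names, and the work units are all assumptions made for the example.

```python
from collections import deque

def round_robin(jobs, time_slice):
    """Simulate round-robin scheduling.

    jobs: dict mapping a job name to the units of work it needs.
    time_slice: units of work a job may run before it is paused.
    Returns the (job, units_run) slices in execution order.
    """
    queue = deque(jobs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        timeline.append((name, run))
        remaining -= run
        if remaining > 0:
            # Not finished: remember the remaining work and
            # send the job to the back of the queue.
            queue.append((name, remaining))
    return timeline

timeline = round_robin({"A": 3, "B": 1, "C": 2}, time_slice=1)
print(timeline)
# [('A', 1), ('B', 1), ('C', 1), ('A', 1), ('C', 1), ('A', 1)]
```

Each job resumes with exactly the amount of work it had left, which mirrors the "begins where it left off" behaviour in the text.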
There are two kinds of multitasking: process-based multitasking and thread-based multitasking. The example above, in which several independent jobs share the processor, describes process-based multitasking.
Let us now look at thread-based multitasking. In process-based multitasking the processor executes several programs at the same time, whereas in thread-based multitasking it executes several parts of the same program, as shown in the diagram below.
As the diagram above shows, the program has two parts, and each part contains a block of code that performs a separate action. Two individual threads execute these parts. Since the two threads of the same program run at the same time, this is called thread-based multitasking.
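A minimal sketch of a program with two parts, each run by its own thread, might look like this. The function names part_one and part_two and the work they do are placeholders chosen for the example.

```python
import threading

results = {}

def part_one():
    # First part of the program: add up the numbers 1 to 100.
    results["sum"] = sum(range(1, 101))

def part_two():
    # Second part of the program: build a short list of squares.
    results["squares"] = [n * n for n in range(5)]

t1 = threading.Thread(target=part_one, name="part-one")
t2 = threading.Thread(target=part_two, name="part-two")

t1.start()
t2.start()
t1.join()  # wait for both parts to finish
t2.join()

print(results["sum"])      # 5050
print(results["squares"])  # [0, 1, 4, 9, 16]
```

The main thread creates and starts the two worker threads, then uses join() to wait until both parts are done before reading the results.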
What is the difference between a Process and a Thread?
A process is a running program: a group of statements that the PVM (Python Virtual Machine) executes, starting on the main thread. Each process has its own memory, a stack that holds its data, and a program counter that tracks the instruction being executed. The data of one process is kept separate from that of other processes, so a process cannot read another process's data unless the two communicate explicitly.
A thread also refers to a group of statements, but within a program. Every thread other than the main thread has to be created explicitly, and it is usually the main thread that creates and starts the others. Each thread has its own stack and program counter, but unlike processes, the threads of a program share that program's memory. A thread can therefore easily read, and also modify, data that another thread is using.
A process needs its own share of resources such as processor time and memory, which is why it is referred to as heavyweight. A thread is a small part of a program, takes less processor time and memory, and is therefore called a lightweight process.
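To make the point about shared memory concrete, here is a small sketch in which three threads write to one list and record the process ID they run in. The names shared_items, seen_pids, and worker are invented for the example.

```python
import os
import threading

shared_items = []   # lives in the memory of the one process
seen_pids = set()   # process IDs observed by the threads

def worker(label):
    seen_pids.add(os.getpid())   # every thread reports the same process ID
    shared_items.append(label)   # every thread writes to the same list

threads = [
    threading.Thread(target=worker, args=(f"thread-{i}",))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_items))  # ['thread-0', 'thread-1', 'thread-2']
print(len(seen_pids))        # 1
```

All three threads belong to a single process and see the same list. Separate processes would each get their own copy of such data.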
Concurrent Programming and GIL
In Python, it is possible to have multiple processes working at the same time. Similarly, multiple threads can execute different parts of one program at the same time. When the parts or tasks of a program are executed in overlapping periods like this, we call it 'concurrent programming'.
When more than one thread is running, one thread's data is available to the others and may undergo unintended changes. This can happen whenever several threads work on the same data at the same time, and it can lead to incorrect results.
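One common way to guard against such unintended changes is to protect the shared data with a lock. The sketch below is one possible approach, using threading.Lock from the standard library; the counter and the add_many function are made up for the example.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(times):
    """Increment the shared counter `times` times."""
    global counter
    for _ in range(times):
        # Only one thread at a time may run the read-modify-write
        # step below, so no increment can be lost.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, two threads could read the same old value of counter and one of the increments could be lost, depending on how the threads happen to be scheduled.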
To protect its own internal state from this kind of problem, the PVM (Python Virtual Machine) of the standard interpreter, CPython, uses a Global Interpreter Lock (GIL), which lets only one thread execute Python code at any given instant. This can be a hurdle for those who want to write concurrent programs in Python: because of the GIL, the threads of a program effectively use only one processor at a time, even if several processors are available. The single-threaded programs we have seen so far in this tutorial series are not affected. The restriction matters mainly when several threads need large amounts of CPU processing; threads that spend most of their time waiting, for example on input and output, are affected far less. Note that the GIL does not make your own shared data safe, so updates to it still need explicit synchronization.
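The sketch below illustrates one consequence of this in CPython: a blocking call such as time.sleep releases the GIL while it waits, so threads that mostly wait can still overlap. The wait_for_io function is a stand-in for real input or output, and the timings are only indicative.

```python
import threading
import time

def wait_for_io():
    # Stand-in for a blocking I/O call; time.sleep releases the GIL
    # while the thread is waiting.
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Four waits of 0.2s each overlap, so the total is far below 0.8s.
print(f"Elapsed: {elapsed:.2f}s")
```

If wait_for_io did heavy computation in pure Python instead of waiting, the four threads would take turns holding the GIL and the total time would not shrink in the same way.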
Advantages of Threads
- Threads are handy when you want to perform more than one task at the same time.
- They are commonly used in server-side programs to serve the requests of several clients on the same network.
- They are also used in animation and games that involve moving several objects from one place to another simultaneously.
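As a hedged illustration of the server-side case, the sketch below uses ThreadPoolExecutor from the standard concurrent.futures module to handle several simulated clients at once. The handle_client function is hypothetical, and the short sleep stands in for waiting on a real network connection.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_client(client_id):
    """Hypothetical request handler: pretend to wait on the network."""
    time.sleep(0.1)
    return f"reply to client {client_id}"

# A pool of four worker threads serves four clients side by side.
with ThreadPoolExecutor(max_workers=4) as pool:
    replies = list(pool.map(handle_client, range(4)))

print(replies)
```

pool.map returns the replies in the same order as the client IDs, even though the handlers run in different threads.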
This chapter is more like an introduction to the concept of threads in Python. The next chapter will look at how to create threads in Python.