Multithreading and Multiprocessing in Python - Interview Questions and Answers
Multithreading uses multiple threads within the same process to execute tasks concurrently, while multiprocessing creates separate processes, each with its own memory space. Multiprocessing avoids the Global Interpreter Lock (GIL), making it better for CPU-bound tasks.
The GIL is a mechanism in CPython that ensures only one thread executes Python bytecode at a time. This means Python threads cannot fully utilize multi-core processors for CPU-bound tasks.
Since the GIL restricts threads from running in parallel for CPU-bound tasks, multiprocessing is preferred when you need true parallel execution.
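A quick way to see this is to time a CPU-bound function sequentially and then across two threads; on CPython both runs take roughly the same wall-clock time (a minimal sketch, exact timings vary by machine):

import threading
import time

def cpu_task():
    sum(range(10_000_000))

start = time.perf_counter()
cpu_task()
cpu_task()
print(f"Sequential: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
t1 = threading.Thread(target=cpu_task)
t2 = threading.Thread(target=cpu_task)
t1.start()
t2.start()
t1.join()
t2.join()
# Roughly the same as the sequential run, because the GIL serializes the bytecode
print(f"Threaded: {time.perf_counter() - start:.2f}s")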
The threading module in Python provides support for creating and managing threads.
The multiprocessing module provides support for process-based parallelism.
import threading

def print_message():
    print("Hello from thread")

t = threading.Thread(target=print_message)
t.start()
t.join()
import multiprocessing

def print_message():
    print("Hello from process")

if __name__ == "__main__":  # required on platforms that spawn processes (Windows, macOS)
    p = multiprocessing.Process(target=print_message)
    p.start()
    p.join()
import threading

def greet(name):
    print(f"Hello, {name}")

t = threading.Thread(target=greet, args=("Alice",))
t.start()
t.join()
import multiprocessing

def greet(name):
    print(f"Hello, {name}")

if __name__ == "__main__":
    p = multiprocessing.Process(target=greet, args=("Alice",))
    p.start()
    p.join()
import threading
import time

def task():
    print("Starting task")
    time.sleep(2)
    print("Task completed")

t = threading.Thread(target=task)
t.start()
t.join()
t.is_alive() # Returns True if the thread is still running
Use .join() on the thread to make the main program wait for it to finish.
import threading

def task():
    print(f"Thread name: {threading.current_thread().name}")

t = threading.Thread(target=task, name="WorkerThread")
t.start()
t.join()
Thread runs in the same memory space, while Process runs in a separate memory space, making multiprocessing better for CPU-intensive tasks.
Set t.daemon = True before starting the thread, or pass daemon=True to the Thread constructor.
Daemon threads run in the background and exit automatically when the main program exits.
Use shared objects like lists and dictionaries, protected by synchronization mechanisms such as threading.Lock.
Use multiprocessing.Queue, multiprocessing.Value, or multiprocessing.Array.
A Lock is used to prevent multiple threads from accessing a shared resource at the same time.
import threading

lock = threading.Lock()

def task():
    with lock:
        print("Thread is running with lock")

t1 = threading.Thread(target=task)
t2 = threading.Thread(target=task)
t1.start()
t2.start()
t1.join()
t2.join()
A Semaphore is a synchronization primitive that limits the number of threads that can access a resource at the same time.
import threading
import time

semaphore = threading.Semaphore(2)  # at most 2 threads inside at once

def task():
    with semaphore:
        print(f"Thread {threading.current_thread().name} is running")
        time.sleep(2)

threads = [threading.Thread(target=task) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
An Event is a flag that can be set or cleared to control thread execution.
import threading
import time

event = threading.Event()

def wait_for_event():
    print("Thread waiting for event...")
    event.wait()
    print("Event received, thread running!")

t = threading.Thread(target=wait_for_event)
t.start()
time.sleep(2)
event.set()
t.join()
A Barrier allows multiple threads to synchronize at a common point before proceeding.
import threading

barrier = threading.Barrier(3)

def task():
    print(f"{threading.current_thread().name} waiting at barrier")
    barrier.wait()
    print(f"{threading.current_thread().name} passed barrier")

threads = [threading.Thread(target=task) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
queue.Queue is used for inter-thread communication; multiprocessing.Queue is used for inter-process communication.
Use queue.Queue for thread-safe operations.
import queue
import threading

q = queue.Queue()

def worker():
    while True:
        try:
            item = q.get_nowait()  # non-blocking, so the thread exits once the queue is drained
        except queue.Empty:
            break
        print(f"Processing {item}")
        q.task_done()

for i in range(5):
    q.put(i)

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
Python does not provide a direct way to terminate threads. Use flags or threading.Event.
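A minimal sketch of the flag pattern using threading.Event, where the worker checks the flag on each loop iteration:

import threading
import time

stop_event = threading.Event()

def worker():
    while not stop_event.is_set():
        print("Working...")
        time.sleep(0.5)
    print("Worker exiting cleanly")

t = threading.Thread(target=worker)
t.start()
time.sleep(2)
stop_event.set()  # ask the thread to stop; it exits at the next check
t.join()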
It creates thread-local storage, allowing data to be unique per thread.
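For example, attributes set on a threading.local() object are visible only to the thread that set them:

import threading

local_data = threading.local()

def task(value):
    local_data.value = value  # each thread gets its own copy of this attribute
    print(f"{threading.current_thread().name}: {local_data.value}")

t1 = threading.Thread(target=task, args=(1,))
t2 = threading.Thread(target=task, args=(2,))
t1.start()
t2.start()
t1.join()
t2.join()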
It allows parallel execution of a function using multiple processes.
import multiprocessing

def square(n):
    return n * n

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        result = pool.map(square, [1, 2, 3, 4, 5])
    print(result)  # [1, 4, 9, 16, 25]
from multiprocessing import Manager, Process

def worker(shared_list):
    shared_list.append("Hello from process")

if __name__ == "__main__":
    with Manager() as manager:
        shared_list = manager.list()
        p = Process(target=worker, args=(shared_list,))
        p.start()
        p.join()
        print(shared_list)
- apply() runs a function in one worker process and blocks until the result is ready.
- apply_async() runs it asynchronously and returns an AsyncResult.
- map() applies a function to all elements of an iterable in parallel.
- map_async() does the same asynchronously.
Value allows sharing a single value between processes, while Array allows sharing a fixed-size array of values.
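A minimal sketch with Value, using its built-in lock for safe updates:

from multiprocessing import Process, Value

def increment(counter):
    with counter.get_lock():  # Value carries its own lock for atomic updates
        counter.value += 1

if __name__ == "__main__":
    counter = Value('i', 0)  # 'i' means a signed C int
    processes = [Process(target=increment, args=(counter,)) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(counter.value)  # 4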
Use multiprocessing.Queue, which is process-safe by default.
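A minimal producer sketch: the child puts an item on the queue and the parent reads it, with no explicit locking needed:

from multiprocessing import Process, Queue

def producer(q):
    q.put("data from producer")

if __name__ == "__main__":
    q = Queue()
    p = Process(target=producer, args=(q,))
    p.start()
    print(q.get())  # blocks until the child has put an item
    p.join()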
It prevents multiple processes from accessing a shared resource simultaneously.
import multiprocessing

def task(lock):
    with lock:
        print("Process is running with lock")

if __name__ == "__main__":
    # Pass the lock as an argument so it is shared correctly on all platforms
    lock = multiprocessing.Lock()
    p1 = multiprocessing.Process(target=task, args=(lock,))
    p2 = multiprocessing.Process(target=task, args=(lock,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
It allows a process to wait until another process meets a certain condition.
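Assuming this refers to multiprocessing.Condition, a minimal sketch where one process waits until another notifies it:

import time
from multiprocessing import Condition, Process

def waiter(cond):
    with cond:
        print("Waiting for notification...")
        cond.wait()
        print("Condition met, continuing")

def notifier(cond):
    time.sleep(1)
    with cond:
        cond.notify_all()  # wake up every waiting process

if __name__ == "__main__":
    cond = Condition()
    w = Process(target=waiter, args=(cond,))
    n = Process(target=notifier, args=(cond,))
    w.start()
    n.start()
    w.join()
    n.join()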
ThreadPoolExecutor is used for I/O-bound tasks, while ProcessPoolExecutor is used for CPU-bound tasks.
from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = executor.map(square, [1, 2, 3, 4, 5])
        print(list(results))
Use synchronization primitives like Lock, Semaphore, or Queue.
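A classic illustration is a shared counter: without the lock, the two threads' read-modify-write steps can interleave and lose updates (a minimal sketch):

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100_000):
        with lock:  # remove this lock and the final count may fall short
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # reliably 200000 with the lock held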
os.fork() is Unix-specific and creates a child process by duplicating the parent process. multiprocessing.Process is cross-platform and starts a new process.
No. Due to the GIL, only one thread executes Python bytecode at a time, so CPU-bound threads do not run in parallel (though threads can overlap during I/O or in C extensions that release the GIL). Only multiprocessing achieves true parallel execution of Python code.
- Deadlocks due to improper locking.
- Pickling errors when passing non-picklable objects.
- Resource leaks if processes are not properly closed.
A daemon thread runs in the background and terminates when all non-daemon threads finish.
import threading
import time

def background_task():
    while True:
        print("Daemon thread running...")
        time.sleep(2)

thread = threading.Thread(target=background_task, daemon=True)
thread.start()
time.sleep(5)
print("Main thread exiting")
Here, the daemon thread stops when the main thread exits.
Use the join(timeout) method.
import threading
import time

def task():
    time.sleep(10)

thread = threading.Thread(target=task)
thread.start()
thread.join(timeout=5)  # waits at most 5 seconds
if thread.is_alive():
    print("Thread is still running, timeout occurred")
A worker thread executes tasks asynchronously in a pool, improving performance in I/O-bound applications.
Use ThreadPoolExecutor from concurrent.futures.
from concurrent.futures import ThreadPoolExecutor

def worker(n):
    return n * n

with ThreadPoolExecutor(max_workers=4) as executor:
    results = executor.map(worker, [1, 2, 3, 4, 5])
    print(list(results))
Pipe allows bidirectional communication between two processes, while Queue supports multiple producers and consumers.
from multiprocessing import Pipe, Process

def sender(conn):
    conn.send("Hello from child process")
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=sender, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # receive data from the child
    p.join()
Catch exceptions inside the thread and use a shared variable to store errors.
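A minimal sketch: the worker appends any exception to a shared list that the main thread inspects after join():

import threading

errors = []

def task():
    try:
        raise ValueError("something went wrong")
    except Exception as exc:
        errors.append(exc)  # shared list, visible to the main thread

t = threading.Thread(target=task)
t.start()
t.join()
if errors:
    print(f"Thread failed with: {errors[0]}")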
- Zombie process: A child process that finishes execution but still has an entry in the process table.
- Orphan process: A child process whose parent has terminated.
- Use timeouts when acquiring locks (see the sketch below).
- Use thread-safe data structures like queue.Queue.
- Avoid circular wait conditions.
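The timeout idea from the first point, as a minimal sketch: acquire with a deadline and back off instead of blocking forever:

import threading

lock = threading.Lock()

def careful_task():
    if lock.acquire(timeout=1):  # give up after 1 second instead of waiting forever
        try:
            print("Got the lock")
        finally:
            lock.release()
    else:
        print("Timed out waiting for the lock; backing off")

t = threading.Thread(target=careful_task)
t.start()
t.join()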
- CPU-bound: use multiprocessing to utilize multiple cores.
- I/O-bound: use threading or asyncio, since the GIL allows I/O tasks to run concurrently.
Use cProfile for performance profiling. Note that cProfile measures only the thread that invokes it, so to profile a worker thread, run the profiler inside the thread's target function:

import cProfile
import threading

def task():
    sum(range(100000))

def profiled_task():
    # Profile inside the worker, since cProfile only measures the calling thread
    cProfile.runctx("task()", globals(), locals())

thread = threading.Thread(target=profiled_task)
thread.start()
thread.join()
Process affinity binds a process to specific CPU cores. Use the third-party psutil package to control it.
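A minimal sketch with psutil (pip install psutil); note that cpu_affinity() is available on Linux and Windows but not macOS:

import psutil

p = psutil.Process()  # the current process
print(p.cpu_affinity())  # cores the process is currently allowed to run on
p.cpu_affinity([0, 1])  # pin the process to cores 0 and 1 (assumes they exist)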
Use the terminate() method of multiprocessing.Process.
from multiprocessing import Process
import time

def task():
    while True:
        print("Running...")
        time.sleep(1)

if __name__ == "__main__":
    p = Process(target=task)
    p.start()
    time.sleep(5)
    p.terminate()
    p.join()
    print("Process terminated")
Use multiprocessing.shared_memory to avoid unnecessary copying.
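A minimal single-process sketch of the shared_memory API (Python 3.8+); a second process would attach by name instead of creating:

from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=4)
shm.buf[0] = 42  # write directly into the shared block, no copying
# another process could attach with: shared_memory.SharedMemory(name=shm.name)
print(shm.buf[0])
shm.close()
shm.unlink()  # release the block once every process is done with it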
It occurs when low-priority threads are indefinitely delayed by high-priority threads. Use fair scheduling or adjust thread priorities.
Use a Semaphore to cap concurrent access; both threading and multiprocessing provide Semaphore and BoundedSemaphore (the bounded variant raises an error if released too many times).
Yes, but guard the process-spawning code with if __name__ == "__main__" (or place it inside functions) to avoid re-import issues when child processes start.
- Windows: uses spawn (starts a fresh interpreter process).
- Linux: uses fork (copies the parent process).
- macOS: also defaults to spawn since Python 3.8.
It yields the CPU to another thread or process, useful for cooperative multitasking.
- asyncio is best for single-threaded concurrency using coroutines.
- threading is best for I/O-bound tasks.
- multiprocessing is best for CPU-bound tasks.
- Use logging instead of print().
- Use threading.enumerate() to track active threads.
- Use GDB with Python extensions for deep debugging.
Barrier synchronizes multiple threads, making them wait at a checkpoint before proceeding.
import threading

barrier = threading.Barrier(3)

def task(n):
    print(f"Thread-{n} waiting at barrier")
    barrier.wait()
    print(f"Thread-{n} passed the barrier")

for i in range(3):
    threading.Thread(target=task, args=(i,)).start()
Use threading.enumerate() to list all active threads.
import threading
import time

def worker():
    time.sleep(2)

t1 = threading.Thread(target=worker)
t2 = threading.Thread(target=worker)
t1.start()
t2.start()
print(threading.enumerate())  # lists all running threads, including the main thread
The logging module is thread-safe by default; to aggregate logs from multiple threads or processes, use QueueHandler together with QueueListener.
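A minimal sketch wiring QueueHandler to a QueueListener, so workers enqueue records and a single listener writes them out:

import logging
import logging.handlers
import queue

log_queue = queue.Queue()
listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())

logger = logging.getLogger("worker")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(log_queue))

listener.start()
logger.info("Hello from a worker thread")
listener.stop()  # flushes queued records before stopping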
It provides shared objects like lists and dictionaries across processes.
from multiprocessing import Manager, Process

def worker(shared_list):
    shared_list.append(1)

if __name__ == "__main__":
    with Manager() as manager:
        shared_list = manager.list()
        processes = [Process(target=worker, args=(shared_list,)) for _ in range(3)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print(shared_list)  # Output: [1, 1, 1]
Python does not automatically resolve deadlocks. Use timeouts, lock hierarchy, or try-except blocks.
Array allows sharing data between processes without using a Manager.
from multiprocessing import Array, Process

def worker(arr):
    arr[0] = 99

if __name__ == "__main__":
    shared_array = Array('i', [1, 2, 3])
    p = Process(target=worker, args=(shared_array,))
    p.start()
    p.join()
    print(shared_array[:])  # Output: [99, 2, 3]
JoinableQueue extends multiprocessing.Queue with task_done() and join() methods, letting producers block until every queued item has been processed.
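A minimal sketch: the producer calls q.join() to block until the daemon worker has marked every item done:

from multiprocessing import JoinableQueue, Process

def worker(q):
    while True:
        item = q.get()
        print(f"Processing {item}")
        q.task_done()  # signal that this item is fully handled

if __name__ == "__main__":
    q = JoinableQueue()
    p = Process(target=worker, args=(q,), daemon=True)
    p.start()
    for i in range(3):
        q.put(i)
    q.join()  # blocks until task_done() was called for every put()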
It uses a queue where tasks with higher priority are processed first.
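For threads, the standard library's queue.PriorityQueue does this out of the box; entries are typically (priority, item) tuples, with lower numbers served first:

import queue

pq = queue.PriorityQueue()
pq.put((2, "low priority task"))
pq.put((1, "high priority task"))  # lower number = higher priority

while not pq.empty():
    priority, task = pq.get()
    print(priority, task)  # the priority-1 task comes out first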
No, Python only allows one running event loop per thread.
The GIL prevents true parallel execution of Python threads, limiting performance in CPU-bound tasks.
Use multiprocessing, Cython, Numba, or other JIT-compiled code.
os.fork() creates a process by duplicating the current one (Unix only). multiprocessing.Process() creates a new process that starts from the function entry and works cross-platform.
Yes, using libraries like Ray or Dask.
Use concurrent.futures.ProcessPoolExecutor.
subprocess is for running external programs; multiprocessing is for parallel execution of Python code.
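A minimal subprocess sketch, assuming a Unix-like system where the echo command is available:

import subprocess

# subprocess: run an external program and capture its output
result = subprocess.run(["echo", "hello"], capture_output=True, text=True)
print(result.stdout.strip())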