Data Structures in Python - Interview Questions and Answers

Data structures in Python are ways to store, organize, and manipulate data efficiently. Common Python data structures include lists, tuples, dictionaries, sets, and user-defined structures like classes.

Mutable data structures (like lists, sets, and dictionaries) can be modified after creation. Immutable data structures (like tuples and strings) cannot be changed once created.

  • List: Mutable, dynamic, and can store heterogeneous elements.
  • Tuple: Immutable, faster, and also supports heterogeneous elements.

Lists are dynamic and can hold any data type, while arrays (from the array module) are more memory-efficient and can only store elements of the same data type.

The average time complexity of appending an element to a list is O(1)O(1).

A stack can be implemented using a list and the append() and pop() methods.

stack = []
stack.append(10)  # Push
stack.pop()       # Pop

 

A queue can be implemented using collections.deque for better performance.

from collections import deque
queue = deque()
queue.append(10)  # Enqueue
queue.popleft()   # Dequeue

 

  • Stack: Follows the Last In First Out (LIFO) principle.
  • Queue: Follows the First In First Out (FIFO) principle.

A priority queue is a data structure where each element is associated with a priority, and elements with higher priorities are dequeued before those with lower priorities. It can be implemented using heapq.

import heapq

 

deque from collections is optimized for fast appends and pops from both ends, whereas a list is better for random access and traversal.

A linked list can be implemented using classes.

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

 

  • Singly Linked List: Each node points to the next node.
  • Doubly Linked List: Each node points to both the next and previous nodes.

A hash table is a data structure that maps keys to values for efficient lookup. In Python, dictionaries (dict) are implemented as hash tables.

Collisions in hash tables can be handled using:

  • Chaining: Storing multiple items at the same index using a list.
  • Open addressing: Probing for the next available slot.

A set stores only unique elements, while a dictionary stores key-value pairs.

  • Insertion: O(1)O(1)
  • Deletion: O(1)O(1)
  • Lookup: O(1)O(1)

A binary tree is a hierarchical data structure where each node has at most two children (left and right).

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

 

In a binary search tree (BST), the left subtree contains elements smaller than the root, and the right subtree contains elements larger than the root.

def in_order(node):
    if node:
        in_order(node.left)
        print(node.data)
        in_order(node.right)

 

  • BFS (Breadth-First Search): Traverses level by level.
  • DFS (Depth-First Search): Traverses as far down one branch as possible before backtracking.

A heap can be implemented using the heapq module.

import heapq
heap = []
heapq.heappush(heap, 10)
heapq.heappop(heap)

 

  • Min-Heap: The smallest element is at the root.
  • Max-Heap: The largest element is at the root.

A graph is a collection of vertices and edges. Graphs can be represented using:

  • Adjacency list: defaultdict(list)
  • Adjacency matrix: A 2D array.

  • Directed Graph: Edges have a direction.
  • Undirected Graph: Edges do not have a direction.

A trie is a tree-like data structure used for efficient retrieval of strings, such as in autocomplete systems.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False

 

A cycle occurs when a node can be revisited. It can be detected using DFS with a visited set.

  • Shallow Copy: Copies the object but not nested objects.
  • Deep Copy: Copies the object and all nested objects.

A deque (double-ended queue) supports O(1) appends and pops from both ends. It's ideal for implementing stacks and queues.

, where hh is the height of the tree.

A red-black tree is a self-balancing binary search tree with properties to ensure balance.

A B-tree is a self-balancing search tree designed for systems that read and write large blocks of data.

BFS uses a queue, whereas DFS uses a stack or recursion.

from collections import deque
def bfs(graph, start):
    queue = deque([start])
    visited = set()
    while queue:
        node = queue.popleft()
        if node not in visited:
            visited.add(node)
            queue.extend(graph[node])

 

A hash collision occurs when two keys hash to the same index in a hash table.

  1. A set is mutable, whereas a frozenset is immutable.

A sparse matrix has mostly zero values. It can be implemented using a dictionary of keys (DOK) representation.

Dynamic programming uses memoization to store intermediate results, often leveraging arrays or dictionaries.

It efficiently performs union and find operations for disjoint set operations, useful in Kruskal's algorithm.

  • Create a trie (prefix tree) where each node represents a character.
  • Insert words into the trie such that each path from the root to a leaf represents a word.
  • To autocomplete, traverse the trie using the given prefix and collect all words below that node.
  • Optionally, maintain a frequency count or timestamp in each node to rank the suggestions.

  • Unweighted graph: Use Breadth-First Search (BFS) to calculate the shortest path.
  • Weighted graph with non-negative weights: Use Dijkstra's Algorithm.
  • Weighted graph with negative weights: Use the Bellman-Ford Algorithm.
  • All-pairs shortest paths: Use the Floyd-Warshall Algorithm.

Use two stacks:

  • Undo Stack: Push the current state before making a change.
  • Redo Stack: Push the state from the undo stack when an undo action is performed.
  • Clear the redo stack when a new change is made.

  • Use a Self-Balancing Binary Search Tree (BST) like AVL or Red-Black Tree, which automatically adjusts the balance during insertions and deletions.
  • Alternatively, if the tree is unbalanced, perform a traversal to collect nodes, then reconstruct the tree by inserting nodes in a balanced manner.

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def reverse_linked_list(head):
    prev = None
    current = head
    while current:
        next_node = current.next
        current.next = prev
        prev = current
        current = next_node
    return prev  # New head of the reversed list

 

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def is_balanced(root):
    def height(node):
        if not node:
            return 0
        left_height = height(node.left)
        right_height = height(node.right)
        if abs(left_height - right_height) > 1:
            raise ValueError("Tree is not balanced")
        return 1 + max(left_height, right_height)

    try:
        height(root)
        return True
    except ValueError:
        return False

 

def find_intersection(list1, list2):
    return list(set(list1) & set(list2))

# Example
list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]
print(find_intersection(list1, list2))  # Output: [3, 4]

 

from collections import Counter
import heapq

def top_k_frequent(nums, k):
    frequency = Counter(nums)
    return heapq.nlargest(k, frequency.keys(), key=frequency.get)

# Example
nums = [1, 1, 1, 2, 2, 3]
k = 2
print(top_k_frequent(nums, k))  # Output: [1, 2]

 

Share   Share