NumPy - Interview Questions and Answers

NumPy (Numerical Python) is a Python library used for working with large, multi-dimensional arrays and matrices, along with a collection of mathematical functions. It is widely used in data science, machine learning, and scientific computing due to its efficiency and speed.

You can install NumPy using pip:

pip install numpy

 

  • A list can hold different data types, while a NumPy array is homogeneous.
  • NumPy arrays are faster and use less memory than lists.
  • NumPy provides built-in vectorized operations, whereas lists require explicit loops.
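
The differences above can be seen in a small sketch. The variable names here are illustrative and not part of the original answer.

```python
import numpy as np

py_list = [1, "two", 3.0]   # a list can mix types freely
arr = np.array([1, 2, 3])   # every element of an array shares one dtype

# Mixing ints and floats upcasts the whole array to a single dtype.
mixed = np.array([1, 2.5, 3])

# Doubling every element: explicit loop for the list, one expression for the array.
doubled_list = [x * 2 for x in [1, 2, 3]]
doubled_arr = arr * 2

print(mixed.dtype)    # float64
print(doubled_list)   # [2, 4, 6]
print(doubled_arr)    # [2 4 6]
```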

import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)  # Output: [1 2 3 4 5]

 

print(arr.shape)  # Returns tuple of array dimensions
print(arr.size)   # Returns total number of elements

 

np.zeros((2,3))  # Array of zeros
np.ones((3,3))   # Array of ones
np.full((2,2), 7)  # Array filled with a specific value
np.eye(4)   # Identity matrix
np.random.rand(3,3)  # Random values

 

np.arange(1, 10, 2)  # [1 3 5 7 9]
np.linspace(0, 1, 5)  # [0.   0.25 0.5  0.75 1.  ]
  • arange(start, stop, step): Creates values spaced by a fixed step; stop is excluded.
  • linspace(start, stop, num): Creates num evenly spaced values; stop is included by default.
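
A short sketch of how the two functions treat the stop value. The endpoint=False call is added here only for comparison.

```python
import numpy as np

# arange: fixed step, stop value excluded.
a = np.arange(0, 1, 0.25)
# linspace: fixed number of points, stop value included by default.
b = np.linspace(0, 1, 5)
# endpoint=False makes linspace drop the stop value as well.
c = np.linspace(0, 1, 4, endpoint=False)

print(a.tolist())  # [0.0, 0.25, 0.5, 0.75]
print(b.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(c.tolist())  # [0.0, 0.25, 0.5, 0.75]
```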

arr = np.array([1, 2, 3, 4, 5, 6])
arr_reshaped = arr.reshape((2,3))  

 

arr = np.array([10, 20, 30, 40, 50])
print(arr[1:4])  # Output: [20 30 40]

 

arr = np.array([10, 20, 30, 40])
print(arr[arr > 20])  # Output: [30 40]

 

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)  # Output: [5 7 9]
print(a * b)  # Output: [ 4 10 18]

 

Broadcasting allows arithmetic operations on arrays of different but compatible shapes by stretching dimensions of size 1 to match.

a = np.array([1, 2, 3])
b = np.array([[1], [2], [3]])
print(a + b)  # Output: [[2 3 4] [3 4 5] [4 5 6]]

 

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(np.dot(a, b))  # Output: [[19 22] [43 50]]

 

np.linalg.det(np.array([[1, 2], [3, 4]]))

 

np.random.rand(3,3)  # Random numbers between 0 and 1
np.random.randint(1, 10, (3,3))  # Random integers

 

arr = np.array([5, 2, 8, 1])
print(np.sort(arr))  # Output: [1 2 5 8]

 

arr = np.array([1, 2, 2, 3, 4, 4])
print(np.unique(arr))  # Output: [1 2 3 4]

 

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.concatenate((a, b)))  # Output: [1 2 3 4 5 6]

 

arr = np.array([1, 2, 3, 4, 5, 6])
print(np.split(arr, 3))  # Output: [array([1, 2]), array([3, 4]), array([5, 6])]

 

NumPy is significantly faster than pure Python loops because its operations run in compiled C code over contiguous, typed memory and are vectorized across whole arrays.
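
A rough way to observe the difference is to time the same computation both ways. This is only a sketch: the function names are hypothetical, and the timings depend entirely on the machine, so no expected timings are shown.

```python
import timeit
import numpy as np

n = 100_000
data = list(range(n))
arr = np.arange(n, dtype=np.int64)  # explicit 64-bit dtype so the squares cannot overflow

def loop_sum_of_squares():
    # Pure Python: the interpreter handles one element at a time.
    return sum(x * x for x in data)

def vectorized_sum_of_squares():
    # NumPy: the multiply and the sum each run as a compiled loop.
    return int(np.sum(arr * arr))

loop_time = timeit.timeit(loop_sum_of_squares, number=10)
vec_time = timeit.timeit(vectorized_sum_of_squares, number=10)

print(loop_sum_of_squares() == vectorized_sum_of_squares())  # True
print(f"loop: {loop_time:.4f}s  vectorized: {vec_time:.4f}s")  # varies by machine
```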

A NumPy array stores its elements in one contiguous block with a single fixed dtype, so it avoids the per-element object and pointer overhead of a Python list.
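
The overhead can be estimated with sys.getsizeof. This is a sketch under the assumption of a standard CPython build; the exact list figure is platform-dependent.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# The array holds n raw 8-byte integers in one buffer.
print(arr.itemsize)  # 8
print(arr.nbytes)    # 8000

# The list holds n pointers plus a separate Python int object per element.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
print(list_bytes > arr.nbytes)  # True
```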

np.meshgrid() creates coordinate matrices for vectorized evaluation of functions over a grid.

x, y = np.meshgrid(np.arange(3), np.arange(3))

 

arr = np.array([1, 2, 3])
print(np.vectorize(lambda x: x**2)(arr))  # Output: [1 4 9]

 

np.where() returns the indices of elements that satisfy a condition.

arr = np.array([10, 20, 30, 40])
indices = np.where(arr > 20)
print(indices)  # Output: (array([2, 3]),)

 

np.argmax() and np.argmin() return the index of the maximum or minimum value in an array.

arr = np.array([5, 2, 8, 1])
print(np.argmax(arr))  # Output: 2
print(np.argmin(arr))  # Output: 3

 

np.clip() limits the values in an array within a specified range.

arr = np.array([1, 20, 50, 100])
print(np.clip(arr, 10, 50))  # Output: [10 20 50 50]

 

arr = np.array([1, 2, 3])
print(np.cumsum(arr))  # Output: [1 3 6]
print(np.cumprod(arr))  # Output: [1 2 6]

 

np.bincount() counts occurrences of each non-negative integer in an array, which is useful for histograms.

arr = np.array([1, 1, 2, 2, 2, 3])
print(np.bincount(arr))  # Output: [0 2 3 1]

 

arr = np.array([10, 20, 30, 40])
print(np.take(arr, [0, 2]))  # Output: [10 30]
np.put(arr, [1, 3], [99, 77])
print(arr)  # Output: [10 99 30 77]

 

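np.choose() selects elements from multiple arrays based on an index array; each index must refer to one of the choice arrays.

a = np.array([0, 1, 1])  # Indices into choices: 0 picks from b, 1 picks from c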
b = np.array([10, 20, 30])
c = np.array([100, 200, 300])
choices = np.array([b, c])
print(np.choose(a, choices))  # Output: [10 200 300]

 

arr = np.array([1, 2, 3])
print(np.tile(arr, 3))  # Output: [1 2 3 1 2 3 1 2 3]

 

arr = np.array([[1, 2], [3, 4]])
print(arr.flatten())  # Output: [1 2 3 4]

 

  • ravel() returns a view whenever possible (modifications then affect the original array); otherwise it returns a copy.
  • flatten() returns a copy (modifications do not affect the original array).
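
The view/copy distinction can be checked with np.shares_memory. A minimal sketch on a contiguous array, where ravel() can return a view:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

flat_view = arr.ravel()    # a view here, because arr is contiguous
flat_copy = arr.flatten()  # always an independent copy

flat_view[0] = 99          # writes through to arr
flat_copy[1] = -1          # does not touch arr

print(arr.tolist())                      # [[99, 2], [3, 4]]
print(np.shares_memory(arr, flat_view))  # True
print(np.shares_memory(arr, flat_copy))  # False
```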

np.expand_dims() adds a new axis of length 1 to an array at the given position.

arr = np.array([1, 2, 3])
print(arr.shape)  # (3,)
arr_expanded = np.expand_dims(arr, axis=0)
print(arr_expanded.shape)  # (1, 3)

 

np.squeeze() removes axes of length 1 from the shape of an array.

arr = np.array([[[1], [2], [3]]])
print(arr.shape)  # (1, 3, 1)
print(np.squeeze(arr).shape)  # (3,)

 

np.gradient() computes the numerical gradient of an array, using central differences in the interior and one-sided differences at the edges.

arr = np.array([1, 2, 4, 7, 11])
print(np.gradient(arr))  # Output: [1.  1.5 2.5 3.5 4. ]

 

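arr = np.array([1, 2, 3, 4, 5])
window_size = 3
cumsum = np.cumsum(np.insert(arr, 0, 0))  # Leading 0 so the first window is included
moving_avg = (cumsum[window_size:] - cumsum[:-window_size]) / window_size
print(moving_avg)  # Output: [2. 3. 4.]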
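
As a cross-check on the cumulative-sum approach, a moving average can also be sketched with np.convolve and a uniform kernel. This is an alternative technique, not the method shown above; mode="valid" keeps only the windows that fit entirely inside the array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
window_size = 3

# Each output value is the mean of one full window of length window_size.
kernel = np.ones(window_size) / window_size
moving_avg = np.convolve(arr, kernel, mode="valid")

print(moving_avg)  # approximately [2. 3. 4.]
```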

 

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.corrcoef(arr))

 

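np.vectorize() wraps a Python function so it can be called on arrays element by element. It is a convenience and is implemented essentially as a loop, so it does not bring the speed of native NumPy operations.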

def square(x):
    return x * x

vec_square = np.vectorize(square)
arr = np.array([1, 2, 3])
print(vec_square(arr))  # Output: [1 4 9]

 

  • C_CONTIGUOUS (C-order) stores elements row-wise.
  • F_CONTIGUOUS (Fortran-order) stores elements column-wise.
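
The layout of an array can be inspected through its flags attribute. A small sketch, where ravel(order='K') is used to read the elements in the order they sit in memory:

```python
import numpy as np

c_arr = np.array([[1, 2, 3], [4, 5, 6]])  # C order is the default
f_arr = np.asfortranarray(c_arr)          # same values, column-major layout

print(c_arr.flags['C_CONTIGUOUS'], c_arr.flags['F_CONTIGUOUS'])  # True False
print(f_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # False True

# Memory order: rows first for C order, columns first for Fortran order.
print(c_arr.ravel(order='K').tolist())  # [1, 2, 3, 4, 5, 6]
print(f_arr.ravel(order='K').tolist())  # [1, 4, 2, 5, 3, 6]
```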

np.random.normal(loc=0, scale=1, size=(3,3))

 

arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr)

 

np.random.seed() sets the seed of NumPy's global random number generator so that random results are reproducible.

np.random.seed(42)

 

arr = np.array([1, 2, 3, 4])
print(np.median(arr))  # Output: 2.5
print(np.var(arr))  # Output: 1.25
print(np.std(arr))  # Output: approximately 1.118

 

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.hstack((a, b)))  # Output: [1 2 3 4 5 6]
print(np.vstack((a, b)))  # Output: [[1 2 3] [4 5 6]]

 

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(np.array_equal(a, b))  # Output: True

 

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.diag(arr))  # Output: [1 5 9]

 

  • np.triu() extracts the upper triangular part.
  • np.tril() extracts the lower triangular part.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.triu(arr))  
# Output:
# [[1 2 3]
#  [0 5 6]
#  [0 0 9]]

print(np.tril(arr))  
# Output:
# [[1 0 0]
#  [4 5 0]
#  [7 8 9]]

 

np.linalg.det() computes the determinant.

arr = np.array([[1, 2], [3, 4]])
print(np.linalg.det(arr))  # Output: approximately -2.0 (may show floating-point rounding error)

 

  • np.linalg.eig() computes both eigenvalues and eigenvectors.
arr = np.array([[4, -2], [1, 1]])
eigenvalues, eigenvectors = np.linalg.eig(arr)
print(eigenvalues)
# Output: [3. 2.] (the order may differ)
print(eigenvectors)
# Output (column i pairs with eigenvalue i; column order and signs may differ):
# [[0.89442719 0.70710678]
#  [0.4472136  0.70710678]]

 

np.linalg.inv() calculates the inverse.

arr = np.array([[4, 7], [2, 6]])
inv_matrix = np.linalg.inv(arr)
print(inv_matrix)
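
A quick sanity check for an inverse is that multiplying it by the original matrix gives the identity. The sketch below also shows that a singular matrix (the second matrix is a made-up example) raises np.linalg.LinAlgError.

```python
import numpy as np

arr = np.array([[4, 7], [2, 6]])
inv_matrix = np.linalg.inv(arr)

# det = 4*6 - 7*2 = 10, so the inverse is [[0.6, -0.7], [-0.2, 0.4]].
print(np.allclose(arr @ inv_matrix, np.eye(2)))  # True

# A matrix with linearly dependent rows has no inverse.
singular = np.array([[1, 2], [2, 4]])
try:
    np.linalg.inv(singular)
    singular_raised = False
except np.linalg.LinAlgError:
    singular_raised = True
print(singular_raised)  # True
```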

 

np.linalg.pinv() computes the pseudoinverse.

arr = np.array([[1, 2], [3, 4], [5, 6]])
pseudo_inv = np.linalg.pinv(arr)
print(pseudo_inv)

 

np.linalg.solve() solves Ax = B.

A = np.array([[3, 1], [1, 2]])
B = np.array([9, 8])
solution = np.linalg.solve(A, B)
print(solution)  # Output: [2. 3.]

 

  • np.meshgrid() creates a coordinate grid from two 1D arrays.
x = np.array([1, 2, 3])
y = np.array([4, 5])
X, Y = np.meshgrid(x, y)
print(X)  
# Output:
# [[1 2 3]
#  [1 2 3]]
print(Y)
# Output:
# [[4 4 4]
#  [5 5 5]]

 

  • Broadcasting allows operations between arrays of different but compatible shapes.
arr1 = np.array([[1], [2], [3]])
arr2 = np.array([4, 5, 6])
print(arr1 + arr2)  
# Output:
# [[ 5  6  7]
#  [ 6  7  8]
#  [ 7  8  9]]
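
Shapes are compatible when, compared from the last axis backwards, each pair of dimensions is equal or one of them is 1. A minimal sketch of one compatible and one incompatible case:

```python
import numpy as np

col = np.array([[1], [2], [3]])  # shape (3, 1)
row = np.array([4, 5, 6])        # shape (3,), treated as (1, 3)
result_shape = (col + row).shape
print(result_shape)  # (3, 3)

# Shapes (3,) and (4,) differ and neither is 1, so broadcasting fails.
try:
    np.array([1, 2, 3]) + np.array([1, 2, 3, 4])
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```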

 

Use .nbytes to check memory usage in bytes.

arr = np.ones((1000, 1000))
print(arr.nbytes)  # Output: 8000000 (8 MB)

 

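  • np.frombuffer() creates an array from a buffer without copying data; the dtype must match the buffer's element type.
import array
buffer = array.array('i', [1, 2, 3, 4])  # 'i' is a C int
arr = np.frombuffer(buffer, dtype=np.intc)  # np.intc matches C int
print(arr)  # Output: [1 2 3 4]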

 

  • np.fromiter() creates a 1D array from an iterable.
iterable = (x*x for x in range(5))
arr = np.fromiter(iterable, dtype=int)
print(arr)  # Output: [ 0  1  4  9 16]

 

arr = np.array([1, 2, 3])
print(arr.tolist())  # Output: [1, 2, 3]
print(arr.tobytes())  # Convert to byte stream

 

  • Use np.split() and np.concatenate().
arr = np.array([1, 2, 3, 4, 5, 6])
split_arrays = np.split(arr, 3)
print(split_arrays)  # Output: [array([1, 2]), array([3, 4]), array([5, 6])]

joined = np.concatenate(split_arrays)
print(joined)  # Output: [1 2 3 4 5 6]

 

Use np.random.randint() for integers or np.random.uniform() for floating-point numbers within a range.

arr = np.random.randint(10, 50, size=(3, 3))  # 3x3 matrix of integers from 10 to 49 (upper bound is exclusive)
print(arr)
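
The floating-point case mentioned above can be sketched the same way with np.random.uniform. The seed is set only to make this example repeatable.

```python
import numpy as np

np.random.seed(0)  # only so the example is repeatable

ints = np.random.randint(10, 50, size=(3, 3))        # integers in [10, 50)
floats = np.random.uniform(10.0, 50.0, size=(3, 3))  # floats in [10.0, 50.0)

print(ints.min() >= 10 and ints.max() < 50)        # True
print(floats.min() >= 10.0 and floats.max() < 50)  # True
```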

 

  • Use np.random.seed() to ensure reproducibility.
np.random.seed(42)
arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr)
print(arr)  # Output will be the same every time due to the seed.
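
Reproducibility can be verified by seeding twice and comparing the results. In this sketch, seeded_shuffle is a hypothetical helper, not a NumPy function. Newer code often uses a local generator from np.random.default_rng instead of the global seed, shown here as an optional alternative.

```python
import numpy as np

def seeded_shuffle(seed):
    # Hypothetical helper: reseed the global generator, then shuffle a fresh array.
    np.random.seed(seed)
    arr = np.array([1, 2, 3, 4, 5])
    np.random.shuffle(arr)
    return arr

first = seeded_shuffle(42)
second = seeded_shuffle(42)
print(np.array_equal(first, second))  # True

# Alternative: a local Generator keeps the seed out of global state.
perm_a = np.random.default_rng(42).permutation(5)
perm_b = np.random.default_rng(42).permutation(5)
print(np.array_equal(perm_a, perm_b))  # True
```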

 

  • np.cumsum() computes the cumulative sum of elements.
  • np.cumprod() computes the cumulative product.
arr = np.array([1, 2, 3, 4])
print(np.cumsum(arr))  # Output: [ 1  3  6 10]
print(np.cumprod(arr))  # Output: [ 1  2  6 24]

 

  • Use np.unique() and compare its length with the original array.
arr = np.array([1, 2, 3, 4, 5, 2])
print(len(np.unique(arr)) == len(arr))  # Output: False

 

  • Use boolean indexing.
arr = np.array([-3, 5, -1, 7, -9])
arr[arr < 0] = 0
print(arr)  # Output: [0 5 0 7 0]

 

  • Use np.bincount() (for small non-negative integers) or np.unique() with counts.
arr = np.array([1, 2, 3, 4, 2, 2, 3, 3, 3])
most_frequent = np.bincount(arr).argmax()
print(most_frequent)  # Output: 3
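
The np.unique() route also works for negative or very large values, where np.bincount() is not suitable. A sketch with made-up data:

```python
import numpy as np

arr = np.array([-2, 7, 7, -2, 7, 1000000])

# return_counts=True gives the sorted unique values and how often each occurs.
values, counts = np.unique(arr, return_counts=True)
most_frequent = values[counts.argmax()]

print(values.tolist())     # [-2, 7, 1000000]
print(counts.tolist())     # [2, 3, 1]
print(int(most_frequent))  # 7
```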

 

  • np.mean() computes the average, while np.median() finds the middle value.
arr = np.array([1, 2, 3, 4, 100])
print(np.mean(arr))   # Output: 22.0
print(np.median(arr)) # Output: 3.0 (not affected by outliers)

 

  • Use np.power().
arr = np.array([1, 2, 3, 4])
print(np.power(arr, 3))  # Output: [ 1  8 27 64]

 

  • Use np.diag().
arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(np.diag(arr))  # Output: [10 50 90]

 

  • Use np.vectorize().
def square(x):
    return x ** 2

arr = np.array([1, 2, 3, 4])
vectorized_func = np.vectorize(square)
print(vectorized_func(arr))  # Output: [ 1  4  9 16]

 
