NumPy - Interview Questions and Answers
NumPy (Numerical Python) is a Python library used for working with large, multi-dimensional arrays and matrices, along with a collection of mathematical functions. It is widely used in data science, machine learning, and scientific computing due to its efficiency and speed.
You can install NumPy using pip:
pip install numpy
- A list can hold different data types, while a NumPy array is homogeneous.
- NumPy arrays are faster and use less memory than lists.
- NumPy provides built-in vectorized operations, whereas lists require explicit loops.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr) # Output: [1 2 3 4 5]
print(arr.shape) # Returns tuple of array dimensions
print(arr.size) # Returns total number of elements
np.zeros((2,3)) # Array of zeros
np.ones((3,3)) # Array of ones
np.full((2,2), 7) # Array filled with a specific value
np.eye(4) # Identity matrix
np.random.rand(3,3) # Random values
np.arange(1, 10, 2) # [1 3 5 7 9]
np.linspace(0, 1, 5) # [0. 0.25 0.5 0.75 1. ]
- arange(start, stop, step): Creates evenly spaced values with a fixed step; stop is excluded.
- linspace(start, stop, num): Creates a specified number of evenly spaced values; stop is included by default.
arr = np.array([1, 2, 3, 4, 5, 6])
arr_reshaped = arr.reshape((2,3))
arr = np.array([10, 20, 30, 40, 50])
print(arr[1:4]) # Output: [20 30 40]
arr = np.array([10, 20, 30, 40])
print(arr[arr > 20]) # Output: [30 40]
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b) # Output: [5 7 9]
print(a * b) # Output: [4 10 18]
Broadcasting allows arithmetic operations on arrays of different but compatible shapes.
a = np.array([1, 2, 3])
b = np.array([[1], [2], [3]])
print(a + b)
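As a rough illustration of how the shapes combine, the sketch below inspects the broadcast shape before adding and shows what happens when shapes are incompatible. The name `result` is an assumption made for this example.

```python
import numpy as np

a = np.array([1, 2, 3])          # shape (3,)
b = np.array([[1], [2], [3]])    # shape (3, 1)

# Shapes are compared from the right; a dimension of size 1 is stretched.
print(np.broadcast(a, b).shape)  # (3, 3)
result = a + b
print(result)
# [[2 3 4]
#  [3 4 5]
#  [4 5 6]]

# Incompatible shapes raise an error instead of broadcasting.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
except ValueError as exc:
    print("cannot broadcast:", exc)
```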
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(np.dot(a, b))
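For 2D arrays, the `@` operator gives the same matrix product as `np.dot()`, while `*` multiplies element by element. A small sketch of the difference, with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

matrix_product = a @ b   # same result as np.dot(a, b) for 2D arrays
elementwise = a * b      # multiplies matching entries only

print(matrix_product)
# [[19 22]
#  [43 50]]
print(elementwise)
# [[ 5 12]
#  [21 32]]
```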
np.linalg.det(np.array([[1, 2], [3, 4]]))
np.random.rand(3,3) # Random numbers between 0 and 1
np.random.randint(1, 10, (3,3)) # Random integers from 1 to 9 (upper bound is exclusive)
arr = np.array([5, 2, 8, 1])
print(np.sort(arr)) # Output: [1 2 5 8]
arr = np.array([1, 2, 2, 3, 4, 4])
print(np.unique(arr)) # Output: [1 2 3 4]
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.concatenate((a, b))) # Output: [1 2 3 4 5 6]
arr = np.array([1, 2, 3, 4, 5, 6])
print(np.split(arr, 3))
NumPy runs array operations in compiled C code, stores elements in contiguous memory, and applies vectorized operations, which makes it significantly faster than looping over Python lists.
NumPy stores elements of a single type in contiguous memory blocks, which avoids the per-object overhead of Python lists.
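One way to get a feel for the memory difference is to compare byte counts directly. This is only a sketch: `sys.getsizeof` figures are specific to CPython and vary by version, so no exact list size is shown.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# The list stores pointers; each int is a separate Python object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)
# The array stores raw 8-byte integers in one contiguous buffer.
arr_bytes = arr.nbytes

print(list_bytes, arr_bytes)
```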
np.meshgrid() creates coordinate matrices for vectorized evaluation of functions over a grid.
x, y = np.meshgrid(np.arange(3), np.arange(3))
arr = np.array([1, 2, 3])
print(np.vectorize(lambda x: x**2)(arr)) # Output: [1 4 9]
np.where() returns the indices of elements that satisfy a condition.
arr = np.array([10, 20, 30, 40])
indices = np.where(arr > 20)
print(indices) # Output: (array([2, 3]),)
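np.where() also has a three-argument form that selects between two sources element by element. A minimal sketch, where `capped` is just an illustrative name:

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
# With three arguments, np.where takes the second argument where the
# condition holds and the third argument where it does not.
capped = np.where(arr > 20, 0, arr)
print(capped)  # [10 20  0  0]
```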
np.argmax() and np.argmin() return the index of the maximum or minimum value in an array.
arr = np.array([5, 2, 8, 1])
print(np.argmax(arr)) # Output: 2
print(np.argmin(arr)) # Output: 3
np.clip() limits the values in an array to a specified range.
arr = np.array([1, 20, 50, 100])
print(np.clip(arr, 10, 50)) # Output: [10 20 50 50]
arr = np.array([1, 2, 3])
print(np.cumsum(arr)) # Output: [1 3 6]
print(np.cumprod(arr)) # Output: [1 2 6]
np.bincount() counts occurrences of each non-negative integer in an array, which is useful for histograms.
arr = np.array([1, 1, 2, 2, 2, 3])
print(np.bincount(arr)) # Output: [0 2 3 1]
arr = np.array([10, 20, 30, 40])
print(np.take(arr, [0, 2])) # Output: [10 30]
np.put(arr, [1, 3], [99, 77])
print(arr) # Output: [10 99 30 77]
np.choose() selects elements from multiple arrays based on an index array.
a = np.array([0, 1, 0]) # Each index must refer to an existing choice (0 or 1 here)
b = np.array([10, 20, 30])
c = np.array([100, 200, 300])
choices = np.array([b, c])
print(np.choose(a, choices)) # Output: [ 10 200  30]
arr = np.array([1, 2, 3])
print(np.tile(arr, 3)) # Output: [1 2 3 1 2 3 1 2 3]
arr = np.array([[1, 2], [3, 4]])
print(arr.flatten()) # Output: [1 2 3 4]
- ravel() returns a view whenever possible (modifications then affect the original array).
- flatten() always returns a copy (modifications do not affect the original array).
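The view-versus-copy behavior can be checked with `np.shares_memory()`. The sketch below assumes a small contiguous array, which is the case where ravel() can return a view.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

flat_view = arr.ravel()      # a view here, because arr is contiguous
flat_copy = arr.flatten()    # always a copy

flat_copy[0] = 99            # does not touch arr
flat_view[1] = 77            # writes through to arr

print(arr)
# [[ 1 77]
#  [ 3  4]]
print(np.shares_memory(arr, flat_view))  # True
print(np.shares_memory(arr, flat_copy))  # False
```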
np.expand_dims() adds a new axis to an array.
arr = np.array([1, 2, 3])
print(arr.shape) # (3,)
arr_expanded = np.expand_dims(arr, axis=0)
print(arr_expanded.shape) # (1, 3)
np.squeeze() removes single-dimensional entries from the shape of an array.
arr = np.array([[[1], [2], [3]]])
print(arr.shape) # (1, 3, 1)
print(np.squeeze(arr).shape) # (3,)
np.gradient() computes the numerical gradient of an array using finite differences.
arr = np.array([1, 2, 4, 7, 11])
print(np.gradient(arr)) # Output: [1. 1.5 2.5 3.5 4. ]
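arr = np.array([1, 2, 3, 4, 5])
window_size = 3
cumsum = np.cumsum(np.insert(arr, 0, 0)) # Prepend 0 so the first window is included
moving_avg = (cumsum[window_size:] - cumsum[:-window_size]) / window_size
print(moving_avg) # Output: [2. 3. 4.]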
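A moving average can also be written as a convolution with a uniform kernel. This is an alternative sketch, not the cumulative-sum approach used above; `kernel` is a name introduced here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
window_size = 3

# Each output value is the mean of window_size neighbouring inputs.
kernel = np.ones(window_size) / window_size
# mode="valid" keeps only positions where the window fully overlaps arr.
moving_avg = np.convolve(arr, kernel, mode="valid")
print(moving_avg)  # approximately [2, 3, 4]
```

For this input there are three full windows, so the result has `len(arr) - window_size + 1` entries.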
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.corrcoef(arr))
np.vectorize() converts a Python function into one that can be applied element-wise to NumPy arrays.
def square(x):
    return x * x
vec_square = np.vectorize(square)
arr = np.array([1, 2, 3])
print(vec_square(arr)) # Output: [1 4 9]
- C_CONTIGUOUS (C-order) stores elements row-wise.
- F_CONTIGUOUS (Fortran-order) stores elements column-wise.
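The layout of an array can be inspected through its `flags` attribute. A minimal sketch, with `c_arr` and `f_arr` as illustrative names:

```python
import numpy as np

c_arr = np.array([[1, 2, 3], [4, 5, 6]])   # C-order by default
f_arr = np.asfortranarray(c_arr)           # same values, column-major layout

print(c_arr.flags["C_CONTIGUOUS"], c_arr.flags["F_CONTIGUOUS"])  # True False
print(f_arr.flags["C_CONTIGUOUS"], f_arr.flags["F_CONTIGUOUS"])  # False True

# Strides give the byte step along each axis; they differ between layouts.
print(c_arr.strides, f_arr.strides)
```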
np.random.normal(loc=0, scale=1, size=(3,3))
arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr)
np.random.seed() sets the seed of NumPy's global random number generator so that random results are reproducible.
np.random.seed(42)
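Newer NumPy code often uses a local generator from `np.random.default_rng()` (available since NumPy 1.17) instead of seeding the global state. A sketch of the same reproducibility idea, with illustrative variable names:

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.integers(0, 10, size=5)
sample_b = rng_b.integers(0, 10, size=5)
print(np.array_equal(sample_a, sample_b))  # True
```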
arr = np.array([1, 2, 3, 4])
print(np.median(arr)) # Output: 2.5
print(np.var(arr)) # Output: 1.25
print(np.std(arr)) # Output: 1.118
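By default, np.var() and np.std() divide by N (population statistics). Passing `ddof=1` divides by N - 1 instead, which gives the sample variance. A short sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
pop_var = np.var(arr)              # divides by N
sample_var = np.var(arr, ddof=1)   # divides by N - 1
print(pop_var)      # 1.25
print(sample_var)   # about 1.667
```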
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.hstack((a, b))) # Output: [1 2 3 4 5 6]
print(np.vstack((a, b))) # Output: [[1 2 3] [4 5 6]]
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(np.array_equal(a, b)) # Output: True
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.diag(arr)) # Output: [1 5 9]
- np.triu() extracts the upper triangular part.
- np.tril() extracts the lower triangular part.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.triu(arr))
# Output:
# [[1 2 3]
# [0 5 6]
# [0 0 9]]
print(np.tril(arr))
# Output:
# [[1 0 0]
# [4 5 0]
# [7 8 9]]
np.linalg.det() computes the determinant.
arr = np.array([[1, 2], [3, 4]])
print(np.linalg.det(arr)) # Output: -2.0000000000000004 (approximately -2.0, due to floating-point rounding)
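np.linalg.eig() computes both eigenvalues and eigenvectors. The eigenvalues are not returned in any guaranteed order, and each eigenvector is a unit-length column of the second array.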
arr = np.array([[4, -2], [1, 1]])
eigenvalues, eigenvectors = np.linalg.eig(arr)
print(eigenvalues)
# Output: [3. 2.]
print(eigenvectors)
# Output:
# [[ 0.89442719 0.70710678]
# [ 0.4472136 0.70710678]]
np.linalg.inv() calculates the inverse of a square, non-singular matrix.
arr = np.array([[4, 7], [2, 6]])
inv_matrix = np.linalg.inv(arr)
print(inv_matrix)
np.linalg.pinv() computes the Moore-Penrose pseudoinverse, which also exists for non-square matrices.
arr = np.array([[1, 2], [3, 4], [5, 6]])
pseudo_inv = np.linalg.pinv(arr)
print(pseudo_inv)
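For a matrix with full column rank, such as the 3x2 example here, multiplying the pseudoinverse by the matrix should give the identity up to rounding. A sketch of that check, using the illustrative name `A_pinv`:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)
A_pinv = np.linalg.pinv(A)

print(A_pinv.shape)                          # (2, 3)
# Left-inverse property: holds because A has full column rank.
print(np.allclose(A_pinv @ A, np.eye(2)))    # True
```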
np.linalg.solve() solves the linear system Ax = B.
A = np.array([[3, 1], [1, 2]])
B = np.array([9, 8])
solution = np.linalg.solve(A, B)
print(solution) # Output: [2. 3.]
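A simple sanity check is to substitute the solution back into the system. The sketch below repeats the same A and B and compares with `np.allclose()` to allow for rounding.

```python
import numpy as np

A = np.array([[3, 1], [1, 2]])
B = np.array([9, 8])
x = np.linalg.solve(A, B)

# Substituting the solution back should reproduce B up to rounding error.
print(np.allclose(A @ x, B))  # True
```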
- np.meshgrid() creates a coordinate grid from two 1D arrays.
x = np.array([1, 2, 3])
y = np.array([4, 5])
X, Y = np.meshgrid(x, y)
print(X)
# Output:
# [[1 2 3]
# [1 2 3]]
print(Y)
# Output:
# [[4 4 4]
# [5 5 5]]
- Broadcasting allows operations between arrays of different shapes.
arr1 = np.array([[1], [2], [3]])
arr2 = np.array([4, 5, 6])
print(arr1 + arr2)
# Output:
# [[ 5 6 7]
# [ 6 7 8]
# [ 7 8 9]]
Use .nbytes to check memory usage in bytes.
arr = np.ones((1000, 1000))
print(arr.nbytes) # Output: 8000000 (8 MB)
- np.frombuffer() creates an array from a buffer without copying data.
import array
buffer = array.array('i', [1, 2, 3, 4])
arr = np.frombuffer(buffer, dtype=np.intc) # dtype must match the buffer's item type (C int for typecode 'i')
print(arr) # Output: [1 2 3 4]
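Because no data is copied, the array and the buffer share memory. The sketch below assumes `np.intc` as the dtype, since it corresponds to the C `int` used by typecode `'i'`.

```python
import array
import numpy as np

buffer = array.array('i', [1, 2, 3, 4])
arr = np.frombuffer(buffer, dtype=np.intc)

# Changing the underlying buffer is visible through the array.
buffer[0] = 99
print(arr[0])  # 99
```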
- np.fromiter() creates a 1D array from an iterable.
iterable = (x*x for x in range(5))
arr = np.fromiter(iterable, dtype=int)
print(arr) # Output: [ 0 1 4 9 16]
arr = np.array([1, 2, 3])
print(arr.tolist()) # Output: [1, 2, 3]
print(arr.tobytes()) # Convert to byte stream
- Use np.split() and np.concatenate().
arr = np.array([1, 2, 3, 4, 5, 6])
split_arrays = np.split(arr, 3)
print(split_arrays) # Output: [array([1, 2]), array([3, 4]), array([5, 6])]
joined = np.concatenate(split_arrays)
print(joined) # Output: [1 2 3 4 5 6]
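np.split() raises an error when the array cannot be divided into equal parts. In that case `np.array_split()` can be used instead, as in this sketch with an illustrative 7-element array:

```python
import numpy as np

arr = np.arange(7)   # 7 elements cannot be split evenly into 3 parts

# array_split allows unequal parts; earlier parts receive the extra elements.
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4], [5, 6]]
```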
Use np.random.randint() for integers or np.random.uniform() for floating-point numbers within a range.
arr = np.random.randint(10, 50, size=(3, 3)) # 3x3 matrix of integers from 10 to 49 (upper bound is exclusive)
print(arr)
- Use np.random.seed() to ensure reproducibility.
np.random.seed(42)
arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr)
print(arr) # Output will be the same every time due to the seed.
- np.cumsum() computes the cumulative sum of elements.
- np.cumprod() computes the cumulative product.
arr = np.array([1, 2, 3, 4])
print(np.cumsum(arr)) # Output: [ 1 3 6 10]
print(np.cumprod(arr)) # Output: [ 1 2 6 24]
- Use np.unique() and compare its length with the original array.
arr = np.array([1, 2, 3, 4, 5, 2])
print(len(np.unique(arr)) == len(arr)) # Output: False
- Use boolean indexing.
arr = np.array([-3, 5, -1, 7, -9])
arr[arr < 0] = 0
print(arr) # Output: [0 5 0 7 0]
- Use np.bincount() (for small non-negative integers) or np.unique() with counts.
arr = np.array([1, 2, 3, 4, 2, 2, 3, 3, 3])
most_frequent = np.bincount(arr).argmax()
print(most_frequent) # Output: 3
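For values that np.bincount() does not accept, such as floats or negative numbers, the np.unique() route mentioned above might look like this. The sample data is made up for illustration.

```python
import numpy as np

arr = np.array([-1.5, 2.0, 2.0, -1.5, 2.0])   # not valid input for bincount

# return_counts=True yields the sorted unique values and their frequencies.
values, counts = np.unique(arr, return_counts=True)
most_frequent = values[counts.argmax()]
print(most_frequent)  # 2.0
```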
np.mean() computes the average, while np.median() finds the middle value.
arr = np.array([1, 2, 3, 4, 100])
print(np.mean(arr)) # Output: 22.0
print(np.median(arr)) # Output: 3.0 (not affected by outliers)
- Use np.power().
arr = np.array([1, 2, 3, 4])
print(np.power(arr, 3)) # Output: [ 1 8 27 64]
- Use np.diag().
arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(np.diag(arr)) # Output: [10 50 90]
- Use np.vectorize().
def square(x):
    return x ** 2
arr = np.array([1, 2, 3, 4])
vectorized_func = np.vectorize(square)
print(vectorized_func(arr)) # Output: [ 1 4 9 16]
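np.vectorize() is provided mainly for convenience; internally it still loops over the elements in Python. When the function can be written with array operations, applying them directly usually performs better. A sketch comparing the two, with illustrative variable names:

```python
import numpy as np

def square(x):
    return x ** 2

arr = np.arange(5)
via_vectorize = np.vectorize(square)(arr)   # loops over elements in Python
via_array_op = arr ** 2                     # one call into compiled code
print(np.array_equal(via_vectorize, via_array_op))  # True
```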