Indexing and Performance Optimization in SQL - Interview Questions and Answers

An index is a database structure that improves query performance by enabling faster data retrieval.

Example: Creating an index on Aadhaar_Number for fast lookups

CREATE INDEX idx_aadhaar ON citizens(aadhaar_number);

Clustered Index: Data is physically sorted based on the index. (Each table can have only one clustered index.)
Non-clustered Index: Stores pointers to the data, allowing multiple indexes.
Unique Index: Ensures no duplicate values in a column.
Full-text Index: Used for searching text-based data efficiently.

Instead of scanning the entire table, an index lets the DBMS find rows faster, reducing I/O operations.

Example: Searching for an PAN number:

SELECT * FROM tax_payers WHERE pan_number = 'ABCDE1234F';

With an index on pan_number, lookup is instant, instead of scanning millions of records.

A clustered index determines the physical order of rows in a table.
Example: An Indian Railways Reservation System, where PNR_Number is a clustered index:

CREATE CLUSTERED INDEX idx_pnr ON railway_tickets(pnr_number);

A non-clustered index keeps a separate structure from the data and stores pointers

Example: A bank database where account numbers need quick search

CREATE NONCLUSTERED INDEX idx_acc ON bank_accounts(account_number);

Use a clustered index for columns frequently used in sorting/searching (e.g., Aadhaar numbers, PNR numbers).
Use a non-clustered index when a column is used only for lookups but not sorting (e.g., bank account balance).

Only one, since it defines the physical order of data.

A table can have multiple non-clustered indexes.

CREATE INDEX idx_product_name ON products(product_name);
CREATE INDEX idx_category ON products(category);

A covering index contains all the columns needed for a query, avoiding additional table lookups.

Example:

CREATE INDEX idx_cover ON transactions(customer_id, transaction_amount, transaction_date);

An index on multiple columns to optimize specific queries.

Example:

CREATE INDEX idx_order ON orders(customer_id, order_date);

Slows down INSERTs/UPDATEs as indexes need updating.
Consumes more storage.

Indexes speed up SELECT queries but slow down INSERT, UPDATE, and DELETE because every index must also be updated

Example: In a bank_accounts table, if we update balance, a non-clustered index on balance must also be updated:

UPDATE bank_accounts SET balance = balance + 500 WHERE account_number = '1234567890';

Too many indexes = performance degradation for updates.

Implicit Index: Created automatically (e.g., Primary Key and Unique constraints).
Explicit Index: Created manually using CREATE INDEX.

Example (Implicit Index):

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,  -- Automatically indexed
    emp_name VARCHAR(50)
);

Example (Explicit Index)

CREATE INDEX idx_emp_name ON employees(emp_name);

Speeds up lookups by reducing the number of rows scanned.

Example: Searching for an PAN number in an income tax database:

SELECT * FROM tax_payers WHERE pan_number = 'ABCDE1234F';

With an index on pan_number, lookup is instant, avoiding a full table scan.

On small tables (as full table scans may be faster).
On highly updated columns (as updates slow down).
On columns with many duplicate values (like gender or status).

Indexing computed values to optimize function-based queries.

Example: Indexing on UPPERCASE email in an banking system

SELECT * FROM customers WHERE UPPER(email) = 'RAHUL@GMAIL.COM';

This optimizes case-insensitive searches

SELECT * FROM customers WHERE UPPER(email) = 'RAHUL@GMAIL.COM';

Example: Dropping an index on pan_number:

DROP INDEX idx_pan ON tax_payers;

MySQL

SHOW INDEX FROM bank_accounts;

PostgreSQL

SELECT indexname FROM pg_indexes WHERE tablename = 'bank_accounts';

Used for fast searching in text-heavy columns.

Example: Searching in a large database of Indian legal case records

CREATE FULLTEXT INDEX idx_case_desc ON court_cases(case_description);

Query optimized for full-text search

SELECT * FROM court_cases WHERE MATCH(case_description) AGAINST ('property dispute');

Indexing only a subset of rows for performance optimization.

Example: Indexing only active users in an e-commerce system:

CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';

This makes searching faster for active users while ignoring inactive ones.

MySQL

SHOW INDEX FROM bank_accounts;

PostgreSQL

SELECT * FROM pg_stat_user_indexes WHERE relname = 'transactions';

Index fragmentation happens when indexes become inefficient due to frequent updates/deletes.

Solution:

Rebuild the index in SQL Server

ALTER INDEX idx_acc_number ON bank_accounts REBUILD;

Optimize in MySQL

OPTIMIZE TABLE bank_accounts;

Reverses the stored value to improve lookup speeds for sequential values (e.g., Aadhaar numbers).

Example:

CREATE INDEX idx_reverse_phone ON users(REVERSE(phone_number));

Yes, but it depends on the database system.

PostgreSQL example (Indexing a JSON field in an Indian Aadhaar verification database):

CREATE INDEX idx_json ON citizens USING gin(aadhaar_details);

This speeds up queries on JSON fields.

It shows how a query is executed, helping identify bottlenecks.

Example:

EXPLAIN SELECT * FROM bank_transactions WHERE transaction_id = 'TXN12345';

Look for Full Table Scans - Use indexes.
Identify high-cost joins - Optimize with proper keys.

WHERE filters before aggregation, while HAVING filters after aggregation.

Example:

SELECT state, COUNT(*) FROM citizens GROUP BY state HAVING COUNT(*) > 100000;

Index the column used in ORDER BY.

Example:

CREATE INDEX idx_salary ON employees(salary);
SELECT * FROM employees ORDER BY salary DESC;

Use indexes on JOIN keys.

Example (E-commerce database):

CREATE INDEX idx_customer ON orders(customer_id);
SELECT o.order_id, c.customer_name 
FROM orders o 
JOIN customers c ON o.customer_id = c.customer_id;

CREATE INDEX idx_upi ON transactions(upi_id, transaction_time);

CREATE INDEX idx_aadhaar ON citizens(aadhaar_number);

Partitioning divides large tables into smaller pieces for better performance.

Example: Partitioning UPI transactions by month:

CREATE TABLE upi_transactions (
    transaction_id INT,
    upi_id VARCHAR(20),
    transaction_date DATE
) PARTITION BY RANGE (YEAR(transaction_date));

Normalization reduces redundancy but can slow JOIN-heavy queries.

Stores query results to speed up repeated queries.
Example: Frequently used UPI report queries should be cached.

It fetches unnecessary columns, increasing execution time.

Indexed columns improve GROUP BY performance.

SELECT * FROM orders FORCE INDEX(idx_customer) WHERE customer_id = 101;

Partitioning divides a large table into smaller manageable parts.
Indexes apply to each partition separately, improving query performance.

Example: Partitioning a sales table by year in an e-commerce database:

CREATE TABLE sales (
    sale_id INT,
    sale_date DATE,
    amount DECIMAL(10,2),
    region VARCHAR(50)
) PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024)
);

Effect: Queries for 2023 skip irrelevant partitions, improving speed.

Ordering matters in composite indexes.
The most frequently filtered column should come first.

Example:

CREATE INDEX idx_customer_region ON orders(customer_id, region);

If queries mostly filter by customer_id, the index is efficient

SELECT * FROM orders WHERE customer_id = 101;

But if filtering by region, the index may not be used efficiently.

Indexed Views store results physically, reducing computation on large datasets.

Example: Precomputed total sales per city in an Indian retail system

CREATE VIEW sales_summary AS
SELECT city, SUM(amount) AS total_sales
FROM sales GROUP BY city;

CREATE UNIQUE CLUSTERED INDEX idx_sales_summary ON sales_summary(city);

Benefit: Faster aggregated queries without recalculating each time.

Index SEEK: Uses an index to quickly find data.
Index SCAN: Reads the entire index/table.

Example:

SEEK (Efficient)

SELECT * FROM customers WHERE customer_id = 123;

SCAN (Inefficient on large tables)

SELECT * FROM customers WHERE LOWER(name) = 'rahul';

Solution: Create a function-based index

CREATE INDEX idx_lower_name ON customers(LOWER(name));

A covering index includes all columns needed for a query, avoiding extra lookups.

Example: Optimizing customer_orders query in an Indian UPI payments database.

CREATE INDEX idx_orders ON transactions (customer_id, order_date, amount);

Now this query is fully covered

SELECT order_date, amount FROM transactions WHERE customer_id = 567;

No need to access the table, only the index!

Probabilistic data structure used for fast lookups in large datasets.
Reduces disk IO by eliminating unnecessary searches.
Used in distributed SQL engines (e.g., Apache Spark, PostgreSQL).
Example: Fast Aadhaar number lookups in voter databases.

Indexing JSON fields optimizes searching inside JSON data

Example: Indexing customer preferences stored in JSON

CREATE INDEX idx_json ON users USING GIN (preferences);

Query optimized for a specific key

SELECT * FROM users WHERE preferences @> '{"language": "Hindi"}';

Use columnstore indexes for analytical workloads.

Example: Optimizing Indian stock market transactions:

CREATE COLUMNSTORE INDEX idx_trades ON stock_trades (trade_date, stock_id, volume);

Benefit: Compresses data, speeds up aggregation queries.

Use pg_stat_user_indexes to check index usage:

SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch 
FROM pg_stat_user_indexes WHERE relname = 'transactions';

High idx_scan count = Index is effective.

Horizontal Partitioning: Splitting rows (e.g., by state in India)

CREATE TABLE customers_delhi AS SELECT * FROM customers WHERE state = 'Delhi';

Vertical Partitioning: Splitting columns (e.g., separating frequently used columns)

CREATE TABLE customer_contact AS SELECT customer_id, phone, email FROM customers;

Solution 1: Add indexes for frequent filters.
Solution 2: Use LIMIT for large datasets.

Example: Query optimized for a large tax_payers table in India

SELECT * FROM tax_payers WHERE state = 'Maharashtra' LIMIT 100;

MySQL Query Cache stores the result of SELECT queries.
Redis can cache database results for repeated queries.

Example: Caching an Indian movie ticket booking report

SET SESSION query_cache_type = ON;
SELECT * FROM bookings WHERE city = 'Mumbai';

Next time, it retrieves data instantly from cache.

Sharding splits data across multiple databases for scalability.

Example: Indian railway booking system.

CREATE TABLE bookings_shard_1 PARTITION OF bookings FOR VALUES FROM ('1') TO ('500000');
CREATE TABLE bookings_shard_2 PARTITION OF bookings FOR VALUES FROM ('500001') TO ('1000000');

Indexing per shard speeds up queries on large distributed datasets.

Share Share Add to Bookmark

File Handling in Python

55 Questions

SQL Interview Questions for Freshers

101 Questions

Software Testing

22 Questions

Matplotlib/Seaborn

60 Questions

Pandas

70 Questions

WordPress

20 Questions

Functions in Python

50 Questions

T-SQL

50 Questions

Flask

55 Questions

PostgreSQL

21 Questions

Unit Testing in Python

61 Questions

Inheritance in Python

50 Questions

JavaScript Interview Questions for 0–1 Year Experience

100 Questions

Tuples in Python

30 Questions

HTML5

24 Questions

Lists in Python

50 Questions

← Previous Interview

SQL Global Variables

Next Interview →

Indexing and Performance Optimization in SQL - Interview Questions and Answers

File Handling in Python

SQL Interview Questions for Freshers

Software Testing

Matplotlib/Seaborn

Pandas

WordPress

Functions in Python

T-SQL

Flask

PostgreSQL

Unit Testing in Python

Inheritance in Python

JavaScript Interview Questions for 0–1 Year Experience

Tuples in Python

HTML5

Lists in Python

SQL Global Variables

Error Handling and Exceptions in Python

Tutorials

Random Blogs

Categories

Indexing and Performance Optimization in SQL - Interview Questions and Answers

1. What is an index in SQL?

2. What are the types of indexes in SQL?

3. How does an index improve query performance?

4. What is a clustered index? Give an example.

5. What is a non-clustered index?

6. When should you use a clustered vs. non-clustered index?

7. How many clustered indexes can a table have?

8. How many non-clustered indexes can a table have?

9. How would you index a large Indian e-commerce product table for fast search?

10. What is a covering index?

11. What is a composite index?

12. Why should you be careful with too many indexes?

13. What is the impact of indexing on INSERT, UPDATE, and DELETE operations?

14. What is an implicit vs. explicit index?

15. How does indexing impact SELECT queries?

16. When should you not create an index?

17. What is a function-based index?

18. How do you drop an index?

19. How do you check if an index exists in a table?

20. What is a full-text index?

21. What is a partial index?

22. How do you monitor index usage?

23. What is an index fragmentation issue, and how do you fix it?

24. What is a reverse index, and when is it useful?

25. Can an index be created on a JSON column?

26. What is an SQL execution plan?

27. How do you optimize queries using EXPLAIN?

28. What is the difference between WHERE and HAVING?

29. How do you improve performance in an ORDER BY query?

30. How do you optimize JOIN queries?

31. How would you optimize a UPI transactions table?

32. How would you improve Aadhaar-based searches?

33. What are partitioned tables, and when should you use them?

34. What is database normalization, and how does it impact performance?

35. What is query caching, and when should you use it?

36. Why should you avoid SELECT * in queries?

37. How does indexing affect GROUP BY queries?

38. When should you use an index hint?

39. What is the impact of table partitioning on indexing?

40. How does column order in an index affect performance?

41. How do indexed views improve performance?

42. What is the difference between SEEK and SCAN in query execution?

43. How does a covering index improve performance?

44. What is a bloom filter in indexing?

45. How do we use indexing with JSON or NoSQL databases?

46. How do you optimize queries in large fact tables (OLAP systems)?

47. How can you measure index efficiency in PostgreSQL?

48. What is the difference between horizontal and vertical partitioning?

49. How do you optimize full-table scans?

50. How can SQL query caching improve performance?

51. How does sharding impact index performance?

File Handling in Python

SQL Interview Questions for Freshers

Software Testing

Matplotlib/Seaborn

Pandas

WordPress

Functions in Python

T-SQL

Flask

PostgreSQL

Unit Testing in Python

Inheritance in Python

JavaScript Interview Questions for 0–1 Year Experience

Tuples in Python

HTML5

Lists in Python

SQL Global Variables

Error Handling and Exceptions in Python

Tutorials

Random Blogs

Categories