Home/Blog/Interview Prep/Top 20 database interview questions and answers
Home/Blog/Interview Prep/Top 20 database interview questions and answers

Top 20 database interview questions and answers

12 min read
Jul 04, 2024

Are you gearing up for a database interview? Databases are important components of modern information systems, serving as efficient organizers and managers of data. Understanding databases is compulsory for various job positions, including data analysts, database administrators, software engineers, and more. To help you get ready, we’ve collected a list of the top 20 frequently asked database interview questions, along with their answers, to ensure you are well prepared. Explore the key concepts to boost your confidence and stand out in your next interview.

Database Interview Questions #

1. What is DBMS and its types?#

A database management system (DBMS) is software that serves as a bridge between the data and the software application. It is a collection of processes that collectively store, manage, and retrieve information for the user. DBMS makes system calls to work with the data stored in memory or disk. It also interacts with the OS for other system calls, for instance, network requests and responses.

Database management system (DBMS)
Database management system (DBMS)

Types of DBMS

  • Relational DBMS: Data is organized into tables with rows and columns. This uses SQL (Structured Query Language) for querying and managing data.

  • Object-oriented DBMS: Data is represented as objects, similar to object-oriented programming concepts. It is useful for managing complex data structures.

  • NoSQL DBMS: This stands for “not only SQL.” It is designed for large-scale distributed data storage and retrieval and is often used for big data and real-time web applications.

  • Hierarchical DBMS: Data is organized like a family tree. Each piece of information is connected to another, like branches on a tree.

  • Network DBMS: It is similar to hierarchical DBMS, but a complex relationship in which records can have multiple parent and child records.

2. What is RDBMS?#

A relational database management system (RDBMS) is a type of software that helps organize and manage data in a structured way. It manages application data stored on disk in a way that creates an abstraction of tables. Each table has rows (representing individual records) and columns (representing attributes or fields of the records). The arrows represent the relationship between the tables. This structure establishes the relationships between tables, enabling efficient data retrieval and manipulation. RDBMS uses SQL to query and manipulate data.

Relational database management system
Relational database management system

3. What are the different languages present in DBMS?#

In a DBMS, we use the following languages to manage the databases:

  • Data Definition Language (DDL) defines the structure and schema of the database.

  • Data Manipulation Language (DML) manipulates the data stored in the database.

  • Data Control Language (DCL) manages user permissions and controls in the database system.

  • Transaction Control Language (TCL) manages transactions in the database.

Commands present in different languages
Commands present in different languages

4. What is the difference between primary and unique keys?#

Here are the differences between a primary key and a unique key:

Primary key:

  • It uniquely identifies each record in a table.

  • It cannot contain NULL values.

  • Each table can have only one primary key.

Unique key:

  • It ensures that all values in a column are unique.

  • It allows NULL values.

  • Each table can have multiple unique keys.

Primary key and unique key
Primary key and unique key

As you can see in the table on the right, the ID column uniquely identifies each record. That’s why we label it as the primary key. The table that the Phone column has a blank entry in the last row. That’s why we label it as a unique key.

5. What is the difference between normalization and denormalization?#

Normalization organizes a database into tables and columns. It divides large tables into smaller tables and defines relationships between them. By organizing the fields and tables of a database, the goal is to minimize data redundancy and dependency.

Denormalization is used to add redundant data to a normalized database. It combines tables and columns to reduce the number of joins needed to retrieve data. The goal is to improve read performance, but it can also lead to data redundancy and increase the database’s complexity.

Normalized databases require data to be cross-referenced across multiple tables, which can sometimes suffer read performance. This is because retrieving information may involve joining multiple tables, which can impact the speed of data retrieval.

Let’s look at the example of normalization and denormalization:

Consider an example of a table, Students, that stores information about students in a school. The table will look like this:

Data with redundancy
Data with redundancy

In the table above:

  • StudentID uniquely identifies each student and serves as the primary key.

  • Name, Age, and Department_name columns contain information about each student.

Let’s identify redundancy. Note that both Alice and Charlie are in the engineering department. Similarly, both Bob and David are in the accounting department. This repetition of information leads to redundancy.

To address this, we can normalize the table by creating separate tables for distinct entities like Departments and linking them to the Students table using foreign keys. This approach reduces redundancy and improves data integrity.

Normalized data
Normalized data

In this normalized structure:

  • The Students table links to the Departments table via the DID column.

  • The Departments table contains unique departments.

  • The Students table references the appropriate department for each student.

This structure reduces redundancy and ensures data consistency by avoiding repeating groups and storing related information separately.

6. What are the types of normalization?#

There are different forms of database normalization, each building on the previous one to improve data integrity. 

The most common normal forms are:

  • First normal form (1NF): Each column in a table contains atomic (indivisible) values, and each column has a unique name.

  • Second normal form (2NF): The table is in 1NF, and all nonkey attributes are fully functional dependentFunctional dependency in databases means that the value of one attribute uniquely determines the value of another attribute in the same table. on the primary key.

  • Third normal form (3NF): The table is in 2NF, with no transitive dependenciesThis means that nonkey attributes are not dependent on other nonkey attributes..

  • Boyce-Codd normal form (BCNF): A stricter form of 3NF, where every determinantA determinant is an attribute or set of attributes that uniquely determines the value of other attributes in the same table. is a candidate key.

Types of normalization
Types of normalization

7. What are the differences between DDL, DML, and DCL? #

We can think of the differences between DDL, DML, and DCL in two different dimensions:

  1. Effect on the data:

  • DDL commands affect the structure of the database schema.

  • DML commands affect the actual data stored in the database tables.

  • DCL commands affect the access and permissions of users and roles in the database.

  1. Transaction management:

  • DDL and DCL commands are typically autocommitted, which means they are immediately applied and cannot be rolled back.

  • DML commands are transactions that can be committed or rolled back as a single unit.

8. What are the advantages of DBMS?#

The advantages of DBMS are as follows:

  • Data integrity: It ensures that data is accurate and consistent.

  • Data security: It provides mechanisms to protect data from unauthorized access.

  • Data independence: It allows changes in the database structure without affecting the applications.

  • Data backup and recovery: It offers features to backup and restore data in case of failures.

  • Efficient data access: It utilizes indexing and caching to retrieve data fast.

  • Concurrent access control: It manages simultaneous access to data by multiple users.

9. What is the difference between HAVING and WHERE clause? #

The WHERE clause filters data from individual rows in a table based on specified conditions, while the HAVING clause filters data from groups of rows in a query based on conditions involving aggregate values.

The WHERE clause#

  • The WHERE clause is used to filter rows before any groupings are made.

  • It is used with the SELECT, UPDATE, and DELETE statements to specify a condition.

SELECT * FROM employees WHERE department = 'Sales';

The HAVING clause#

  • The HAVING clause filters rows after the grouping has been done.

  • It is used with the GROUP BY clause in a SELECT statement to specify a condition.

SELECT department, COUNT(*) FROM employees GROUP BY department HAVING COUNT(*) > 5;

10. What is the subquery?#

A subquery, a nested query or inner query, is nested inside another query. For example, we have two tables, students and grades, and we want to find the names of students who scored above average. We will use a subquery to do this. Here’s how we can write the subquery:

SELECT name FROM students WHERE student_id IN (
SELECT student_id FROM grades WHERE grade > (SELECT AVG(grade) FROM grades));

The subquery (SELECT AVG(grade) FROM grades) calculates the average grade, and the outer query uses this result to find the names of students who scored above average.

11. What are aggregate functions?#

Aggregate functions are used to calculate values and return a single result. These functions include finding the number of items (COUNT), sum (SUM), average (AVG), minimum (MIN), or maximum (MAX) value in a column of a table.

Function Name

Meaning

Example

SUM()

The sum of the values in a column

SUM(Coumn

_Name)

AVG()

The average of the value in a column

AVG(Coumn

_Name)


MAX()

The largest values in a column

MAX(Coumn

_Name)


MIN()

The smallestt values in a column

MIN(Coumn

_Name)


COUNT()

Number of values in a column

COUNT(*)


12. What is a join and its types? #

A join is like combining two tables in a database to get information from both.

Here are the different types of join:

Types of join
Types of join

Let’s look at an example of students and courses.

Join example of students and courses
Join example of students and courses
  • Inner join combines rows from both tables where the specified condition is met.

SELECT students.Name, courses.CourseName
FROM students
INNER JOIN courses ON students.StudentID = courses.StudentID;
  • Left join returns all rows from the left table (students) and the matched rows from the right table (courses).

SELECT students.Name, courses.CourseName
FROM students
LEFT JOIN courses ON students.StudentID = courses.StudentID;
  • Right join returns all rows from the right table (courses) and the matched rows from the left table (students).

SELECT students.Name, courses.CourseName
FROM students
RIGHT JOIN courses ON students.StudentID = courses.StudentID;
  • Outer join combines the results of both left and right outer joins.

SELECT s.Name, c.CourseName
FROM students s
FULL JOIN courses c ON s.StudentID = c.StudentID;

13. What are the uses of view? #

We use view for the following:

  • Views are virtual tables that have rows and columns like tables.

  • Tables are like storage containers where data is kept on disk. Views, on the other hand, don’t store data directly. They’re like virtual windows that show data from tables based on a query. Views can also do more than just display data. They can group data, do aggregation, or set up rules for who can see what data.

  • Tables allow for data modification, including insert, update, and delete operations. However, we can’t directly modify data through a view if it involves multiple tables.

  • Views can be used to precompute and store the results of complex queries, improving performance by reducing the need to recompute the results each time the query is run.

14. What is a trigger? #

A trigger is a special kind of stored procedure in a database that automatically triggers or executes in response to certain events, such as inserting, updating, or deleting data in a table. It’s like a set of instructions that the database follows when specific actions occur, helping to maintain data integrity and automate tasks.

15. What is a stored procedure? #

A stored procedure is a set of SQL statements stored in the database that can be called by name to perform a specific task or operation. It helps reuse code, improve performance, and maintain consistency in database operations.

16. What is a transaction? #

A transaction is a unit of work performed on a database that involves one or more operations, such as reading, writing, or updating data. Transactions have ACID properties (atomicity, consistency, isolation, durability).

17. What are ACID properties? #

The ACIS properties are listed below:

  • Atomicity: Ensures that all operations within a transaction are completed successfully or none of them are.

  • Consistency: Ensures that a transaction brings the database from one valid state to another. It maintains the integrity constraints and rules defined for the database.

  • Isolation: Ensures that the execution of multiple transactions concurrently does not lead to data inconsistencies.

  • Durability: Ensures that once a transaction is committed, the changes made by the transaction persist, even in the event of a system failure.

18. What are clustered and nonclustered indexes? #

The difference between clustered and nonclustered indexes are as follows:

  • Clustered index: In a clustered index, the rows of the table are stored in the order of the index key. There can only be one clustered index per table, as the rows are physically sorted based on the index key.

  • Nonclustered index: In a nonclustered index, the index key entries point to the actual rows in the table. This means that the rows are not stored in the order of the index key, and there can be multiple nonclustered indexes per table. It allows for fast retrieval of specific rows based on criteria not covered by the clustered index, improving query performance.

Clustered and nonclustered indexes
Clustered and nonclustered indexes

19. What are the differences between the DROP, TRUNCATE, and DELETE commands?#

The difference between DROP, TRUNCATE, and DELETE are as follows:

  • DROP removes an entire table, including its structure, data, indexes, and triggers. It essentially deletes the table from the database.

  • TRUNCATE removes all table rows but retains the table structure, indexes, and triggers. A DELETE statement can achieve the same result, but a TRUNCATE is faster because it does not log individual row deletions.

  • DELETE removes one or more rows from a table based on a condition. It removes specific rows while leaving the table structure, indexes, and triggers intact. Unlike TRUNCATE, DELETE is logged and can be rolled back using a transaction.

DROP, TRUNCATE, and DELETE commands example
DROP, TRUNCATE, and DELETE commands example

20. What are the different levels of abstraction in the DBMS?#

The different abstraction levels are listed below:

  1. Physical level: This is the lowest level of abstraction and deals with how data is stored on the storage medium, such as disks. It includes details like data structures, file organization, and access methods.

  2. Logical level: This level describes what data is stored in the database and the relationships among the data. It hides the complexities of the physical storage and focuses on the logical structure of the database, including tables, views, and constraints.

  3. View level: This is the highest level of abstraction and represents how users perceive the data. It defines different database views for different users or applications, providing a customized and simplified interface to the underlying database.

Wrap up#

This blog has explored the key topics commonly asked in database interview questions. Specifically, we covered DBMS types, normalization, keys, relational databases, and joins.

If you want to further enhance your skills and tackle more interview questions, consider enrolling in our courses below. It’s designed to help you grasp the concepts guiding you from basic database principles to more advanced topics.


Written By:
Muhammad Usama
 
Join 2.5 million developers at
Explore the catalog

Free Resources