Subqueries in SQL
Video Tutorial
Overview
A subquery is a query that is used within another SQL query embedded in the WHERE clause. Subqueries are majorly used in SELECT, INSERT, UPDATE, DELETE, FROM and WHERE clauses. The subqueries are used wherever an expression is allowed. The data returned by the subquery is used by the outer statement in the same way a literal value would be used. Subqueries in SQL provide an easy and efficient way to handle the queries which depend on the results from another query. There are some rules which are to be followed while applying subqueries in SQL which are explained in this article.
Scope of the Article
- This article covers introduction to subqueries in SQL, rules to be followed while using a subquery.
- How to use subqueries with the SELECT and INSERT statement is explained with the examples.
- The usage of subqueries with the UPDATE and DELETE statement is also covered with the examples.
- Using subqueries with FROM and WHERE clause is covered with examples.
- How to use subqueries with the IN, NOT IN, EXISTS, NOT EXISTS is covered with examples.
- There is a brief introduction to nested and correlated subqueries.
Introduction to Subqueries in SQL
Subqueries also called a nested query or inner query is a query which is present inside another SQL query. In SQL, it is possible to place a SQL query inside another query. For example,
Let's take a look at subquery and outer query of the above given example.
In a subquery, the outer query's result is dependent on the result-set of the inner query. This is the reason why subqueries are also called nested queries. Let's take an example to understand the working of the subqueries.
Consider a table of Students on which the below given query is applied.
Students:
student_id | name | marks | country |
---|---|---|---|
41 | Richa Shah | 26 | India |
42 | Malina Joseph | 25 | England |
43 | Kavish Shah | 23 | Canada |
44 | Heta Ramya | 26 | UK |
45 | Kevin Peterson | 23 | USA |
The above query is applied to the given student's table. Here, the subquery is executed first which selects minimum marks from the student's table. Then this result is passed on to the WHERE clause of the outer query. Then the outer query is executed which selects rows where marks is equal to the result of the subquery and returns the final result.
The output of the above SQL query is given below:
Before getting into the examples of using subqueries with different clauses and keywords, let's first understand why to use subqueries?
Subqueries are used to execute a query dependent on the outcome of another query. Subqueries allow you to accomplish the same task without writing two distinct queries. Subqueries are majorly used with INSERT, DELETE, SELECT, UPDATE statements along with the comparison operators like <, >, >=, <=, =, BETWEEN, IN, etc.
Subquery Rules
There are various rules that a subquery must follow, like:
- Subqueries must be enclosed within parenthesis.
For example:
In the above example, the query which is enclosed in the parenthesis is called a subquery.
- Subqueries always runs first followed by the main query.
- The subqueries in MySQL cannot use the ORDER BY keyword whereas the main query can use the ORDER BY keyword whenever required. You can use the GROUP BY command to perform the same function as the ORDER BY in the subquery.
- You cannot use BETWEEN operator within a SQL subquery. However, it can be used with the main query.
- It is not possible for subqueries to be enclosed in a set of functions.
- Subqueries that are returning multiple rows can only be used together with the numerous value operators. Such operators include the IN operator.
- The SELECT list must not include any references evaluating to a CLOB, NCLOB, BLOB, or ARRAY.
- If there is a subquery that returns multiple rows together then it must be used with the multiple value operators only, like an IN operator.
To understand the subqueries with the different keywords in SQL, the following table is used as a reference throughout the article.
Employee:
emp_id | emp_name | emp_age | emp_dept | emp_salary | emp_country |
---|---|---|---|---|---|
121 | Lisa Carol | 27 | Finance | 30,000 | USA |
122 | Farooq Shaikh | 29 | HR | 20,000 | USA |
123 | Nastya Henry | 28 | Technology | 45,000 | UK |
124 | Christine Maybach | 30 | Research | 50,000 | USA |
125 | Ryan Renolds | 27 | Research | 50,000 | Canada |
Subqueries with SELECT Statement
In most cases, subqueries are used with the SELECT statements.
Syntax:
To understand more clearly let's take the above mentioned Employee table and run the below mentioned query and observe the output.
Example:
Do not worry about the working of the above query, you'll get to know in the next section but let's first see what's the result when the query is run on the Employee table.
The final result of the above query is:
Let's break the query statements to understand it better. In this example, the employee name having the maximum salary is filtered out and is displayed as the result. To achieve this result a subquery is created which finds the maximum salary using the max() function as shown below:
The above query acts as a subquery in our example and returns the value 50,000. Now, to further fetch the employee name having the salary as 50,000 the below query is applied.
Here, the query to find out the maximum salary is the subquery. The outer query takes the result of the inner query and executes the remaining SQL commands based on that result. So, the subquery returns the maximum salary, i.e. 50,000. And this result is passed on to the outer query which fetches the name of the employee having the salary of 50,000.
Subqueries with INSERT statement
After the SELECT statement, the second-best option to use subqueries is with INSERT statements. The selected data in the subquery can be further modified through any characters, number functions, etc.
The INSERT statement uses data returned from the subquery to enter into another table.
Syntax:
To understand more clearly let's take the above mentioned Employee table and also consider that a new employee table i.e. EmployeeNew is available in the database and run the below mentioned query and observe the output.
Example:
The final result of the above query i.e. the EmployeeNew table would look like below:
EmployeeNew:
Here, the EmployeeNew table is empty and the above query copies all the columns from the Employee table to the EmployeeNew table. Let's break the query statements to understand clearly.
The INSERT statement specifies that the new data is added to this table i.e. EmployeeNew table. As we are copying all the data from the Employee table to the EmployeeNew table, there is no need to specify the column name in the INSERT statement otherwise you would need to mention the column name in which you want to add the data.
Consider the EmployeeNew table which is created above and this is also used in the below examples.
EmployeeNew:
emp_id | emp_name | emp_age | emp_dept | emp_salary | emp_country |
---|---|---|---|---|---|
121 | Lisa Carol | 27 | Finance | 30000 | USA |
122 | Farooq Shaikh | 29 | HR | 20000 | USA |
123 | Nastya Henry | 28 | Technology | 45000 | UK |
124 | Christine Maybach | 30 | Research | 50000 | USA |
125 | Ryan Renolds | 27 | Research | 50000 | Canada |
Example:
Let's take another example in which only the employees whose age is less than 29 are inserted into a new table i.e. EmployeeNew table using the subqueries in the INSERT statement.
The final result of the above query would look like the below:
EmployeeNew:
In this example, the subquery filters out the result and the employees whose age is less than 29 are included in the result set of the inner query or subquery. Then, this result is passed on to the outer query, and all the employees' data from the Employee table whose age is less than 29 are added to the EmployeeNew table as shown in the final result above.
Subqueries with UPDATE Statement
The UPDATE statement is used to modify an existing record within the table. To benefit from the usage of subqueries the WHERE clause should be used with the UPDATE statement otherwise the overall functioning of the table may get disturbed. In the UPDATE statement, the two clauses in which subqueries are most commonly used are SET and WHERE.
Syntax:
To understand more clearly let's take the above mentioned Employee table and also consider that a new employee table i.e. EmployeeNew is available in the database and run the below mentioned query and observe the output.
Example:
The final result of the above query would look like the below:
Employee:
In this example, the Employee table is updated using UPDATE and SET. Here, the SET clause defines the new value for the emp_salary column which is being modified by the UPDATE statement. As you can see in the subquery, the employees having age greater than 27 are filtered out and this result set is passed on to the outer query. The outer query updates the emp_salary column while considering the result set of the subquery i.e. salary of all the Employees having an age greater than 27 will increase by two times as mentioned in the outer query.
Subqueries with DELETE Statement
The subqueries can also be used with the DELETE statements to delete one or multiple records from the table.
Syntax:
To understand more clearly let's take the above mentioned Employee table and also consider that a new employee table i.e. EmployeeNew is available in the database that is mentioned above in the article and run the below mentioned query and observe the output.
The final result of the above query i.e. the Employee table would look like the below:
In this example, the DELETE statement is used to delete the rows from the Employee table with the help of EmployeeNew table. Firstly, the subquery is executed and the employees from the EmployeeNew table whose department is "Research" are fetched out and this result set is passed on to the outer query.
The outer query then deletes the records from the Employee table who is in the Research department. Note that not all the records that are present in the EmployeeNew and Employee table are deleted but only those records or employees who are in the Research department are deleted.
Subqueries in FROM clause
FROM is used to specify the source from which the data is to be fetched. Here, subqueries in FROM clause create an intermediate table that can be used directly to retrieve the results for the main SELECT query or joined with the other tables and then used subsequently.
To understand more clearly let's create a new Employee table and the Department table and run the below mentioned query and observe the output.
Department:
dept_id | dept_name | dept_head | head |
---|---|---|---|
0330 | Finance | Mr Patel | 121 |
0335 | HR | Mrs Shah | 123 |
0340 | Technology | Mr Ramchandani | 125 |
0345 | Research | Mr Bajaj | 122 |
Employee:
emp_id | emp_name | dept_id | emp_dept | emp_salary | emp_country |
---|---|---|---|---|---|
121 | Lisa Carol | 0330 | Finance | 30,000 | USA |
122 | Farooq Shaikh | 0335 | HR | 20,000 | USA |
123 | Nastya Henry | 0340 | Technology | 45,000 | UK |
124 | Christine Maybach | 0345 | Research | 50,000 | USA |
125 | Ryan Renolds | 0345 | Research | 50,000 | Canada |
The final result of the above query would look like below:
Subqueries in WHERE Clause
To filter the rows from the result set, subqueries in the WHERE clause can be used by comparing a column in the main table with the results of the subquery.
Syntax:
To understand more clearly let's take the above mentioned Employee table and Department table is available in the database and run the below mentioned query and observe the output.
Example:
The final result of the above query is:
In this example, the subquery extracts out the rows in which the employee's country is the USA and emp_id corresponding to that employee is noted. The result set of this subquery is passed on to the outer query and the outer query looks for the values of head 121, 122, 124 because these are the head values corresponding to the employees residing in the USA. But only 121 and 122 head values are present in the Department table and thus dept_name associated with these head values are printed in the result set of the above query.
Subquery with IN and NOT IN, EXISTS, Not EXIST
-
Subquery with IN and NOT IN
Subqueries can be used with the IN and NOT IN keywords and the result of after executing these subqueries and result into zero or more values. The result set of this subquery is used by the outer query to further filter out the results.
Let's take an example to understand how the subqueries work with the IN and NOT IN keywords.
Example:
The final result of the above query would look like below:
In this example, the subquery is used with the IN keyword. Here, the subquery extracts out the record from the department table whose department is Finance. Then the result set of this query is used by the outer query to further execute the remaining query. The outer query then prints the result from the Employee table where the Employee department belongs to Finance and thus only one record i.e. on employee is available in the Employee table who is in the Finance department.
Example:
The final result of the above query would look like below:
In the above example, the subquery filters out the rows from the department table whose department is Finance. The result set of this subquery is used by the outer query and all the records from the Employee table are available in the result set except those employees who are in Finance department. Therefore, the final result set of the above query contains all the employees except the one in Finance department.
-
Subquery with EXISTS and NOT EXISTS
Subqueries with the EXISTS keyword can be used to know whether any row exists on the table or not. Let's take the examples below to understand more about how EXISTS and NOT EXISTS work with the subqueries.
Subqueries that uses EXISTS keyword will return TRUE if the subquery returns any rows. If the subquery using EXISTS keyword does not return any rows then it will return FALSE.
Example:
The final result of the above query would look like below:
In the above example, the result set returned by the subquery contains NULL which causes the EXISTS operator to return to TRUE. Therefore, the above given query returns all the rows from the Employee table.
Example:
The final result of the above query would look like below:
In the above example, the EXISTS keyword is used with the subquery. In the subquery all the records from the Employee table which has same emp_id as that in head of the department table is printed out in the result set. Therefore, all the records from the Employee table where ther is atleast one record from the department table with the same emp_id is available in the result set.
Example:
The final result of the above query would look like the below:
As shown in the above example, the subquery checks if there is any record available which has same emp_id from the Employee table and head from the department table. The result set of this query is then used by the outer query. The outer query consists of NOT EXISTS condition so all the records that are not present in the result set of the subquery is printed out in the final result of the query i.e. only one record which has emp_id = 345 is available in the result set of the above given example.
Subquery in HAVING Clause
Subqueries can also be used with the HAVING clause which filters the groups for the result set, by comparing a column in the main table with the results of the subquery.
To understand more clearly let's take the above mentioned Employee table and Department table available in the database and run the below mentioned query and observe the output.
The final result of the above query is:
Nested and Correlated Subqueries
Nested queries mean when the subquery is executed first and its results are inserted into the WHERE clause of the outer query. In the case of the nested query, the subquery is independent of the outer query and both can run independently. The subqueries in the nested query are executed only once and the result set after executing the subquery is used by the outer query.
In a correlated subquery, the main query is analysed first and based on its results the subquery is initiated and executed. It is called correlated because the inner query references the column of the outer query. The correlated subquery is slow as the subquery is executed for every row returned by the outer query.
Check out this article to learn more about Types of Subqueries.
Conclusion
- You learned about SQL subqueries and how these can be super useful and flexible to use to filter out data from the tables in SQL.
- SQL subqueries are nested queries embedded inside the other query. Subqueries are good alternative to SQL joins as they increase efficiency.
- In a subquery, the outer query's result is dependent on the result-set of the inner subquery.
- The main query can use the ORDER BY command but ORDER BY cannot be used in the subquery. You can use the GROUP BY command to perform the same function as the ORDER BY in the subquery.
- The result returned by a single row subquery can be set to the new column value using an UPDATE statement.
- Subqueries are also used with the HAVING, WHERE, clauses.
- Subqueries can be used with DELETE statement. It can be useful to delete the records from various tables.