HAVING have different functions. But they both filter out rows, so many people don’t know the important difference between them. Let’s shred some light.
In a single-table query,
WHERE comes in at the beginning of a query execution. We used to think that it determines which rows will be returned by the query, but this is not accurate:
WHERE determines which rows will be processed by the query.
The difference becomes clear when we use a
GROUP BY clause.
HAVING comes in at the end of a query execution.
After all rows have been processed,
HAVING determines which of rows will be sent to the client.
The theory should be clear. But let’s see the differences between
HAVING in practice.
The difference between
HAVING becomes clear when we run a query with
SELECT department_id, count(*) AS employees_no FROM employee WHERE gender = 'F' GROUP BY department_id HAVING employees_no < 10;
The query counts the number of female employees in each department, and only returns the departments where this number is less than 10.
WHEREexcludes non-female employees. Those rows are not read at all by the query.
GROUP BYgroups (or aggregates) the found rows, producing only one row for each distinct
HAVINGeliminates the aggregated rows where
employees_nois less than 10.
WHERE employees_no < 10would fail with an error, because that value doesn’t exist before aggregation.
HAVING gender = 'F'would fail with an error, because the gender column doesn’t exist in the aggregated rows (or, if you prefer, in the
Sometimes you may think that both
HAVING can be used, and that they’re equivalent. An example:
SELECT * FROM employee WHERE date_of_birth > '2000-01-01';
This query finds the employees that were born in this century.
Could we use
HAVING instead of
WHERE? In theory, yes. But in that case we’re telling the database to read all rows, and only return the ones that match the condition. This is unnecessarily slow.
WHERE clause, we’re asking the database to only read the rows we’re interested in. If there is an index that starts with the date_of_birth column, it will be used.
Note that some DBMSs are smart enough to translate an unnecessary
HAVING clause into a
WHERE clause. But not all DBMSs do so, and there may always be complex cases when a DBMS fails to apply this optimisation.