Aggregate data that match a condition

How to aggregate data that match a given condition. How to perform different aggregations of data that match different conditions, in a single query. Includes convenient SQLite shortcuts.

A simple aggregation with a condition

Example: Find the average and the maximum salaries in the company, leaving out the outliers (below 30,000 or above 90,000).

        avg(salary) AS avg_salary,
        max(salary) AS max_salary
    FROM employee
    WHERE salary BETWEEN 30000 AND 90000;

Note that this example is trivial, because the WHERE condition apply to both the aggregations.

Different conditions for each aggregation

The standard syntax

Example: Find the average of all the salaries, and the average of the salaries that are not outliers.

        avg(salary) AS avg_simple,
                WHEN salary BETWEEN 30000 AND 90000
                    THEN salary;
            END CASE
        ) AS avg_without_outliers
    FROM employee;

Now we have no WHERE conditions, so the first expression is an average of all the salaries in the table. The second aggregates a CASE expression. This expression returns salary when it matches the condition. When it doesn’t, NULL is returned. This is what we want, because NULLs are ignored by the avg() function.

This syntax works on all major DBMSs.

SQLite syntax

Let’s see how to do the same with a more clear syntax. This syntax is only supported by SQLite.

        avg(salary) AS avg_simple,
            FILTER (WHERE salary BETWEEN 30000 AND 90000)
            AS avg_without_outliers
    FROM employee;

For the second expression, the FILTER clause represents a WHERE condition that only applies to that particular expression.

FILTER is also used by other DBMSs, but only in the context of window functions (a different topic).


Related articles:

SQLite documentation: