Publishers of technology books, eBooks, and videos for creative people

Home > Articles

  • Print
  • + Share This
This chapter is from the book

This chapter is from the book

Filtering Groups with HAVING

The HAVING clause sets conditions on the GROUP BY clause similar to the way that WHERE interacts with SELECT. The HAVING clause’s important characteristics are:

  • The HAVING clause comes after the GROUP BY clause and before the ORDER BY clause.

  • Just as WHERE limits the number of rows displayed by SELECT, HAVING limits the number of groups displayed by GROUP BY.

  • The WHERE search condition is applied before grouping occurs, and the HAVING search condition is applied after.

  • HAVING syntax is similar to the WHERE syntax, except that HAVING can contain aggregate functions.

  • A HAVING clause can reference any of the items that appear in the SELECT list.

The sequence in which the WHERE, GROUP BY, and HAVING clauses are applied is:

  1. The WHERE clause filters the rows that result from the operations specified in the FROM and JOIN clauses.

  2. The GROUP BY clause groups the output of the WHERE clause.

  3. The HAVING clause filters rows from the grouped result.

To filter groups:

  • Following the GROUP BY clause, type:

    HAVING search_condition

    search_condition is a search condition used to filter groups. search_condition can contain aggregate functions but otherwise is identical to the WHERE search condition, described in “Filtering Rows with WHERE” and subsequent sections in Chapter 4. You may combine and negate multiple HAVING conditions with the logical operators AND, OR, and NOT.

    The HAVING search condition is applied to the rows in the output produced by grouping. Only the groups that meet the search condition appear in the result. You can apply a HAVING clause only to columns that appear in the GROUP BY clause or in an aggregate function.

In Listing 6.18 and Figure 6.18, I revisit Listing 6.9 earlier in this chapter, but instead of listing the number of books that each author wrote (or co-wrote), I use HAVING to list only the authors who have written three or more books.

Listing 6.18 List the number of books written (or co-written) by each author who has written three or more books. See Figure 6.18 for the result.

SELECT
    au_id,
    COUNT(*) AS "num_books"
  FROM title_authors
  GROUP BY au_id
  HAVING COUNT(*) >= 3;

Figure 6.18Figure 6.18 Results of Listing 6.18.

In Listing 6.19 and Figure 6.19, the HAVING condition also is an aggregate expression in the SELECT clause. This query still works if you remove the AVG() expression from the SELECT list (Listing 6.20 and Figure 6.20).

Listing 6.19 List the number of titles and average revenue for the types with average revenue over $1 million. See Figure 6.19 for the result.

SELECT
type,
COUNT(price) AS "COUNT(price)",
AVG(price * sales) AS "AVG revenue"
FROM titles
GROUP BY type
HAVING AVG(price * sales) > 1000000;

Figure 6.19Figure 6.19 Results of Listing 6.19.

Listing 6.20 Listing 6.19 still works without AVG(price * sales) in the SELECT list. See Figure 6.20 for the result.

SELECT
    type,
    COUNT(price) AS "COUNT(price)"
  FROM titles
  GROUP BY type
  HAVING AVG(price * sales) > 1000000;

Figure 6.20Figure 6.20 Results of Listing 6.20.

In Listing 6.21 and Figure 6.21, multiple grouping columns count the number of titles of each type that each publisher publishes. The HAVING condition removes groups in which the publisher has one or fewer titles of a particular type. This query retrieves a subset of the result of Listing 6.14 earlier in this chapter.

Listing 6.21 List the number of books of each type for each publisher, for publishers with more than one title of a type. See Figure 6.21 for the result.

SELECT
    pub_id,
    type,
    COUNT(*) AS "COUNT(*)"
  FROM titles
  GROUP BY pub_id, type
  HAVING COUNT(*) > 1
  ORDER BY pub_id ASC, "COUNT(*)" DESC;

Figure 6.21Figure 6.21 Results of Listing 6.21.

In Listing 6.22 and Figure 6.22, the WHERE clause first removes all rows except for books from publishers P03 and P04. Then the GROUP BY clause groups the output of the WHERE clause by type. Finally, the HAVING clause filters rows from the grouped result.

Listing 6.22 For books from publishers P03 and P04, list the total sales and average price by type, for types with more than $10,000 total sales and less than $20 average price. See Figure 6.22 for the result.

SELECT
    type,
    SUM(sales) AS "SUM(sales)",
    AVG(price) AS "AVG(price)"
  FROM titles
  WHERE pub_id IN ('P03', 'P04')
  GROUP BY type
  HAVING SUM(sales) > 10000
    AND AVG(price) < 20;

Figure 6.22Figure 6.22 Results of Listing 6.22.

TIP

  • Generally, HAVING clauses should involve only aggregates. The only conditions that should be specified in the HAVING clause are those conditions that must be applied after the grouping operation has been performed. It’s more efficient to specify conditions that can be applied before the grouping operation in the WHERE clause. The following statements, for example, are equivalent, but the first statement is preferable because it reduces the number of rows that have to be grouped:

    SELECT pub_id, SUM(sales) --Faster 
      FROM titles 
      WHERE pub_id IN (‘P03’, ‘P04’) 
      GROUP BY pub_id 
      HAVING SUM(sales) > 10000;
    
    SELECT pub_id, SUM(sales) --Slower 
      FROM titles 
      GROUP BY pub_id 
      HAVING SUM(sales) > 10000 
        AND pub_id IN (‘P03’, ‘P04’);
  • + Share This
  • 🔖 Save To Your Account