๐Ÿ“– MySQL Subqueries

A subquery is a SELECT statement coded inside another select statement. Same syntax and rules as other SELECT statements.

4 ways to use subqueries.

  • In a WHERE clause as a search condition.
  • In a HAVING clause as a search condition.
  • In the FROM clause as a table specification.
  • In the SELECT clause as a column specification.

Subqueries cannot be used in an ORDER BY clause. A subquery can return the following.

  • A single value
  • A list of values
  • A table or values
Using a Subquery to Replace a JOIN

Most subqueries can be restated as JOINs and most JOINs can be restated as subqueries.

JOIN Example

SELECT 
    invoice_number, invoice_date, invoice_total
FROM
    invoices
        JOIN
    vendors ON invoices.vendor_id = vendors.vendor_id
WHERE
    vendor_state = 'CA'
ORDER BY invoice_date;

Subquery Example

SELECT 
    invoice_number, invoice_date, invoice_total
FROM
    invoices
WHERE
    vendor_id IN (SELECT 
            vendor_id
        FROM
            vendors
        WHERE
            vendor_state = 'CA')
ORDER BY invoice_date;
Advantages of JOINS
  • SELECT clause can include columns from both tables.
  • May be more intuitive with primary key/foreign key relationship.
Advantages of Subqueries
  • You can use a subquery to pass an aggregate value to the main query.
  • May be more intuitive when relationship between tables is ad hoc.
  • Long, complex queries may be easier to code using subqueries.
WHERE or HAVING Clause

Subqueries work the same way for both WHERE and HAVING clauses.

IN Operator
  • You can use a subquery with the IN operator to provide the list of values that are tested against the test expression.
  • The subquery must return a single column of values.

Format

WHERE test_expression [NOT] IN (subquery);

Example

Show all invoices from vendors in California.

(Note that you do not need to show any columns from the vendors table โ€“ only use it as criteria.)

SELECT 
    invoice_number, invoice_date, invoice_total
FROM
    invoices
WHERE
    vendor_id IN (SELECT 
            vendor_id
        FROM
            vendors
        WHERE
            vendor_state = 'CA')
ORDER BY invoice_date;

Example

Show vendors without any invoices.

SELECT 
    vendor_id, vendor_name, vendor_state
FROM
    vendors
WHERE
    vendor_id NOT IN (SELECT DISTINCT
            vendor_id
        FROM
            invoices)
ORDER BY vendor_id;
Comparison Operator

Using aggregate value. Subquery returns a single value.

Example

Display invoices with a larger than average total amount.

SELECT 
    invoice_number, invoice_total
FROM
    invoices
WHERE
    invoice_total > (SELECT 
            AVG(invoice_total)
        FROM
            invoices)
ORDER BY invoice_total;
ALL

Using a list of values. Must use the ANY, SOME or ALL keywords since the query returns more than one value.

Example

Get invoices larger than the largest invoice for vendor 34.

SELECT 
    vendor_id, invoice_number, invoice_total
FROM
    invoices
WHERE
    invoice_total > ALL (SELECT 
            invoice_total
        FROM
            invoices
        WHERE
            vendor_id = 34);
  • If no rows are returned by the subquery, the comparison will be true for every row in the table.
  • If all the rows returned from the subquery contain a null value, the comparison will be false for every row in the table.
ANY or SOME

ANY and SOME work the same way.

Example

Get invoices smaller than the largest invoice for vendor 115.

SELECT 
    vendor_id, Invoice_number, invoice_total
FROM
    invoices
WHERE
    invoice_total < ANY (SELECT 
            invoice_total
        FROM
            invoices
        WHERE
            vendor_id = 115);
FROM clause

A subquery coded in the FROM clause returns a result set that is used like a temporary table sometimes referred to as an inline view. You MUST assign an alias to the subquery. You can then use the alias just like a table alias. Use aliases for any columns in the subquery that perform a calculation to be used as column names.

Example

SELECT 
    vendor_name, average_invoice_amt, max_invoice_amount
FROM
    (SELECT 
        vendor_id,
            AVG(invoice_total) AS average_invoice_amt,
            MAX(invoice_total) AS max_invoice_amt
    FROM
        invoices
    GROUP BY vendor_id) v
        JOIN
    vendors USING (vendor_id)
ORDER BY vendor_name;
Correlated Subqueries

A subquery that executes once for each row in the main query. Refers to a value thatโ€™s provided by a column in the main query. Use table aliases to distinguish the table in the subquery from the table in the main query.

Example

List invoices that are larger than the average invoice for that vendor.

SELECT 
    vendor_id invoice_number, invoice_total
FROM
    invoices i
WHERE
    invoice_total > (SELECT 
            AVG(invoice_total)
        FROM
            invoices
        WHERE
            vendor_id = i.vendor_id);
EXISTS Operator

Tests whether the subquery returns any result. NOT EXISTS test that no rows are returned. Doesnโ€™t matter which columns are returned.

Example

Get all vendors that donโ€™t have invoices.

SELECT 
    vendor_id, vendor_name
FROM
    vendors
WHERE
    NOT EXISTS( SELECT 
            *
        FROM
            invoices
        WHERE
            vendor_id = vendors.vendor_id);