All SQL queries give deterministic output. The output is equivalent to the output you'll get after following the clause execution order below -
The logical execution order
The order is:
FROM
,ON
,JOIN
(from, ON, join in this order)WHERE
GROUP BY
HAVING
SELECT
(window functions are executed at this step)- Apply
distinct()
if used ORDER BY
LIMIT
Simply - The engine goes from larger to smaller.
It takes in all relevant data and then progressively cuts it down until it get to the specified result.
FROM
,ON
,JOIN
- Take in all source tables. Join them to create a single large table.WHERE
- Filter the large combined table to remove unnecessary rows.GROUP BY
,HAVING
- Group the rows, filter rows within the groups.SELECT
,distinct()
- Select required columns from the table. Apply window functions if required.ORDER BY
- Order the rows.LIMIT
- Remove rows that don't fit in the specified limit.- Return the data as result.
This can change a bit depending on your SQL engine (if it adds new non-standard features. Eg - If QUALIFY
clause is supported, it is executed after the SELECT
clause). But the standard is pretty much the same across databases.
Theoretical background
SQL is declarative.
When you write a SQL query, you tell the execution engine about the outcome/end-state that you want. The database engine figures out the steps to achieve that end state. You don't tell the actual steps (operations and their order of execution).
So, internally during execution, the engine's query optimizer may actually execute clauses in some other order (to optimize execution).
You get one guarantee - No matter what the engine does internally, the output will always be equivalent to the output you'll get after following the logical clause execution order.
You don't have to care about what the engine does internally. The output if exactly what you get after following the logical clause execution order.
It is as if the engine says "Why do you care in what order I did things internally? The output is exactly what you expected"