More efficient SQL with query planning and optimization
SELECTs across multiple tables, it's a good time to talk about the efficiency of your SQL queries - how quickly do they execute, and could they execute faster?
Why do SQL queries need a plan?
SELECT * FROM books WHERE author = "J K Rowling";
- Do a "full table scan": look at every single row in the table, return the matching rows.
- Create an "index": Make a copy of the table sorted by author, then do a binary search to find the row where the author is "J K Rowling", find the matching IDs, then do a binary search on the original table that returns the rows that match the ID.
The lifecycle of a SQL query
- The query parser makes sure that the query is syntactically correct (e.g. commas out of place) and semantically correct (i.e. the tables exist), and returns errors if not. If it's correct, then it turns it into an algebraic expression and passes it to the next step.
- The query planner and optimizer does the hard thinking work. It first performs straightforward optimizations (improvements that always result in better performance, like simplifying 5*10 into 50). It then considers different "query plans" which may have different optimizations, estimates the cost (CPU and time) of each query plan based on the number of rows in the relevant tables, then it picks the optimal plan and passes it on to the next step.
- The query executor takes the plan and turns it into operations for the database, returning the results back to us if there are any.
Where do humans come in?
EXPLAIN QUERY PLANin front of any SQL to see what it's doing behind the scenes. If you use that, be prepared to dig deep into the
EXPLAIN QUERY PLANreference, because the "explanation" is pretty detailed and implementation specific. If you're using another SQL engine, you can search for "how do I get an execution plan in X".
WHEREon the author column, then we could explicitly create the index, using
CREATE INDEX. Then the SQL engine would be able to use that index to efficiently find the matching rows. You can read this guide about SQLite query planning to help you understand when indexes would help.