If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

## Computer programming - JavaScript and the web

### Course: Computer programming - JavaScript and the web>Unit 3

Lesson 3: Relational queries in SQL

# Splitting data into related tables

So far, we've just been working with one table at a time, and seeing what interesting data we can select out of that table. But actually, most of the time, we have our data distributed across multiple tables, and those tables are "related" to each other in some way.
For example, let's say we have a table for logging how well students do on their tests, and we include emails in case we need to email parents about slipping grades:
`student_name``student_email``test``grade`
Peter Rabbitpeter@rabbit.comNutrition95
Alice Wonderlandalice@wonderland.comNutrition92
Peter Rabbitpeter@rabbit.comChemistry85
Alice Wonderlandalice@wonderland.comChemistry95
We might also have a table for logging what books each student reads:
`student_name``book_title``book_author`
Peter RabbitThe Tale of Mrs. Tiggy-WinkleBeatrix Potter
Peter RabbitJabberwockyLewis Carroll
Alice WonderlandThe Hunting of the SnarkLewis Carroll
Alice WonderlandJabberwockyLewis Carroll
We might also have a table just for detailed student information:
`id``student_first``student_last``student_email``phone``birthday`
1PeterRabbitpeter@rabbit.com555-66662001-05-10
2AliceWonderlandalice@wonderland.com555-44442001-04-02
What do you think of these tables? Would you change them in any way?
There is one big thing to realize about these tables: they are describing relational data - as in, they are describing data that relates to each other. Each of these tables describe data related to a particular student, and many of the tables replicate the same data. When the same data is replicated across multiple tables, there can be interesting consequences.
For example, what if a student's email changed? Which tables would we need to change?
We'd need to change the student information table, but since we also included that data in the grades table, we'd also have to find every row about that student, and change the email there too.
It's often preferable to make sure that a particular column of data is only stored in a single location, so there are fewer places to update and less risk of having different data in different places. If we do that, we need to make sure we have a way to relate the data across the tables, which we'll get to later.
Let's say we decide to remove email from the grades table, because we realize it's redundant with the email in the student details table. This is what we'd have:
`student_name``test``grade`
Peter RabbitNutrition95
Alice WonderlandNutrition92
Peter RabbitChemistry85
Alice WonderlandChemistry95
How could we figure out the email for each student? We could find the row in the student info table, matching by name. What if 2 students had the same name? (Did you know that in Bali, every person has only 1 of 4 possible first names?) We can't rely on name to look up a student, and really, we should never rely on something like name to identify anything uniquely in a table.
So the best thing to do is to remove the `student_name` and replace that with `student_id`, since that is a guaranteed unique identifier:
`student_id``test``grade`
1Nutrition95
2Nutrition92
1Chemistry85
2Chemistry95
We would make the same change to our books table, using `student_id` instead of `student_name`:
`student_id``book_title``book_author`
1The Tale of Mrs. Tiggy-WinkleBeatrix Potter
1JabberwockyLewis Carroll
2The Hunting of the SnarkLewis Carroll
2JabberwockyLewis Carroll
Notice how we have the title and author repeated twice for Jabberwocky? That's another warning sign that we could break our table up into multiple related tables, so that we don't have to update multiple places if something changes about a book.
We could have a table just about books:
`id``book_title``book_author`
1The Tale of Mrs. Tiggy-WinkleBeatrix Potter
2JabberwockyLewis Carroll
3The Hunting of the SnarkLewis Carroll
And then our `student_books` table becomes:
`student_id``book_id`
11
12
23
22
I know, this table doesn't look nearly as readable as the old table that had all of information stuffed into every row. But, tables are often not designed to be readable to humans-- they're designed to be the easiest to maintain and least prone to bugs. In many cases, it may be best to split information into multiple related tables, so that there is less redundant data and fewer places to update.
It's important to understand how to use SQL to deal with data that has been split up into multiple related tables, and bring the data back together across the tables when you need it. We do that using a concept called "join"s and that's what I'll show you next.

## Want to join the conversation?

• what is the purpose of having so many different tables, when we can have a single table with all information.?
1. Why can't you make something like a variable in SQL? I mean, you can do it in JavaScript, but why not SQL?
2. Why do you have to go row by row changing each thing if you can just change a variable name?
3. And why not put all the information from all the tables into 1 table?
4. What do you need an id for?
5. What's the importance of SQL tables?

1. SQL itself is not a programming language. It's a Query Language. You would use another programming language to write a program that sends queries to your database.
Here I use JavaScript with a hypothetical `performSqlQuery` function.
``var query = "SELECT * FROM students WHERE id = " + studentId;var result = performSqlQuery(query);``

2. You can use variables of the programming language where you create your SQL queries.

3. We want to avoid putting duplicate data into the database.
You could put things into one table like this:
``┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│ INVOICE                                                                                                                                                                                │├───────────┬────────────┬──────────────┬────────────────┬─────────────────┬──────────┬────────────┬───────────────┬──────────────────────────┬────────────────────┬───────┬───────┬─────┤│ invoicenr │ customernr │ date         │ name           │ address         │ zipcode  │ city       │ country       │ iban                     │ product            │ count | price │ vat │├───────────┼────────────┼──────────────┼────────────────┼─────────────────┼──────────┼────────────┼───────────────┼──────────────────────────┼────────────────────┼───────┼───────┼─────┤│ "0003417" │ "93276"    │ "2017-09-01" │ "Joseph Smith" │ "120 Penstreet" │ "3829AD" │ "Somecity" │ "Somecountry" │ "GB29NWBK60161331926819" │ "Brause 361 nib"   │     1 │  2.99 │   19││ "0003417" │ "93276"    │ "2017-09-01" │ "Joseph Smith" │ "120 Penstreet" │ "3029AD" │ "Somecity" │ "Somecountry" │ "GB29NWBK60161331926819" │ "Art paper A4"     │     1 │ 13.29 │   19││ "0003417" │ "93276"    │ "2017-09-01" │ "Joseph Smith" │ "120 Penstreet" │ "3829AD" │ "Somecity" │ "Somecountry" │ "GB29NWBK60161331926819" │ "Pen holder, blue" │     1 │  5.19 │   19│└───────────┴────────────┴──────────────┴────────────────┴─────────────────┴──────────┴────────────┴───────────────┴──────────────────────────┴────────────────────┴───────┴───────┴─────┘``
(You may have to copy this somewhere to view correctly.)

The problem here is that a lot of data is the same. Only the product and the price are different.
Another problem is keeping the data consistent. In this table someone made a slight mistake. Can you spot it? This invoice may never arrive at the correct address.

It's better to split the data. This way you have less data to store, but more importantly, the data is only in one place. If you want to change someone's ZIP code for example, you don't have to go looking for it in many different places.
``┌─────────────────────────────────────────────────────────────────────────────────────────┐│ CUSTOMER                                                                                │├────┬────────────┬───────────┬────────┬──────────┬────────────┬──────────────────────────┤│ id │ customernr │ firstname │ prefix │ lastname │ address_id │ iban                     │├────┼────────────┼───────────┼────────┼──────────┼────────────┼──────────────────────────┤│  1 │ "93276"    │ "Joseph"  │ ""     │ "Smith"  │          1 │ "GB29NWBK60161331926819" │└────┴────────────┴───────────┴────────┴──────────┴────────────┴──────────────────────────┘┌─────────────────────────────────────────────────────────────────────┐│ ADDRESS                                                             │├────┬─────────────┬──────────┬──────────┬────────────┬───────────────┤│ id │ streetname  │  housenr │ zipcode  │ city       │ country       │├────┼─────────────┼──────────┼──────────┼────────────┼───────────────┤│  1 │ "Penstreet" │ "120"    │ "3829AD" │ "Somecity" │ "Somecountry" │└────┴─────────────┴──────────┴──────────┴────────────┴───────────────┘┌───────────────────────────────────────┐│ PRODUCT                               │├────┬────────────────────┬───────┬─────┤│ id │ name               │ price │ vat │├────┼────────────────────┼───────┼─────┤│  1 │ "Brause 361 nib"   │  2.99 │ 19  ││  2 │ "Art paper A4"     │ 13.29 │ 19  ││  3 │ "Pen holder, blue" │  5.19 │ 19  │└────┴────────────────────┴───────┴─────┘┌─────────────────────────────────────────────┐│ INVOICE                                     │├────┬───────────┬─────────────┬──────────────┤│ id │ invoicenr │ customer_id │ date         │├────┼───────────┼─────────────┼──────────────┤│  1 │ "0003417" │           1 │ "2017-09-01" │└────┴───────────┴─────────────┴──────────────┘┌──────────────────────────────────────┐│ INVOICELINE                          │├────┬────────────┬────────────┬───────┤│ id │ invoice_id │ product_id │ count │├────┼────────────┼────────────┼───────┤│  1 │          1 │          1 │     1 ││  2 │          1 │          2 │     1 ││  3 │          1 │          3 │     1 │└────┴────────────┴────────────┴───────┘``
(You may have to copy this somewhere to view correctly.)

4. In the INVOICE table you can see a `customer_id` of 1. That means that that particular invoice belongs to the customer with id 1. In this case that's Joseph Smith.
Can you tell how the INVOICELINE table works?

5. Perhaps, if you understand the previous answers, you have enough information to answer this yourself. :)
• What's the different between sub query and "join"? When do we need to use join or sub query? Which is better?
• What's a scenario where a book detail would change? Seems static to me.
(1 vote)
• It's possible for books to have the same title and same author. Or you may need to track revisions of a book to identify specific editions or versions.
• What are some resources on how good relational databases are structured? I will only be querying databases and not building them. It is good to know some of the basics when it comes to structure, I imagine.
• How do the different tables help to organize?
• Typically, in a database system, each table will represent a different entity - students, classrooms, courses - or it will be a a bridge table (which is used to linked tables with many-to-many relationships).

Keeping the data related to an entity in a single table logically organized and it can reduce the number of rows to a minimum. It also keeps data easier to maintain.

Imagine you had a table that listed a row with a teacher's name, the course the teacher is teaching, and the classroom number. Now, imagine that you have one teacher who teaches eight different courses. This means you have rows with the teacher's name repeated eight times.

A few days into the term, you discover that the teacher's name has been misspelled. If you had a single table with teacher names, you could just update the name in one row in the `teachers` table. However, in this case, you need to change the name eight different times.

That may not sound so bad. Now, imagine that you are working for a large company that does business with other large companies. One of the companies that supplies you with hundreds of parts decides to rebrand and change its name. Now, imagine having to change the name across hundreds of parts and thousands of documents. Imagine trying to remember every place that the name is used. Changing the name in one table is much faster.
• When I have multiple Tables can I maintain a directory of sorts to help me relate which table stands for what?
What happens in an Organization where there are hundreds of rows and columns and so many tables all relating to one another. And the person who created that database is no longer working there. How will I know what table is for what? And how does one maintain a directory, is there such a thing for SQL called "directory"?