Metabase's question builder writes SQL for you, so you don't *have to* know SQL. But understanding how SQL is structured will help to make sure you're getting the data you want out of the database.
A query uses keywords, or STATEMENTS, to structure a request from the database. Metabase gives you access to the essential statements needed to build basic queries.
---
## QUERIES
Let's write a basic query.
We want to see all the data in the books table. To do this, first we say what data we need.
```
SELECT *
```
SELECT is where we list our data points, or the columns, we want in our new table.
The asterisk * means EVERYTHING. As in, return all the columns available.
Next, we tell the database where to find the data.
```
SELECT *
FROM books
```
FROM is where we specify which table where the data we want is stored.
"books" is the name of our table.
---
## QUERIES
Now we have a complete query that we can send to the database.
```
SELECT *
FROM books
```
When we run this query, the database will return a new table with all the data in the books table.
But maybe we don't want to see all the data in the books table.
For example, what if we only want to see data about books from a particular publisher?
+++
To do this, we use the WHERE statement to create filters.
```
SELECT *
FROM books
WHERE imprint_id = 1
```
WHERE tells the database what rows to add to our table.
imprint_id is the name of the column we're filtering on.
We've limited our table to only include rows where the imprint_id is 1.
---
## QUERIES
Now let's step back to the SELECT statement.
We don't want to see all the columns in the books table. The id is only useful to the database, and we've already limited the data to rows where the imprint id equals 1.
We only want to see the print isbn, title, and author name, so we'll list those columns in the SELECT statement.
```
SELECT print_isbn, title, author_name
FROM books
WHERE imprint_id = 1
```
+++
The results from that query would look something like:
![](./img/query_results1.png)
---
## QUICK BREATH
Ok. Cool.
Easy enough, so far?
If you're keeping up and feel like you understand the basic concepts as much you can by listening to me talk, you're pretty much ready to start building queries in Metabase using the question builder.
The basic format of the question builder, with a few word changes, is the same as the query we wrote, and intuitive enough to use without knowing SQL.
+++
At this point, you could pull some columns and rows out of a table, but for most of the data we want to interact with, there are better sources available to us.
Your reports won't be very useful unless you can aggregate data from multiple tables. Otherwise, you're only seeing a small part of the data, and you're looking at id numbers instead of names or other useful information.
To do that, we need to join tables, and that's where things get complicated.
But understanding joins is essential to making the most of Metabase and minimizing the postprocessing you'll need to do in Excel.
---
## BEFORE YOU JOIN
Not to make it sound scary or intimidating. Joins can be fairly simple and straightforward.
Understanding how joins work is important.
Quite often even experienced query writers will choose the wrong type of join for their data. It's an easy mistake to make and can be hard to notice without careful scrutiny, depending on the data.
A danger of gathering your own data is assuming you've captured a dataset, when you actually used the wrong join and you're missing a chunk of relevant data.
Does that sound scary enough? Grrr. Boo! Be careful!
The essential thing here is to evaluate and scrutinize your data.
Whether it's a bad join or some other error, never assume you got it right without review.
---
## FIRST, A SIMPLE JOIN
Once more to the books table.
We want to use the data in the books table, but we want the imprint name in our table, not the id number.
We'll start with our original query, but change the filter to search by author name.
```
SELECT print_isbn, title, author_name
FROM books
WHERE author_name ilike '%Barthleme'
```
Don't worry too much about what's happening in the WHERE statement. The question builder will write your WHERE statement, and we'll talk about those options later.
---
## FIRST, A SIMPLE JOIN II
The JOIN statement joins our tables together.
We need two things to join tables:
the name of the table we're joining to and
the column we're joining on
+++
We're joining the books table to the imprints table.
To do that, we match the values in the imprint_id column in the books table to the values in the id column in the imprints table.
```
SELECT print_isbn, title, author_name
FROM books
JOIN imprints ON books.imprint_id = imprints.id
WHERE author_name ilike '%Barthleme'
```
JOIN states the name of the database to join and ON states where the table intersects with the first table.
For joining purposes, the table in the FROM statement is referred to as the LEFT table and the table in the JOIN is referred to as the RIGHT table.
---
## FIRST, A SIMPLE JOIN III
Now that the imprints table is joined to books, we can add columns from imprints to our SELECT statement.
```
SELECT print_isbn, title, author_name, company_name
FROM books
JOIN imprints ON books.imprint_id = imprints.id
WHERE author_name ilike '%Barthleme'
```
+++
And now we send our query to the database and get:
![](./img/join-results.png)
Our results, showing company name instead of imprint id.
---
## JOINS
There are four types of joins:
INNER JOIN: Returns records that have matching values in both tables
LEFT OUTER JOIN: Returns all records from the left table, and the matched records from the right table
RIGHT OUTER JOIN: Returns all records from the right table, and the matched records from the left table
FULL OUTER JOIN: Returns all records when there is a match in either left or right table
---
## JOINS
So far we haven't specified what type of join to use.
When writing SQL, an unspecified join defaults to INNER.
In Metabase, the question builder uses LEFT OUTER by default.
But what does all this mean?!
---
## JOINS
LEFT or RIGHT?
The LEFT table is the table we're joining to.
For our query, this the table in our FROM statement.
The RIGHT table is the table being joined.
In SQL, the order you write things in the JOIN statement DOES NOT determine RIGHT or LEFT.
```
SELECT print_isbn, title, author_name
FROM books
JOIN imprints ON books.imprint_id = imprints.id /*same*/
WHERE author_name ilike '%Barthleme'
```
```
SELECT print_isbn, title, author_name
FROM books
JOIN imprints ON imprints.id = books.imprint_id /*same*/
WHERE author_name ilike '%Barthleme'
```
---
## JOINS
### the diagrams
This is the classic set of Venn diagrams demonstrating how each join works.
In the diagrams, Table A is the LEFT table, and Table B is the RIGHT table.
---
## JOINS
### inner
An INNER join is the intersection of the two tables.
The table our join creates will only contain rows which contain data in both tables.
+++
For instance:
if a row in our books table doesn't have an imprint id or
it's imprint id isn't in the imprints table
using an inner join will exclude that row from our results.
If we're searching for books by a particular publisher, we don't need to include rows without an imprint id.
---
## JOINS
### inner
```
SELECT print_isbn, title, author_name, company_name
FROM books
INNER JOIN imprints ON books.imprint_id = imprints.id
WHERE books.imprint_id = 1
```
For this query, we're filtering to books with an imprint id of 1.
In this case, an INNER join works well because we only want rows that appear in both the books and imprints tables.
---
## JOINS
### inner
```
SELECT print_isbn, title, author_name, company_name
FROM books
INNER JOIN imprints ON books.imprint_id = imprints.id
WHERE author_name ilike '%Barthleme'
```
But in this query, we're filtering to books by the author's name.
In this case, an INNER join will exclude any books that don't have an imprint id.
We want to see all books by Barthelme even if the book doesn't have an imprint id or the imprint is not in our imprints table.
Using an INNER join here might prevent us from adding relevant data to our table.
---
## JOINS
### left
A LEFT OUTER join returns all the rows in the LEFT table, and the matching rows in the RIGHT table.
Our resulting table will contain all the rows from the LEFT table that match our filter conditions, as well as any rows from the RIGHT table that match our join conditions.
+++
For instance:
if a row in our books table doesn't have an imprint id, or
it's imprint id isn't in the imprints table
the imprint_name column in our new table will simply be blank, or in SQL terms, NULL.
---
## JOINS
### left
```
SELECT print_isbn, title, author_name, company_name
FROM books
LEFT JOIN imprints ON books.imprint_id = imprints.id
WHERE author_name ilike '%Barthleme'
```
Using a LEFT join this query, our new table will have a row for every book with author_name Barthelme.
If the book doesn't have a matching imprint in the imprints table, the company_name column in our new table will be NULL.
---
## JOINS
### right and full
For the most part, you'll be using INNER and LEFT joins.
A RIGHT join, unsurprisingly, is the opposite of a LEFT join.
A FULL join is a combination of LEFT and RIGHT.
There are definitely times we these joins are useful, but since INNER and LEFT joins will be more relevant at the moment, we're going to skip getting into the details of RIGHT and FULL joins.
---
## JOINS
### multiple tables
Sometimes we need to join multiple tables.
In our database, some of our books have been tagged. A tag can be reused, and we don't won't to store the value of the tag over and over again in our database.
+++
So we have a table called tags, which stores the values of tags. When a new tag is created, it's added to this table.
But if a tag already exists, we just refer back to the value using the tag's unique id number.
![](./img/tags_table.png)
The tags table.
---
## JOINS
### multiple tables
We don't need to see tag data in the books table because tags are only used by a few applications, and we want to keep the books table as small as possible.
+++
Instead, we have a books_tags table, which contains a column with a book_id and a tag_id.
![](./img/books-tags-table.png)
The books_tags table.
---
## JOINS
### multiple tables
If we want to connect the text value of a tag to a book:
first we join the books_tags table to the books table, matching on book id numbers
then we join the tags table to the books_tags table, matching on tag id numbers
```
SELECT print_isbn, title, author_name, company_name, tags.value
FROM books
LEFT JOIN imprints ON books.imprint_id = imprints.id
LEFT JOIN books_tags ON books.id = books_tags.book_id
LEFT JOIN tags ON books_tags.tag_id = tags.id
WHERE tags.value like 'Fiction'
```
---
## CONCLUSION
Hopefully at this point you have a basic understanding of relational databases and how queries work to fetch data from relational databases.
You might not be able to write SQL queries, but fortunately, Metabase does a lot of the work for you.
The best way to learn about the Question Builder is getting hands on with it.