Table of contents
The SELECT statement
You can guess what the SELECT statement does: it's what we use to query the database for data. From the SQLite documentation:
The result of a SELECT is zero or more rows of data where each row has a fixed number of columns. A SELECT statement does not make any changes to the database.
There's not many kinds of statements in SQL, and 95% of our queries will involve using the SELECT statement. As a corollary – if you read the excerpt above – this means most what we generally do with SQL databases does not alter the data, i.e. creating, updating, and deleting data.
Trying a basic
Let's get right into executing a query.
Open a database file – I'm assuming you've downloaded and unzipped the following file:
If you are using the client, DB Browser for SQLite, select the Execute SQL tab and type the following into the first input box:
Then hit the Play button (or better yet, the keyboard shortcut: Ctrl + Return).
The result should look something like this (note that I've left off the semi-colon, which is optional here but is generally used to denote the end of a command):
Here's an animated GIF of the process:
From this point on, I’ll be displaying the input and output of these examples as just text, rather than screenshotting the GUI.
This is partly to emphasize that whatever graphical client we use is merely a user-friendly shell for the code we execute. Also, text is a lot easier to produce for these tutorials.
What does it mean?
We asked the database to "select the number 100" and it obliged. The result of that "query" is simply the number
100, returned as a single column (with a header of
100) and a single row.
Selecting multiple values
In SQL syntax, we can specify a series of values to select by separating them with commas; note in the example below, it's only the commas that delimit the values; whitespace and newlines have no effect.
SELECT 100, 200, -300,
The result is just…more values, each in their own column:
Specifying literal text strings
We can specify literal text strings by enclosing a word in single quotes (i.e. apostrophes):
SELECT 'hello', 'world'
I re-emphasize the term literal here; the previous query selected the literal words
'world', separately. What if we wanted to select the phrase,
We just enclose the phrase in single quotes, which specifies that we want
world again, literally, but as a single string connected with a literal comma and space:
SELECT 'hello, world';
Text strings without the quotes?
What happens if you omit the single quotes, as we did in the first example queries with just numbers (e.g.
SELECT 100, 200, 300)?
You should get an error message:
Error: no such column: hello
Without the single-quotes, the SQLite interpreter interprets
hello as something non-literal, such as the the name of a column or some other SQLite-specific command. So when I use the term literal, I mean, just the plain, ordinary value that a text string has for humans, rather than the special meaning that it might have for the computer.
If it's not obvious by now, the word
SELECT has a special meaning to the SQLite interpreter, as it is a keyword in the syntax of the SQLite language. The following query would return an error, because it doesn't make sense to the interpreter:
What if we wanted to
SELECT the literal text string of
'SELECT'? Again, by using the single-quotes, we specify to the interpreter that we want the word "SELECT", literally:
Are you literally tired of seeing the word literal explained over and over?? I belabor the point here because not knowing when to use single-quotes – or if you're new to programming in general, the importance concept of computer syntax – will bite you very hard, very soon. But don't worry, we'll reiterate the concept by the end of the lesson.
FROM clause to select data from tables
So selecting values such as
100 is not very exciting, nor does it make much sense. What we really want to do is select values from our data tables. To do that, we use the
FROM keyword to specify from which table we want the data.
Remember that in our data file –
ssa_baby_names--2010_2014--all.sqlite.zip – there are 2 tables:
baby_names – which has 600,000+ rows for every combination of baby name, sex, year, and state:
totals, which has a mere 780 rows representing every combination of year, state, and sex:
Select one column from one table
Let's get right into the syntax. To select the
state column from all the rows in the
The result as a screenshot:
Take particular note of the log window, which gives some statistics about the result:
780 Rows returned from: SELECT state
FROM totals; (took 5ms)
In other words, every row from the
totals table was returned, but only the
Let's repeat the previous
SELECT statement but now choose the
The preview of the returned data rows looks the same as before, but the results log shows that many, many more data rows were returned:
633608 Rows returned from: SELECT state
FROM baby_names; (took 45ms)
Choosing multiple columns from a table
Selecting one column at a time is not very exciting. To select multiple columns, simply list each column's name as a comma-delimited list. Here's 3 columns from
SELECT name, state, year
The ordering of the columns is dictated by the order of the column names in the
SELECT year, sex, state, name
Trying to select non-existent columns
Let's go back to the
totals table, but use the same query as above otherwise:
SELECT year, sex, state, name
The results log should display an error message:
The relevant part of that error message is:
no such column: name
If you browse the
totals table, you'll see that there is no
name column, hence, the error.
Trying to select columns without a table
Here's a query that will return a similar error of column
name not found:
SELECT name, sex, state;
I've completely omitted the
FROM clause, which means I haven't specified a table. So the error message is technically true – a
name column can't be found. But it's maddeningly vague.
You might be looking at this thinking that only an idiot could ever forget the
FROM clause – there's literally only two keywords to remember, right? But when you include the half-dozen clauses (and countless other keywords) we're about to learn, it might surprise you how often you forget to include
FROM because you assume the computer can read your mind as to which table you want to access. So, fair warning.
Trying to select from a non-existent table
Again, you might think, what kind of idiot would SELECT FROM a table that doesn't exist? The most common situation is when the table's name is misspelled:
SELECT name, sex
So now you're thinking: But there's only 2 tables, can't the computer fix it like the way my phone has autocorrect?.
Sorry, but there's no autocorrect when it comes to executing SQL statements.
Using * to select all columns
Sometimes (but not always) it's nice just to select all the columns in a table without having to type all their names out. Use the star operator, i.e.
*, after the
Other selection minutiae
OK, we've basically covered the core concept: the syntax for selecting data rows from a single table:
SELECT column_a, column_b
But how do you select columns from multiple tables? The answer to that is surprisingly complicated – we won't get to it this week.
In the meantime, here are a few things and concepts about SQL syntax that aren't fundamental – i.e. you probably won't memorize them – but that are worth mentioning because they reinforce the strict and peculiar logic of the SQL language that we've seen so far. Also, they will become relevant as we cover more advanced data-wrangling techniques in SQL.
Selecting columns repeatedly
If you can select a column once, you can select it more than once:
SELECT name, year, name, year, name, name
Note how the number of rows returned is still the number of rows total in the
baby_names table, i.e. the number of columns has no effect on the number of rows.
Why would we need to refer to one column multiple times in the same
SELECT statement? What if I told you SQL had functions that can transform text in the same way that spreadsheets do?
SELECT name, UPPER(name), LOWER(name)
We'll get to those functions soon enough…
Selecting values and columns
Remember this query?
What happens if we include a
FROM clause? And throw in a couple of column names for good measure?
SELECT name, 100, state FROM baby_names;
The result is the number
100 for every row retrieved from the
Why would that ever be useful? To be honest, there's not a ton of obvious use cases for including a constant value in the results. But we may run into its use in real-world scenarios down the road…
Single-quotes and double-quotes
Earlier in this tutorial, we covered the use of single-quotes in denoting string literals, or as I refer to them, text strings with literal values. To review:
SELECT name, 'name'
We already know what happens when we don't enclose
name in quotes: it's treated as an identifier for the column,
SELECT name, name
But what if we use double-quotes around
name? If you've ever used another programming language in which you can enclose text strings in either single- or double-quotes, you should be curious about it:
SELECT name, "name"
The result is that SQLite interprets
"name" as a column identifier, i.e., the column known by the name of
To reiterate the difference, try this query (oh, by the way, the case of the letters do not matter for an identifier:
SELECT name, NAME, "NaMe", 'name'
If this is your first time with a programming language, all of these details should seem overwhelming. That's fine; as I said before, the most important takeaway is how to
FROM a table:
SELECT column_a, column_b
All the other details in this lesson are described so that you begin to appreciate just how literal and simple-minded the interpreter will be when reading your SQL code. With SQL, we'll become significantly more powerful in our data-crunching ability, but it requires becoming significantly more careful and specific in how we talk to our computers.