Table of contents
Word of warning: This tutorial has very little to do with SQL specifically. Instead, it covers the more common ways to express logical comparisons. If that's something you've done in another programming language, then you can basically skip this chapter.
If you're new to programming, the bad news is that this will seem like a wall of new syntax and minutiae. There's no way to get past that perception until you practice it a little and slowly realize that there's not many actual new things to memorize, it's just that – as in plain English – there's many different ways to describe things as true or false – and you will be clumsy at it, at first.
The good news is that these logical operations are universal to math and programming. They're useful and essential in SQL, and everywhere else, too. I cover a battery of examples and variations, but for 90% of your actual work, you can easily get by with the most simple kinds of statements.
Finding non-matches with !=
We know how to filter rows by using the equals sign to make exact matches. To filter for non-matches, or rather, not equals, put an exclamation mark before the equals sign.
The following query will return all rows in which the name is not 'Daniel'
:
SELECT *
FROM baby_names
WHERE name != 'Daniel'
As with the plain equals-sign operator, we can use !=
in a variety of combinations. Here's a query for rows in which the name is neither 'Daniel'
nor 'John'
, for all years except 2013
:
SELECT *
FROM baby_names
WHERE name != 'Daniel'
AND name != 'John'
AND year != 2013
Beware of using OR
in an exclusionary condition
Remember that the OR
keyword, by its very definition, expands the possible set of rows that can pass a condition. For example the following query:
SELECT * FROM baby_names
WHERE name = 'Daniel'
OR name = 'John'
Contains all of the rows by these two separate queries:
SELECT * FROM baby_names
WHERE name = 'Daniel'
SELECT * FROM baby_names
WHERE name = 'John'
OR
wants everyone to have fun! But if you are trying to exclude rows, then OR
becomes a bit too permissive. Let's revisit the previous example that used !=
SELECT *
FROM baby_names
WHERE name != 'Daniel'
OR name != 'John'
Guess what that is equal to?
SELECT *
FROM baby_names
By using OR
, the WHERE
condition can be described in English as:
Include all the rows in which the name
is NOT "Daniel",
OR, the name is NOT "John"
Well, a row in which the name is 'John'
is also a row in which the name is not 'Daniel'
. The condition as stated above includes a row if its name
is not equal to 'Daniel'
. Thus, 'John'
rows are included, even though they fail the other (optional) requirement: name
is not equal to 'John'
.
But again, OR
makes the requirements optional. Thus, two exclusive filters joined by an OR
can result in nothing being excluded.
Here's a picture of beef
The above explanation was pretty convoluted, so let's throw in a photo of food to break up the text:
data:image/s3,"s3://crabby-images/dbeb9/dbeb9c8db0a9e0c3970a9e0f8637880c63178feb" alt="image"
To describe the previous concept of NOT
and OR
in a contrived real-life situation:
You tell the waiter that you want something that is either:
He ends up serving you steak because, even though steak is beef, it is also not chicken, which satisfies one of your conditions.
What you should have told the waiter:
I want something that is not beef and is not chicken
Sure, you say, but what kind of waiter would be dumb enough to misinterpret the previous request? Well, a computer is far dumber than humans and won't attempt to guess what we really mean. So that's why we have to be careful – and pedantic – when writing code.
Basically, don't get fancy when trying to not include things. It's not unlike the confusion generated by a double negative, if you don't don't get my drift, eh?
Greater and less than
If you remember elementary school math, this is pretty straightforward.
To fetch all records before (i.e. year
less than) 2012
:
SELECT *
FROM baby_names
WHERE year < 2012;
To fetch all records after (i.e. year
greater than) 2012
:
SELECT *
FROM baby_names
WHERE year > 2012;
Greater/less than or equal to
To include the boundary number in a comparison, append the equals sign after the comparison operator.
To fetch all records before or during (i.e. year
less than or equal to) 2012
:
SELECT *
FROM baby_names
WHERE year <= 2012;
To fetch all records during or after (i.e. year
greater than or equal to) 2012
:
SELECT *
FROM baby_names
WHERE year >= 2012;
Combining numerical comparisons to create a range
Using the AND
operator, you can specify a lower- and upper-bound to a range:
All records after 2011 but before 2014:
SELECT *
FROM baby_names
WHERE year > 2011
AND year < 2014;
Using OR
, you can specify values that fall outside a range:
All records that are before 2011 or after 2013:
SELECT *
FROM baby_names
WHERE year < 2011
OR year > 2013;
Comparing text characters
Text characters can be compared against each other, i.e. the letter 'M'
is "greater than" the letter 'H'
.
To get the names between the letters 'T'
and 'Y'
, i.e. all the names that begin with either of the letters 'U'
, 'V'
, and 'X'
:
SELECT *
FROM baby_names
WHERE name >= 'U'
AND name < 'Y';
More logical pitfalls
But be careful that the combination of conditions doesn't lead to a logical impossibility:
SELECT *
FROM baby_names
WHERE year > 2011
AND year < 2013;
SELECT *
FROM baby_names
WHERE name < 'G'
AND name > 'Z';
The NOT
keyword
The NOT
keyword can be used to modify another keyword, such as AND
or OR
.
The following query looks for records in which the year
is 2013
but the sex
is not equal to 'F'
:
SELECT *
FROM baby_names
WHERE year = 2013
AND NOT sex = 'F'
This is equivalent to this use of plain AND
and the !=
operator:
SELECT *
FROM baby_names
WHERE year = 2013
AND sex != 'F'
Negating a multiple OR statement
Don't be negative if you don't want to be
The NOT
keyword is just another way to express yourself logically. However, just as in plain English, things can get confusing when you mix up negative and affirmative type comparisons.
For example:
Iowa is smaller than California when you consider that California has more land mass than Iowa does.
Versus:
Iowa is smaller than California when you don't ignore that California does not have less land mass than Iowa does.
In general, don't use NOT
if it just don't sound like plain English.
There's never not many ways to phrase the same expression
This tutorial doesn't introduce any fundamental new concepts about SQL. But by just going over the comparison operators, it might feel as if there's an unlimited number of ways to do things.
And that's just about right. Which is why it's important to understand that SQL really is a language. And with any proper language, there's many different – and differently annoying – yet correct ways to say the same thing.
Think about comparisons in English:
Mary is older than Tom
– is the same as:
Tom is older than Mary
Likewise, in SQL:
SELECT * FROM baby_names
WHERE year > 2011;
– i.e., "select all rows with year
after 2011
"" – can be expressed as:
SELECT * FROM baby_names
WHERE 2011 < year
– i.e. "select all rows where 2011
is less than the year
value".
When you're new to SQL, you're not familiar with all the different ways that an expression can be stated or simplified. So it's perfectly fine to create a seemingly verbose and clumsy sequence of otherwise simple statements.
For example, if asked to find all rows in which the year
is from 2012 to 2014, and where the name
is either 'John'
or 'Lisa'
, you might write this out:
SELECT * FROM baby_names
WHERE name = 'John' AND year = 2012
OR name = 'John' AND year = 2013
OR name = 'John' AND year = 2014
OR name = 'Lisa' AND year = 2012
OR name = 'Lisa' AND year = 2013
OR name = 'Lisa' AND year = 2014;
But thinking through the logic, you might simplify it like this:
SELECT * FROM baby_names
WHERE (name = 'John' OR name = 'Lisa')
AND (year = 2012 OR year = 2013 OR year = 2014);
A bit of mathematical simplification:
SELECT * FROM baby_names
WHERE (name = 'John' OR name = 'Lisa')
AND (year >= 2012 AND year <= 2014);
You may choose to use more SQL keywords as you learn them, though at this point, the difference is pretty much cosmetic:
SELECT * FROM baby_names
WHERE name IN ('John', 'Lisa')
AND year BETWEEN 2012 AND 2014;
In general, it's best to keep things simple and not overthink them. Use what feels comfortable to you. After a bit of experience writing SQL, the first example will instinctively feel too much work. The last example may seem simple, but me personally, I constantly forget the proper syntax for using BETWEEN
– so >=
and <=
is good enough for me.
In the end, the logic is the logic. Rather than worry about memorizing best practices of syntax and style, make sure you can at least express in plain English what you want to find.