The Takeaways

By the end of this tutorial, make sure you can answer these questions:

  • The meaning of each of these operators: - != - >= and > - <= and <

  • How to use the NOT keyword

  • The many different ways to say the same thing

Get the data

For this lesson, download the following file, which is a SQLite database built from the U.S. Social Security Administration file of popular baby names:

Unzip it, and open the sqlite file using the SQLite client of your choice (e.g. DB Browser for SQLite)

Table of contents

Word of warning: This tutorial has very little to do with SQL specifically. Instead, it covers the more common ways to express logical comparisons. If that's something you've done in another programming language, then you can basically skip this chapter.

If you're new to programming, the bad news is that this will seem like a wall of new syntax and minutiae. There's no way to get past that perception until you practice it a little and slowly realize that there's not many actual new things to memorize, it's just that – as in plain English – there's many different ways to describe things as true or false – and you will be clumsy at it, at first.

The good news is that these logical operations are universal to math and programming. They're useful and essential in SQL, and everywhere else, too. I cover a battery of examples and variations, but for 90% of your actual work, you can easily get by with the most simple kinds of statements.

Finding non-matches with !=

We know how to filter rows by using the equals sign to make exact matches. To filter for non-matches, or rather, not equals, put an exclamation mark before the equals sign.

The following query will return all rows in which the name is not 'Daniel':

SELECT * 
FROM baby_names
WHERE name != 'Daniel'

As with the plain equals-sign operator, we can use != in a variety of combinations. Here's a query for rows in which the name is neither 'Daniel' nor 'John', for all years except 2013:

SELECT * 
FROM baby_names
WHERE name != 'Daniel' 
  AND name != 'John'
  AND year != 2013

Beware of using OR in an exclusionary condition

Remember that the OR keyword, by its very definition, expands the possible set of rows that can pass a condition. For example the following query:

SELECT * FROM baby_names 
  WHERE name = 'Daniel' 
    OR name = 'John'

Contains all of the rows by these two separate queries:

SELECT * FROM baby_names 
  WHERE name = 'Daniel' 
SELECT * FROM baby_names 
  WHERE name = 'John' 

OR wants everyone to have fun! But if you are trying to exclude rows, then OR becomes a bit too permissive. Let's revisit the previous example that used !=

SELECT * 
FROM baby_names
WHERE name != 'Daniel' 
  OR name != 'John'

Guess what that is equal to?

SELECT * 
FROM baby_names

By using OR, the WHERE condition can be described in English as:

  Include all the rows in which the name
  is NOT "Daniel", 
  OR, the name is NOT "John"

Well, a row in which the name is 'John' is also a row in which the name is not 'Daniel'. The condition as stated above includes a row if its name is not equal to 'Daniel'. Thus, 'John' rows are included, even though they fail the other (optional) requirement: name is not equal to 'John'.

But again, OR makes the requirements optional. Thus, two exclusive filters joined by an OR can result in nothing being excluded.

Here's a picture of beef

The above explanation was pretty convoluted, so let's throw in a photo of food to break up the text:

image

To describe the previous concept of NOT and OR in a contrived real-life situation:

You tell the waiter that you want something that is either:

He ends up serving you steak because, even though steak is beef, it is also not chicken, which satisfies one of your conditions.

What you should have told the waiter:

I want something that is not beef and is not chicken

Sure, you say, but what kind of waiter would be dumb enough to misinterpret the previous request? Well, a computer is far dumber than humans and won't attempt to guess what we really mean. So that's why we have to be careful – and pedantic – when writing code.

Basically, don't get fancy when trying to not include things. It's not unlike the confusion generated by a double negative, if you don't don't get my drift, eh?

Greater and less than

If you remember elementary school math, this is pretty straightforward.

To fetch all records before (i.e. year less than) 2012:

SELECT * 
FROM baby_names
WHERE year < 2012;

To fetch all records after (i.e. year greater than) 2012:

SELECT * 
FROM baby_names
WHERE year > 2012;

Greater/less than or equal to

To include the boundary number in a comparison, append the equals sign after the comparison operator.

To fetch all records before or during (i.e. year less than or equal to) 2012:

SELECT * 
FROM baby_names
WHERE year <= 2012;

To fetch all records during or after (i.e. year greater than or equal to) 2012:

SELECT * 
FROM baby_names
WHERE year >= 2012;

Combining numerical comparisons to create a range

Using the AND operator, you can specify a lower- and upper-bound to a range:

All records after 2011 but before 2014:

SELECT * 
FROM baby_names
WHERE year > 2011 
  AND year < 2014;

Using OR, you can specify values that fall outside a range:

All records that are before 2011 or after 2013:

SELECT * 
FROM baby_names
WHERE year < 2011 
  OR year > 2013;

Comparing text characters

Text characters can be compared against each other, i.e. the letter 'M' is "greater than" the letter 'H'.

To get the names between the letters 'T' and 'Y', i.e. all the names that begin with either of the letters 'U', 'V', and 'X':

SELECT * 
FROM baby_names
WHERE name >= 'U' 
  AND name < 'Y';

More logical pitfalls

But be careful that the combination of conditions doesn't lead to a logical impossibility:

SELECT * 
FROM baby_names
WHERE year > 2011 
  AND year < 2013;
SELECT * 
FROM baby_names
WHERE name < 'G' 
  AND name > 'Z';

The NOT keyword

The NOT keyword can be used to modify another keyword, such as AND or OR.

The following query looks for records in which the year is 2013 but the sex is not equal to 'F':

SELECT * 
FROM baby_names
WHERE year = 2013 
  AND NOT sex = 'F'

This is equivalent to this use of plain AND and the != operator:

SELECT * 
FROM baby_names
WHERE year = 2013 
  AND sex != 'F'

Negating a multiple OR statement

Don't be negative if you don't want to be

The NOT keyword is just another way to express yourself logically. However, just as in plain English, things can get confusing when you mix up negative and affirmative type comparisons.

For example:

Iowa is smaller than California when you consider that California has more land mass than Iowa does.

Versus:

Iowa is smaller than California when you don't ignore that California does not have less land mass than Iowa does.

In general, don't use NOT if it just don't sound like plain English.

There's never not many ways to phrase the same expression

This tutorial doesn't introduce any fundamental new concepts about SQL. But by just going over the comparison operators, it might feel as if there's an unlimited number of ways to do things.

And that's just about right. Which is why it's important to understand that SQL really is a language. And with any proper language, there's many different – and differently annoying – yet correct ways to say the same thing.

Think about comparisons in English:

Mary is older than Tom

– is the same as:

Tom is older than Mary

Likewise, in SQL:

SELECT * FROM baby_names
  WHERE year > 2011;

– i.e., "select all rows with year after 2011"" – can be expressed as:

SELECT * FROM baby_names
  WHERE 2011 < year

– i.e. "select all rows where 2011 is less than the year value".

When you're new to SQL, you're not familiar with all the different ways that an expression can be stated or simplified. So it's perfectly fine to create a seemingly verbose and clumsy sequence of otherwise simple statements.

For example, if asked to find all rows in which the year is from 2012 to 2014, and where the name is either 'John' or 'Lisa', you might write this out:

SELECT * FROM baby_names
WHERE name = 'John' AND year = 2012
    OR name = 'John' AND year = 2013
    OR name = 'John' AND year = 2014
    OR name = 'Lisa' AND year = 2012
    OR name = 'Lisa' AND year = 2013
    OR name = 'Lisa' AND year = 2014;

But thinking through the logic, you might simplify it like this:

SELECT * FROM baby_names
WHERE (name = 'John' OR name = 'Lisa')
    AND (year = 2012 OR year = 2013 OR year = 2014);

A bit of mathematical simplification:

SELECT * FROM baby_names
WHERE (name = 'John' OR name = 'Lisa')
    AND (year >= 2012 AND year <= 2014);

You may choose to use more SQL keywords as you learn them, though at this point, the difference is pretty much cosmetic:

SELECT * FROM baby_names
WHERE name IN ('John', 'Lisa')
    AND year BETWEEN 2012 AND 2014;

In general, it's best to keep things simple and not overthink them. Use what feels comfortable to you. After a bit of experience writing SQL, the first example will instinctively feel too much work. The last example may seem simple, but me personally, I constantly forget the proper syntax for using BETWEEN – so >= and <= is good enough for me.

In the end, the logic is the logic. Rather than worry about memorizing best practices of syntax and style, make sure you can at least express in plain English what you want to find.