The Takeaways

By the end of this tutorial, make sure you can answer these questions:

  • What “SQL” stands for.

  • The difference between SQLite and the DB Browser for SQLite

  • The difference between a table and a database in the context of SQLite.

  • How to open a database using the DB Browser for SQLite (or the client of your choice).

  • How to execute a SQLite query

Get the data

For this lesson, download the following file, which is a SQLite database built from the U.S. Social Security Administration file of popular baby names:

Unzip it, and open the sqlite file using the SQLite client of your choice (e.g. DB Browser for SQLite)

Table of contents

SQL is difficult. For this course, it's going to be the single most difficult subject for most people, because it is a programming language. And even professional programmers find it challenging.

However, we're learning it for two important reasons:

  1. To efficiently load and process datasets of 1,000,000+ rows. With Google Sheets, we run into performance issues with even just 10,000 rows.
  2. To have a language for describing exactly what we want to do with the data.

That second point will be the difficult concept. So this short walkthrough is intended to introduce the first point.

Installing the SQLite Database browser

Stanford Journalism Lab installation

If you are a Stanford Journalism student, the Macs in the McClatchy lab already have the SQLite Browser installed. But you should install it on your own computers, too, for your own convenience.

You can download SQLite Database Browser from its homepage: http://sqlitebrowser.org/

How to open a database file

image

How to browse the data

image

How to write a query

image

SELECT IncidntNum FROM sfpd_incidents LIMIT 10
SELECT IncidntNum 
  FROM sfpd_incidents 
  LIMIT 10
SELECT IncidntNum, Category 
  FROM sfpd_incidents 
  LIMIT 10

Cause an error by using the wrong field name ("Description"):

SELECT IncidntNum, Category, Description
  FROM sfpd_incidents 
  LIMIT 10
SELECT IncidntNum, Category, Descript 
  FROM sfpd_incidents 
  LIMIT 10
SELECT Category, Descript, Resolution, Date, Time
  FROM sfpd_incidents 
  WHERE Time > "18:00"
  LIMIT 10

WHERE

SELECT Category, Descript, Resolution, Date, Time
  FROM sfpd_incidents 
  WHERE Time = 18:00
  LIMIT 10

The error:

    near ":00": syntax error: 

Use quotation marks for non-numerical values

SELECT Category, Descript, Resolution, Date, Time
  FROM sfpd_incidents 
  WHERE Time = "18:00"
  LIMIT 10
SELECT Category, Descript, Resolution, Date, Time
  FROM sfpd_incidents 
  WHERE Time > "18:00"
  LIMIT 10

Change the LIMIT

SELECT Category, Descript, Resolution, Date, Time
  FROM sfpd_incidents 
  WHERE Time > "18:00"
  LIMIT 100

Removing the LIMIT

SELECT Category, Descript, Resolution, Date, Time
  FROM sfpd_incidents 
  WHERE Time > "18:00"

References

This is a guide for how to get started on two other SQL clients (i.e. not SQLite Browser). The interface and exact steps are different, perhaps, but the concept is the same.

How to select data from a SQL database. Again, different SQL clients, but same concepts. SQL syntax (for the most part) will also be the same.

Just in case you want to download your own dataset