top of page

The Basics

This chapter discusses the basics of SQL and relational data.

The Basics

SQL, the Structured Query Language, is the most popular language for creating, manipulating, and analyzing relational data.

A Relational Database is essentially a collection of tables, not dissimilar to an Excel spreadsheet. The structure of each table is defined by a list of columns, each associated with a name and a data type (e.g., integer values or text). Tables contain rows that must comply with the table structure. For instance, each row must specify values for each of the table columns and the types of those values must comply with the column data types.

For instance, consider the following table (named Students):

Name

GPA

Start

John

4

2022

Isabel

3.5

2020

Marc

3

2020

This table has three columns, named Name, GPA, and Start as shown in the header row, and contains three data rows (meaning three students). We would probably want to associate the Name column to the data type Text, the GPA column to the data type Numeric, and the Start year to type Integer. SQL enables you to define data types for each column of a table, as well as other constraints, to ensure that users can only insert data that complies with those constraints. In later chapters, we will discuss the types of available constraints in more detail.

A simple operation we might want to perform on that data is a lookup. Given the name of a specific student, we want to find out the student's GPA. On the other hand, perhaps we want to get information not about a single student but about an entire class (e.g., all students who started in 2020). Instead of retrieving information about each single student, it could be also interesting to retrieve aggregate information, calculating the average GPA for all students or all students within a certain class. And, of course, in a real database we typically have not one but multiple tables. That makes it interesting to explore ways to combine data from multiple tables. SQL makes it easy to perform those and many more complex types of data analysis by writing SQL queries.

In the following chapters, we will first discuss the SQL Data Definition Language, commands that allow us to define table structures and constraints. We will see how to load data into a previously created database and update or delete it. Next, we will discuss simple analysis operations, including data filtering and arithmetic expressions. Having focused on single tables until this point, we will expand our scope to join operations that combine data from multiple tables. Afterward, we discuss data aggregation, either over single tables or over data partitions, as well as data sorting. Finally, we will discuss sub-queries, a technique that enables you to nest SQL queries inside of other SQL queries to enable more complex analysis.

SQL is supported by a variety of database management systems, including the likes of PostgreSQL, SQLite, MySQL, DuckDB, and many others. To get fully proficient with SQL, install one of those systems and try out your own SQL queries while working through the following chapters.

© 2025 by Immanuel Trummer. All rights reserved.

bottom of page