The Evolution of Primary Keys in PostgreSQL: From Serial to Identity

6 min readJan 7, 2025

Welcome back. This is the following post on Mastering Postgres series. If you skipped earlier posts, I recommend checking them out where we explored topics, such as Check Constraint, Custom Data Types, Enums, etc. Today, we will discuss Serial and Identity in PostgreSQL.

The Evolution of Primary Keys in PostgreSQL: From Serial to Identity

In PostgreSQL, the Serial data type has long been the go-to option for generating auto-incrementing values for primary keys. It’s simple, effective, and widely used in many projects. However, as of PostgreSQL 10, this approach has been superseded by a more robust and flexible method: Identity columns.

Surprised 😅 ? So was I! For years, I also relied on Serial as the default choice for primary keys, but the PostgreSQL team has made improvements that warrant a shift in how we handle auto-incrementing keys.

In this post, we’ll take a deeper dive into:

Serial and its evolution
Identity columns: The new recommended way to handle auto-incrementing values

Bonus Section:

Generated Columns: A powerful way to create columns derived from other data in your table.

By the end of this post, you’ll not only understand why we should consider moving away from Serial, but also how to implement safer and more efficient primary key strategies and explore the power of Generated Columns.

Serial Type: The Old Standard

The Serial type has been a PostgreSQL stable for auto-incrementing values, especially when used as a primary key. At its core, Serial isn’t actually a data type — it’s an alias for an INTEGER column with a default value set by a sequence. Let’s break this down:

Common Use Case for `SERIAL`:

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name TEXT
);

When you declare a column with SERIAL, PostgreSQL automatically creates a sequence to manage the auto-incrementing behavior behind the scenes. If we unwrap this, it would look something like this:

CREATE SEQUENCE users_id_seq AS INTEGER;

CREATE TABLE users (
    id INTEGER NOT NULL DEFAULT nextval('users_id_seq'),
    name TEXT
);ALTER SEQUENCE users_id_seq OWNED BY users.id;

Auto-increment is provided by sequence that is created automatically when we declare Serial or as in this case using users_id_seq.

Sequence is created by the table and it exists outside the scope of the table, however that sequence is tied to the column.

Hovewer, keeping using SERIALas PRIMARY KEY is still fine and acceptable though a better approach exists.

One key moment to note about Serial being used as Primary Key — to use BigSerial instead of Serial. Under the hood, BigInteger will be used to store the value and it creates enough space in order to avoid running out of Primary Key space in the future in case of massive success.

If too many rows inserted, the space is ran out of and it will create much trouble.

The New Standard: Identity Columns

Starting with PostgreSQL 10, Identity columns were introduced as a more reliable and flexible alternative to Serial. Identity columns behave similarly to Serial, but with better guarantees about sequence management and more control over how the values are generated.

Here’s how to define an auto-incrementing Primary Key using Identity:

Using Identity for Primary Keys

CREATE TABLE users
(
    id   BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT
);

With this approach:

The id column uses a BIGINT to ensure that the primary key values can grow without running into issues like overflow.
The GENERATED ALWAYS clause ensures that values are automatically generated, and you can’t insert values manually into this column without special handling.

Inserting values

INSERT INTO users(name)
VALUES ('Bruce'),
       ('Wayne'),
       ('Phil');

SELECT * FROM users;


-- OUTPUT
+--+-----+
|id|name |
+--+-----+
|1 |Bruce|
|2 |Wayne|
|3 |Phil |
+--+-----+

Attempting to Insert an Explicit ID

If you try to manually insert a value into the id column (which is generated automatically), PostgreSQL will block it:

INSERT INTO users(id, name)
VALUES (4, 'Baggio');

-- ERROR: cannot insert a non-DEFAULT value into column "id" 
-- Detail: Column "id" is an identity column defined as GENERATED ALWAYS. 
-- Hint: Use OVERRIDING SYSTEM VALUE to override.

This is a protective measure to keep the sequence in sync with the table data. If you absolutely need to insert a specific value, PostgreSQL provides a way to override the default sequence behavior, but it’s not recommended for regular use.

INSERT INTO users(id, name)
    OVERRIDING SYSTEM VALUE
VALUES (4, 'Baggio');

Restoring Sync After Manual Insertion

If you insert a value manually and break the sequence, PostgreSQL won’t automatically fix it for you. You’ll need to manually reset the sequence so it continues from the correct number. This can be done like so:

SELECT setval('users_id_seq', (SELECT MAX(id) FROM users));

-- Resetting the sequence 'users_id_seq' to the current maximum value in the 'id' column
-- This ensures that the next generated value for 'id' will continue from the highest existing value.

This ensures the sequence is properly aligned with the current state of the table.

Alternative: `BY DEFAULT` Identity

If you want the flexibility to manually insert values (like the Serial approach) but still use identity for auto-increment when no value is specified, you can use GENERATED BY DEFAULT:

CREATE TABLE users (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    name TEXT
);

This allows you to insert values explicitly while also letting PostgreSQL handle auto-incrementation when no value is provided. However, as mentioned earlier, this can lead to sequence misalignment, so it’s generally better to avoid this unless absolutely necessary.

In summary, as you can see IDENTITY ensures more safety and strictness on inserting rows and prevents from accidentally inserting a row into table
unless you explicitly specify OVERRIDING SYSTEM VALUE. That’s definitely what you would not do accidentally.

Another interesting topic to discuss — Generated Columns

Generated Columns: Dynamic Data on the Fly

Generated columns are a feature that allows you to create columns whose values are derived from expressions or functions based on other columns in the same row. These columns are read-only, meaning you can’t insert or update values directly in them.
Postgres supports Stored Generated Columns, meaning, the column is written to disk.

For example, let’s say we have a persons table and we want to store both height in centimeters and height in inches:

Example: Generated Column for Height Conversion

CREATE TABLE persons (
    height_cm INTEGER,
    height_inch INTEGER GENERATED ALWAYS AS (height_cm / 2.54) STORED
);

Here, the height_inch column is automatically generated based on the height_cm column value. You can't manually insert data into the height_inch column; it’s calculated dynamically when you insert or update a record.

INSERT INTO persons(height_cm)
VALUES (170), (179);

SELECT *
FROM persons;

-- OUTPUT

+---------+-----------+
|height_cm|height_inch|
+---------+-----------+
|170      |67         |
|179      |70         |
+---------+-----------+

If you try to insert height_inch manually, you will get error.

INSERT INTO persons(height_cm, height_inch)
values (170, 67);

-- ERROR: cannot insert a non-DEFAULT value into column "height_inch" 
-- Detail: Column "height_inch" is a generated column.

Example: Extracting Data from Strings

Another common use case for generated columns is extracting or manipulating data from other columns. For example, in an accounts table, you might want to extract the email domain into a separate column:

CREATE TABLE accounts (
    email TEXT,
    email_domain TEXT GENERATED ALWAYS AS (split_part(email, '@', 2)) STORED
);

INSERT INTO accounts(email)
VALUES ('bva@yandex.ru'), ('baggio@gmail.com');
SELECT * FROM accounts;

Output:

| email               | email_domain |
|---------------------|--------------|
| bva@yandex.ru       | yandex.ru    |
| baggio@gmail.com    | gmail.com    |

This is a great way to derive values without needing to store redundant data or constantly recalculate values in your application code.

Conclusion: Evolving with PostgreSQL

While Serial was a useful tool in earlier versions of PostgreSQL, the introduction of Identity columns provides a more robust and flexible solution for generating auto-incrementing values. Whether you choose Generated Always or Generated By Default, Identity columns ensure better synchronization and consistency with the database.

Additionally, Generated Columns allow you to create derived data without extra application logic, making your database smarter and more efficient.

By embracing these newer features, you’ll ensure that your PostgreSQL schemas are future-proof, cleaner, and more reliable.

Final Thoughts:

Serial still works but is now considered a legacy approach.
Identity offers better control and avoids some of the pitfalls of Serial.
Generated Columns provide a powerful way to transform data on the fly without storing redundant information.

PostgreSQL Domain Types and Enums: Ensuring Data Integrity

If you find this post useful, like and subscribe to stay tuned for future posts. Happy Coding!

The Evolution of Primary Keys in PostgreSQL: From Serial to Identity

The Evolution of Primary Keys in PostgreSQL: From Serial to Identity

Serial Type: The Old Standard

Common Use Case for `SERIAL`:

The New Standard: Identity Columns

Using Identity for Primary Keys

Attempting to Insert an Explicit ID

Restoring Sync After Manual Insertion

Alternative: `BY DEFAULT` Identity

Generated Columns: Dynamic Data on the Fly

Example: Generated Column for Height Conversion

Example: Extracting Data from Strings

Conclusion: Evolving with PostgreSQL

Final Thoughts:

Written by Java Jedi

No responses yet

The Evolution of Primary Keys in PostgreSQL: From Serial to Identity

The Evolution of Primary Keys in PostgreSQL: From Serial to Identity

Serial Type: The Old Standard

Common Use Case for SERIAL:

The New Standard: Identity Columns

Using Identity for Primary Keys

Attempting to Insert an Explicit ID

Restoring Sync After Manual Insertion

Alternative: BY DEFAULT Identity

Generated Columns: Dynamic Data on the Fly

Example: Generated Column for Height Conversion

Example: Extracting Data from Strings

Conclusion: Evolving with PostgreSQL

Final Thoughts:

Written by Java Jedi

No responses yet

Common Use Case for `SERIAL`:

Alternative: `BY DEFAULT` Identity