Auto-incrementing IDs in PostgreSQL: IDENTITY, sequences, and what your application is doing wrong

Frank Heikens·
postgresql

Auto-incrementing IDs in PostgreSQL: IDENTITY, sequences, and what your application is doing wrong

A ticket lands in your queue: "can you reset the IDs on the orders table so they start at 1 again? They're at 5,000,234 and it looks unprofessional in support emails."

Two desks over, a junior is debugging an INSERT that succeeded but returned 0 instead of the new row's ID. A few minutes later, someone else asks why their INSERT INTO accounts ... ON CONFLICT DO NOTHING sometimes returns no row, even though the row clearly exists.

These are three different surface symptoms of the same underlying misunderstanding. They keep coming back, in every team, in every project, until the people writing the code internalise what an auto-incrementing primary key actually is.

This post is about flipping that.

The pattern: an ID is a token, not a number

The number 5 in id = 5 looks like it means "the fifth one". It doesn't. To PostgreSQL, the sequence-backed primary key is a generator of opaque tokens — values guaranteed to be unique within a table, never reused, never re-ordered, with no semantic meaning at all. The fact that those tokens happen to be small integers in ascending order is an implementation convenience, not a property your application is allowed to rely on.

Once you internalise that, half the questions in this cluster collapse:

  • "Can you reset the IDs?" — No, and you don't need to. The numbers don't mean anything.
  • "Why are there gaps?" — Because tokens were generated and not used (rolled-back transactions, failed inserts, ON CONFLICT retries). Gaps are not a bug.
  • "Why is MAX(id) returning a different row than I just inserted?" — Because another connection inserted in between, and MAX(id) is asking the wrong question.

The rest of this post is the practical follow-through.

Picking the right type in 2026

PostgreSQL 10 (released 2017) introduced GENERATED ALWAYS AS IDENTITY. That's been the right default for seven years. If you're starting a new table now:

CREATE TABLE orders (
    id  bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    ...
);

Use bigint, not int. Once you've outgrown int you can't widen it without rewriting every page of the table; pick wide and forget about it.

GENERATED ALWAYS blocks INSERT INTO orders (id) VALUES (...) — the application can't accidentally specify an ID. If you really need to (data import, replication, very specific edge cases), use OVERRIDING SYSTEM VALUE per statement, deliberately. GENERATED BY DEFAULT AS IDENTITY allows the override implicitly but quietly erodes the invariant. Prefer ALWAYS.

SERIAL and BIGSERIAL still work, and you'll find them in older schemas. They are pseudo-types: an integer column plus a separate sequence object plus a default expression. The sequence is its own relation with its own ownership and its own privileges — GRANT on the table doesn't extend to the sequence, so a role with INSERT on the table can still get permission denied for sequence ... if nobody granted USAGE. None of that has changed; it's been a papercut since the SERIAL pseudo-type existed, and it's one of the reasons IDENTITY exists. With GENERATED ... AS IDENTITY there is no separate sequence object to grant on; the column manages its own generator internally. Migrate when you can; new tables should not use SERIAL.

UUIDs are the alternative when you genuinely need an ID generated outside the database — distributed systems where multiple writers mint IDs, external systems that publish IDs to clients before write, or any case where you'd rather not reveal table size in a URL. Use uuid (16 bytes, indexable like any other type). For random UUIDs, gen_random_uuid() (built into core since PG 13) is the answer; ignore the legacy uuid_generate_v4() from uuid-ossp. PostgreSQL 18 adds uuidv7() to core, which is preferable when you'd rather not pay the random-insert cost on a B-tree index — UUIDv7 is time-ordered, so new rows land at the right edge of the index instead of all over it.

For most internal tables: IDENTITY. For surface APIs and distributed writes: UUID.

Inserting without naming the column

Among the most-asked questions in this cluster: "how do I insert a row without specifying every column when one of them is auto-generated?" The answer is DEFAULT:

-- Skip the identity column
INSERT INTO orders (customer_id, total) VALUES (42, 19.99);

-- Or use DEFAULT explicitly
INSERT INTO orders (id, customer_id, total) VALUES (DEFAULT, 42, 19.99);

-- Insert a row that's all defaults
INSERT INTO orders DEFAULT VALUES;

-- Multi-row with DEFAULT in some columns
INSERT INTO orders (id, customer_id, total)
VALUES (DEFAULT, 42, 19.99),
       (DEFAULT, 43, 25.00);

DEFAULT VALUES is the special syntax for "every column gets its default". It only works if every column has a default or is nullable.

Getting the new ID back

The most common bug in the application code that surrounds these tables is asking for the new row's ID the wrong way. The right way:

INSERT INTO orders (customer_id, total)
VALUES (42, 19.99)
RETURNING id;

RETURNING is part of the same statement as the INSERT. There is no race, no second round trip, no driver translation — the just-inserted row's columns come back in the result set the same way a SELECT would. Every modern Postgres driver passes this through transparently.

Three things people reach for instead, and why each is wrong:

  • SELECT MAX(id) FROM orders — wrong because between your INSERT and your SELECT, another transaction may have inserted a higher ID. You'll silently get someone else's row.
  • SELECT lastval() or SELECT currval('orders_id_seq') — works, but it's session-state. The value is whatever the last sequence advance in this session produced, in any table. If your ORM or your application code triggered a different INSERT in between (a logging row, a join-table row, anything), lastval() returns that one's ID. Connection poolers add a separate set of problems: depending on the pooling mode, the "session" the sequence function is reading from may not be yours. Verify against your own pooler before relying on lastval() or currval()RETURNING sidesteps the question entirely.
  • MySQL-style mysql_insert_id() / lastInsertId() driver helpers — most pg drivers translate this to RETURNING id under the covers, but only when they recognise the dialect. Half the driver bugs the cluster is asking about come from this layer silently doing the wrong thing.

RETURNING is unambiguous. Use it.

Using the returned ID for a related insert in one shot

A real version of the third-most-asked question: you're inserting an order, and you immediately need the order's ID to insert related rows in order_items. The clumsy way is to do two round trips, holding the new ID in application code. The clean way is one statement:

WITH new_order AS (
    INSERT INTO orders (customer_id, total)
    VALUES (42, 19.99)
    RETURNING id
)
INSERT INTO order_items (order_id, sku, qty)
SELECT new_order.id, x.sku, x.qty
FROM   new_order, (VALUES ('ABC-1', 2), ('XYZ-9', 1)) AS x(sku, qty);

Both inserts run in the same statement, in the same transaction, with no application-side ID juggling. This is the standard PostgreSQL pattern for "use the new row's primary key as a foreign key in the next insert."

Don't reset sequences

Back to the ticket: "reset the IDs to start at 1 again."

The mechanism is SETVAL:

SELECT setval('orders_id_seq', 1, false);

But before you run that against a live database, walk through what happens:

  • Concurrent writers may have already taken a few thousand values ahead of you (via nextval()); their inserts will collide with the freshly-restarted sequence and fail with a duplicate-key error.
  • Foreign keys pointing at orders.id aren't relabeled. If the goal was to renumber existing rows from 5,000,234 down to a smaller range, you'd have to update every referencing row in every related table, in lockstep, with FKs deferred. The change ripples across the whole schema.
  • The "looks unprofessional in support emails" problem doesn't actually go away — within a week of regular traffic, you're back to large numbers.

The only context where SETVAL is reasonable is non-production seeding: resetting a dev or test database to a known state, or realigning a sequence after a restore from pg_dump. In those cases:

-- Snap the sequence to one past the maximum existing id
SELECT setval('orders_id_seq', COALESCE(MAX(id), 0)) FROM orders;

For "the numbers are too big" in production: they're not. Gaps are fine. Five million is fine. Two billion (the int ceiling, where this conversation gets less academic) is when you migrate to bigint — which you should have started with.

"Insert if not exists" — UPSERT, done right

The cluster's concurrent-insert questions all reduce to: I want to either insert a row or get back the existing one's ID, and I want this to be safe under load.

ON CONFLICT is the answer:

INSERT INTO tags (name)
VALUES ('postgresql')
ON CONFLICT (name) DO NOTHING
RETURNING id;

There's a subtle gotcha here that bites the cluster repeatedly: when ON CONFLICT DO NOTHING matches an existing row, the statement returns no rows. To always get an ID back — whether the row was inserted or already existed — use DO UPDATE against itself:

INSERT INTO tags (name)
VALUES ('postgresql')
ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.name
RETURNING id;

The DO UPDATE SET name = EXCLUDED.name is a no-op on the data — it sets the column to the same value — but it forces the row to participate in the RETURNING clause. The result is a single statement that's safe under concurrent inserters, returns the canonical row's ID either way, and doesn't require a separate SELECT round trip.

Bottom line

An ID is a token. Generate it with GENERATED ALWAYS AS IDENTITY, get it back with RETURNING, use it as a foreign key with a CTE, and don't try to renumber it. Most of the questions in this cluster are the same misunderstanding wearing different clothes — and most of them disappear the moment the application code stops treating the integer ID as a meaningful number.

The boring rule that covers ninety percent of cases: write a new table with a bigint GENERATED ALWAYS AS IDENTITY primary key, do your inserts with RETURNING id, and walk away. Your database is not the problem.

Reactions