Auto-incrementing IDs in PostgreSQL: IDENTITY, sequences, and what your application is doing wrong
Auto-incrementing IDs in PostgreSQL: IDENTITY, sequences, and what your application is doing wrong
A ticket lands in your queue: "can you reset the IDs on the orders table so they start at 1 again? They're at 5,000,234 and it looks unprofessional in support emails."
Two desks over, a junior is debugging an INSERT that succeeded but
returned 0 instead of the new row's ID. A few minutes later, someone
else asks why their INSERT INTO accounts ... ON CONFLICT DO NOTHING
sometimes returns no row, even though the row clearly exists.
These are three different surface symptoms of the same underlying misunderstanding. They keep coming back, in every team, in every project, until the people writing the code internalise what an auto-incrementing primary key actually is.
This post is about flipping that.
The pattern: an ID is a token, not a number
The number 5 in id = 5 looks like it means "the fifth one". It
doesn't. To PostgreSQL, the sequence-backed primary key is a
generator of opaque tokens — values guaranteed to be unique within
a table, never reused, never re-ordered, with no semantic meaning at
all. The fact that those tokens happen to be small integers in
ascending order is an implementation convenience, not a property
your application is allowed to rely on.
Once you internalise that, half the questions in this cluster collapse:
- "Can you reset the IDs?" — No, and you don't need to. The numbers don't mean anything.
- "Why are there gaps?" — Because tokens were generated and not
used (rolled-back transactions, failed inserts,
ON CONFLICTretries). Gaps are not a bug. - "Why is
MAX(id)returning a different row than I just inserted?" — Because another connection inserted in between, andMAX(id)is asking the wrong question.
The rest of this post is the practical follow-through.
Picking the right type in 2026
PostgreSQL 10 (released 2017) introduced
GENERATED ALWAYS AS IDENTITY. That's been the right default for
seven years. If you're starting a new table now:
CREATE TABLE orders (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
...
);
Use bigint, not int. Once you've outgrown int you can't widen
it without rewriting every page of the table; pick wide and forget
about it.
GENERATED ALWAYS blocks INSERT INTO orders (id) VALUES (...) —
the application can't accidentally specify an ID. If you really need
to (data import, replication, very specific edge cases), use
OVERRIDING SYSTEM VALUE per statement, deliberately.
GENERATED BY DEFAULT AS IDENTITY allows the override implicitly
but quietly erodes the invariant. Prefer ALWAYS.
SERIAL and BIGSERIAL still work, and you'll find them in older
schemas. They are pseudo-types: an integer column plus a separate
sequence object plus a default expression. The sequence is its own
relation with its own ownership and its own privileges — GRANT on
the table doesn't extend to the sequence, so a role with INSERT
on the table can still get permission denied for sequence ... if
nobody granted USAGE. None of that has changed; it's been a
papercut since the SERIAL pseudo-type existed, and it's one of the
reasons IDENTITY exists. With GENERATED ... AS IDENTITY there is
no separate sequence object to grant on; the column manages its own
generator internally. Migrate when you can; new tables should not
use SERIAL.
UUIDs are the alternative when you genuinely need an ID generated
outside the database — distributed systems where multiple writers
mint IDs, external systems that publish IDs to clients before
write, or any case where you'd rather not reveal table size in a
URL. Use uuid (16 bytes, indexable like any other type). For
random UUIDs, gen_random_uuid() (built into core since PG 13) is
the answer; ignore the legacy uuid_generate_v4() from
uuid-ossp. PostgreSQL 18 adds uuidv7() to core, which is
preferable when you'd rather not pay the random-insert cost on a
B-tree index — UUIDv7 is time-ordered, so new rows land at the
right edge of the index instead of all over it.
For most internal tables: IDENTITY. For surface APIs and distributed writes: UUID.
Inserting without naming the column
Among the most-asked questions in this cluster: "how do I insert a
row without specifying every column when one of them is
auto-generated?" The answer is DEFAULT:
-- Skip the identity column
INSERT INTO orders (customer_id, total) VALUES (42, 19.99);
-- Or use DEFAULT explicitly
INSERT INTO orders (id, customer_id, total) VALUES (DEFAULT, 42, 19.99);
-- Insert a row that's all defaults
INSERT INTO orders DEFAULT VALUES;
-- Multi-row with DEFAULT in some columns
INSERT INTO orders (id, customer_id, total)
VALUES (DEFAULT, 42, 19.99),
(DEFAULT, 43, 25.00);
DEFAULT VALUES is the special syntax for "every column gets its
default". It only works if every column has a default or is
nullable.
Getting the new ID back
The most common bug in the application code that surrounds these tables is asking for the new row's ID the wrong way. The right way:
INSERT INTO orders (customer_id, total)
VALUES (42, 19.99)
RETURNING id;
RETURNING is part of the same statement as the INSERT. There is
no race, no second round trip, no driver translation — the
just-inserted row's columns come back in the result set the same way
a SELECT would. Every modern Postgres driver passes this through
transparently.
Three things people reach for instead, and why each is wrong:
SELECT MAX(id) FROM orders— wrong because between yourINSERTand yourSELECT, another transaction may have inserted a higher ID. You'll silently get someone else's row.SELECT lastval()orSELECT currval('orders_id_seq')— works, but it's session-state. The value is whatever the last sequence advance in this session produced, in any table. If your ORM or your application code triggered a different INSERT in between (a logging row, a join-table row, anything),lastval()returns that one's ID. Connection poolers add a separate set of problems: depending on the pooling mode, the "session" the sequence function is reading from may not be yours. Verify against your own pooler before relying onlastval()orcurrval()—RETURNINGsidesteps the question entirely.- MySQL-style
mysql_insert_id()/lastInsertId()driver helpers — most pg drivers translate this toRETURNING idunder the covers, but only when they recognise the dialect. Half the driver bugs the cluster is asking about come from this layer silently doing the wrong thing.
RETURNING is unambiguous. Use it.
Using the returned ID for a related insert in one shot
A real version of the third-most-asked question: you're inserting
an order, and you immediately need the order's ID to insert
related rows in order_items. The clumsy way is to do two round
trips, holding the new ID in application code. The clean way is
one statement:
WITH new_order AS (
INSERT INTO orders (customer_id, total)
VALUES (42, 19.99)
RETURNING id
)
INSERT INTO order_items (order_id, sku, qty)
SELECT new_order.id, x.sku, x.qty
FROM new_order, (VALUES ('ABC-1', 2), ('XYZ-9', 1)) AS x(sku, qty);
Both inserts run in the same statement, in the same transaction, with no application-side ID juggling. This is the standard PostgreSQL pattern for "use the new row's primary key as a foreign key in the next insert."
Don't reset sequences
Back to the ticket: "reset the IDs to start at 1 again."
The mechanism is SETVAL:
SELECT setval('orders_id_seq', 1, false);
But before you run that against a live database, walk through what happens:
- Concurrent writers may have already taken a few thousand values
ahead of you (via
nextval()); their inserts will collide with the freshly-restarted sequence and fail with a duplicate-key error. - Foreign keys pointing at
orders.idaren't relabeled. If the goal was to renumber existing rows from 5,000,234 down to a smaller range, you'd have to update every referencing row in every related table, in lockstep, with FKs deferred. The change ripples across the whole schema. - The "looks unprofessional in support emails" problem doesn't actually go away — within a week of regular traffic, you're back to large numbers.
The only context where SETVAL is reasonable is non-production
seeding: resetting a dev or test database to a known state, or
realigning a sequence after a restore from pg_dump. In those
cases:
-- Snap the sequence to one past the maximum existing id
SELECT setval('orders_id_seq', COALESCE(MAX(id), 0)) FROM orders;
For "the numbers are too big" in production: they're not. Gaps are
fine. Five million is fine. Two billion (the int ceiling, where
this conversation gets less academic) is when you migrate to
bigint — which you should have started with.
"Insert if not exists" — UPSERT, done right
The cluster's concurrent-insert questions all reduce to: I want to either insert a row or get back the existing one's ID, and I want this to be safe under load.
ON CONFLICT is the answer:
INSERT INTO tags (name)
VALUES ('postgresql')
ON CONFLICT (name) DO NOTHING
RETURNING id;
There's a subtle gotcha here that bites the cluster repeatedly:
when ON CONFLICT DO NOTHING matches an existing row, the
statement returns no rows. To always get an ID back — whether the
row was inserted or already existed — use DO UPDATE against
itself:
INSERT INTO tags (name)
VALUES ('postgresql')
ON CONFLICT (name) DO UPDATE SET name = EXCLUDED.name
RETURNING id;
The DO UPDATE SET name = EXCLUDED.name is a no-op on the data —
it sets the column to the same value — but it forces the row to
participate in the RETURNING clause. The result is a single
statement that's safe under concurrent inserters, returns the
canonical row's ID either way, and doesn't require a separate
SELECT round trip.
Bottom line
An ID is a token. Generate it with GENERATED ALWAYS AS IDENTITY,
get it back with RETURNING, use it as a foreign key with a CTE,
and don't try to renumber it. Most of the questions in this cluster
are the same misunderstanding wearing different clothes — and most
of them disappear the moment the application code stops treating
the integer ID as a meaningful number.
The boring rule that covers ninety percent of cases: write a new
table with a bigint GENERATED ALWAYS AS IDENTITY primary key, do
your inserts with RETURNING id, and walk away. Your database is
not the problem.