Inheriting a Postgres database
Inheriting a Postgres database
You inherited the database six months ago when the previous data lead left. Since then, you've been the Postgres-person-by-default — the one people DM about ALTER, GRANT, and how to dump a table. Today's request lands at 12:40pm. The new BI engineer needs SELECT on the analytics schema by 2pm. The quarterly board prep is running on her queries.
You log in and run the obvious thing:
GRANT USAGE ON SCHEMA analytics TO bi_engineer;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO bi_engineer;
You ping her. Five minutes later: "I can read most tables, but
conversion_rate returns permission denied." You check. The table
was created last week by the ETL pipeline, after your GRANT ran.
The grant only covered tables that existed at the moment you ran
it. Anything created since gets nothing.
This is the moment most teams discover what ALTER DEFAULT PRIVILEGES is for, usually under time pressure.
What ALTER DEFAULT PRIVILEGES actually does
The natural fix looks like this:
ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
GRANT SELECT ON TABLES TO bi_engineer;
Read it once and it sounds like a global rule: any table created
in the analytics schema gets SELECT for bi_engineer. That's not
what it does. The rule is scoped to the role that runs it. By
default, it applies to objects created by you, not by anyone
else. If the ETL pipeline runs as a different role — and it does —
your default-privilege rule does nothing for the tables it
creates.
The full form makes this explicit:
ALTER DEFAULT PRIVILEGES FOR ROLE etl_writer IN SCHEMA analytics
GRANT SELECT ON TABLES TO bi_engineer;
Now, when etl_writer creates a table in the analytics schema,
the SELECT privilege is granted automatically. This is the line
every team forgets, every team, every time. The consequence is
the exact bug the BI engineer just hit.
There's no "any role" wildcard. If you have several writers, you
set the rule once per writer. The role-scoping isn't a quirk of
the implementation; it's the security model. A privilege that can
affect another role's future objects needs to be set by that role,
or by a superuser acting on their behalf. It's the same reason a
random user can't GRANT SELECT ON your_table TO themselves — and
the same logic applied to the future.
Teams forget this not because it's hidden but because the natural
phrasing reads otherwise. GRANT ... ON ALL TABLES IN SCHEMA
sounds comprehensive. The "all" is present-tense; the future is
silent until a new table appears.
12:55pm. You set the rule for the ETL role, re-grant on the existing
table, and confirm with \dp:
Schema | Name | Access privileges
-----------+-----------------+-----------------------------
analytics | conversion_rate | bi_engineer=r/etl_writer
Her query goes through. Board prep continues. Two more pings stacked while you were debugging — both, of course, also urgent.
Then the next request lands
1:10pm. The first of the two: an analyst on another team needs a copy of one production table to test a model locally. Schema and data, just that one table. His sprint demo is at 4. You've done this twice this year. You'll do it again.
pg_dump is the tool, and it takes four shapes you'll use ninety
percent of the time. Whole database, custom format — compressed,
restorable in parallel, the only format worth using for anything
non-trivial:
pg_dump --format=custom --file=mydb.dump mydb
One table:
pg_dump --format=custom --table=public.orders --file=orders.dump mydb
Schema only:
pg_dump --schema-only --file=mydb-schema.sql mydb
Data only, for refreshing a staging table from production:
pg_dump --data-only --table=public.products --file=products.sql mydb
Restoring is symmetric. Custom-format dumps go through
pg_restore; plain SQL dumps go through psql:
pg_restore --dbname=mydb --jobs=4 mydb.dump
psql --dbname=mydb --file=mydb-schema.sql
Two things worth internalising while you're in here. pg_dump is
a logical backup: a transaction-consistent snapshot of the
schema and the data, written as SQL or as a custom-format archive.
It's the right tool for migrations, schema copies, moving data
between environments, and exactly this kind of "send me that
table" request. It is the wrong tool for disaster recovery on a
2 TB cluster — you'd be replaying the dump row by row through the
SQL parser. For recovery, you want physical backups:
pg_basebackup, WAL archiving, pgBackRest, or your provider's
snapshot facility. Several teams I've watched only discovered this
distinction in the post-mortem.
The other thing: the version of the pg_dump binary should
match or exceed the server you're dumping. New pg_dump against
an older server is supported and routine. The reverse is not.
"Where are the logs?"
1:25pm. The second ping: someone says the application is slow, intermittently, since this morning. You'd normally check the database log. Where's the log?
SHOW data_directory;
SHOW config_file;
SHOW hba_file;
SHOW log_directory;
These are the four paths that come up. The cluster's data
directory, postgresql.conf, pg_hba.conf, and where the logs
are being written. On a self-hosted Linux server you get real
filesystem paths back. On a managed service you get paths inside
the provider's container, which you can't reach over ssh or
cat. Knowing that the path exists but is unreachable is itself
the useful answer — it tells you to stop looking on the database
host and start looking in the cloud console.
SHOW is the readable surface. The full table, with sources,
units, and whether a restart is required to change them, is in
pg_settings:
SELECT name, setting, unit, source, pending_restart
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'max_connections');
current_setting('name') returns the same value as SHOW but
inline in a query. It's what you reach for in scripts and views.
A small numeric question
In the same thread, someone asks: "We're adding a total_amount
column to invoices. What type?"
PostgreSQL's answer is short. For currency, monetary amounts,
anything where a human reads the value as a decimal and expects
the cents to add up — numeric:
amount numeric(12, 2) NOT NULL -- up to 9,999,999,999.99
Exact, base-10 arithmetic. Slower than the binary types, more
storage, the right answer anyway. It maps to BigDecimal in Java,
Decimal in Python, decimal.js in JavaScript.
For counters, IDs, foreign keys — bigint by default. The four
extra bytes per row are not your bottleneck, and migrating
integer to bigint later means rewriting the table. For
temperatures, sensor readings, ratios that aren't financial —
double precision. Fast, small, and will cheerfully tell you that
0.1 + 0.2 = 0.30000000000000004, which is correct for IEEE-754
floats and wrong for invoices.
The shortest rule: if a person reads the value as a decimal amount
and cares about the cents, it's numeric. Otherwise pick bigint
or double precision based on whether you're counting or
measuring.
psql is faster than the GUI
psql is the reference client. Every other tool — pgAdmin,
DBeaver, your IDE's database panel — wraps the same catalog
queries psql runs. Once the meta-commands are in your fingers,
psql is faster than any of them.
\l list databases
\c mydb connect to a database
\dn list schemas
\dt list tables in the current schema
\dt app.* list tables in schema "app"
\d orders describe the orders table (columns, indexes, FKs)
\d+ orders same, plus storage details and comments
\di list indexes
\df list functions
\df+ my_func function definition, including the body
\du list roles
\dp orders show privileges on the orders table
\timing toggle showing query duration
\x toggle expanded output (vertical)
\e open the last query in $EDITOR
\i file.sql execute file.sql in the current session
\copy ... client-side COPY (works without server filesystem access)
\? list every meta-command
\d table is the one you'll run most. Columns with types and
nullability, indexes (including which one is the PK), foreign
keys, check constraints, triggers, partitioned children — the
full shape of the table on one screen. If you only ever learn one
meta-command, learn that one.
\copy is worth knowing because people reach for SQL COPY and
hit permission errors that sound much worse than they are. COPY FROM '/path/to/file.csv' requires the server to read that path.
On managed Postgres it doesn't work; on self-hosted, the file has
to live on the database host. \copy does the equivalent
client-side: the file is read from your machine and streamed to
the server. Use \copy unless you have a specific reason not to.
When the cluster is on RDS
A pattern across the afternoon: every command above worked, but
you're noticing the seams of running on RDS. The bootstrap role is
called postgres but isn't actually a superuser — it's
rds_superuser, a controlled subset. ALTER DATABASE ... OWNER TO postgres fails with must be member of role "postgres", even
though that role is the one you're connected as. Some ALTER SYSTEM settings are blocked; cluster-wide tuning goes through the
parameter group in the AWS console. CREATE EXTENSION works for
the provider's whitelist, a long list but not an unbounded one.
SHOW data_directory returns a path you can't ls.
The mental model for managed Postgres: a standards-compliant
PostgreSQL server with the platform-administration surface
removed. Application-level SQL is unchanged. Anything that touches
the operating system or the cluster bootstrap is the provider's
job, exposed through their console or API rather than through
ALTER SYSTEM and the pg_* shell utilities.
When a command fails on managed Postgres and the error mentions superuser, role membership, or filesystem access, you're not doing it wrong. You're being told that the operation lives on the platform side of the line, and that the platform has a different way to do it.
Bottom line
PostgreSQL is one of the most introspectable databases that
exists. Almost everything you might need to know — who owns what,
where the files live, what privileges a role has, what the column
types actually are — is one psql command or one catalog query
away. None of the afternoon's lessons were exotic.
Most of what teams rediscover about Postgres isn't complex. It
compounds under pressure. A missing default-privilege rule, a
COPY that wanted \copy, a pg_dump version mismatch, a
numeric column declared as double precision two years ago —
each one costs a few minutes in isolation and an afternoon when
they pile up at the same time.
This is the pattern: Operational Sediment. Decisions that were trivial when made, taken in isolation, by people no longer on the team — quietly stacking until the day someone needs the database to do something specific in a hurry.
The afternoon's lessons aren't lessons. They're recall. Knowing the command, the boundary, the cloud's exclusion list — without looking — is the difference between a half-day incident and a ten-minute fix. PostgreSQL doesn't get harder over time. Inherited databases do.