Inheriting a Postgres database

Frank Heikens·
postgresqlOperations

Inheriting a Postgres database

You inherited the database six months ago when the previous data lead left. Since then, you've been the Postgres-person-by-default — the one people DM about ALTER, GRANT, and how to dump a table. Today's request lands at 12:40pm. The new BI engineer needs SELECT on the analytics schema by 2pm. The quarterly board prep is running on her queries.

You log in and run the obvious thing:

GRANT USAGE ON SCHEMA analytics TO bi_engineer;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO bi_engineer;

You ping her. Five minutes later: "I can read most tables, but conversion_rate returns permission denied." You check. The table was created last week by the ETL pipeline, after your GRANT ran. The grant only covered tables that existed at the moment you ran it. Anything created since gets nothing.

This is the moment most teams discover what ALTER DEFAULT PRIVILEGES is for, usually under time pressure.

What ALTER DEFAULT PRIVILEGES actually does

The natural fix looks like this:

ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
    GRANT SELECT ON TABLES TO bi_engineer;

Read it once and it sounds like a global rule: any table created in the analytics schema gets SELECT for bi_engineer. That's not what it does. The rule is scoped to the role that runs it. By default, it applies to objects created by you, not by anyone else. If the ETL pipeline runs as a different role — and it does — your default-privilege rule does nothing for the tables it creates.

The full form makes this explicit:

ALTER DEFAULT PRIVILEGES FOR ROLE etl_writer IN SCHEMA analytics
    GRANT SELECT ON TABLES TO bi_engineer;

Now, when etl_writer creates a table in the analytics schema, the SELECT privilege is granted automatically. This is the line every team forgets, every team, every time. The consequence is the exact bug the BI engineer just hit.

There's no "any role" wildcard. If you have several writers, you set the rule once per writer. The role-scoping isn't a quirk of the implementation; it's the security model. A privilege that can affect another role's future objects needs to be set by that role, or by a superuser acting on their behalf. It's the same reason a random user can't GRANT SELECT ON your_table TO themselves — and the same logic applied to the future.

Teams forget this not because it's hidden but because the natural phrasing reads otherwise. GRANT ... ON ALL TABLES IN SCHEMA sounds comprehensive. The "all" is present-tense; the future is silent until a new table appears.

12:55pm. You set the rule for the ETL role, re-grant on the existing table, and confirm with \dp:

 Schema    | Name            | Access privileges
-----------+-----------------+-----------------------------
 analytics | conversion_rate | bi_engineer=r/etl_writer

Her query goes through. Board prep continues. Two more pings stacked while you were debugging — both, of course, also urgent.

Then the next request lands

1:10pm. The first of the two: an analyst on another team needs a copy of one production table to test a model locally. Schema and data, just that one table. His sprint demo is at 4. You've done this twice this year. You'll do it again.

pg_dump is the tool, and it takes four shapes you'll use ninety percent of the time. Whole database, custom format — compressed, restorable in parallel, the only format worth using for anything non-trivial:

pg_dump --format=custom --file=mydb.dump mydb

One table:

pg_dump --format=custom --table=public.orders --file=orders.dump mydb

Schema only:

pg_dump --schema-only --file=mydb-schema.sql mydb

Data only, for refreshing a staging table from production:

pg_dump --data-only --table=public.products --file=products.sql mydb

Restoring is symmetric. Custom-format dumps go through pg_restore; plain SQL dumps go through psql:

pg_restore --dbname=mydb --jobs=4 mydb.dump
psql       --dbname=mydb --file=mydb-schema.sql

Two things worth internalising while you're in here. pg_dump is a logical backup: a transaction-consistent snapshot of the schema and the data, written as SQL or as a custom-format archive. It's the right tool for migrations, schema copies, moving data between environments, and exactly this kind of "send me that table" request. It is the wrong tool for disaster recovery on a 2 TB cluster — you'd be replaying the dump row by row through the SQL parser. For recovery, you want physical backups: pg_basebackup, WAL archiving, pgBackRest, or your provider's snapshot facility. Several teams I've watched only discovered this distinction in the post-mortem.

The other thing: the version of the pg_dump binary should match or exceed the server you're dumping. New pg_dump against an older server is supported and routine. The reverse is not.

"Where are the logs?"

1:25pm. The second ping: someone says the application is slow, intermittently, since this morning. You'd normally check the database log. Where's the log?

SHOW data_directory;
SHOW config_file;
SHOW hba_file;
SHOW log_directory;

These are the four paths that come up. The cluster's data directory, postgresql.conf, pg_hba.conf, and where the logs are being written. On a self-hosted Linux server you get real filesystem paths back. On a managed service you get paths inside the provider's container, which you can't reach over ssh or cat. Knowing that the path exists but is unreachable is itself the useful answer — it tells you to stop looking on the database host and start looking in the cloud console.

SHOW is the readable surface. The full table, with sources, units, and whether a restart is required to change them, is in pg_settings:

SELECT name, setting, unit, source, pending_restart
FROM   pg_settings
WHERE  name IN ('shared_buffers', 'work_mem', 'max_connections');

current_setting('name') returns the same value as SHOW but inline in a query. It's what you reach for in scripts and views.

A small numeric question

In the same thread, someone asks: "We're adding a total_amount column to invoices. What type?"

PostgreSQL's answer is short. For currency, monetary amounts, anything where a human reads the value as a decimal and expects the cents to add up — numeric:

amount  numeric(12, 2) NOT NULL    -- up to 9,999,999,999.99

Exact, base-10 arithmetic. Slower than the binary types, more storage, the right answer anyway. It maps to BigDecimal in Java, Decimal in Python, decimal.js in JavaScript.

For counters, IDs, foreign keys — bigint by default. The four extra bytes per row are not your bottleneck, and migrating integer to bigint later means rewriting the table. For temperatures, sensor readings, ratios that aren't financial — double precision. Fast, small, and will cheerfully tell you that 0.1 + 0.2 = 0.30000000000000004, which is correct for IEEE-754 floats and wrong for invoices.

The shortest rule: if a person reads the value as a decimal amount and cares about the cents, it's numeric. Otherwise pick bigint or double precision based on whether you're counting or measuring.

psql is faster than the GUI

psql is the reference client. Every other tool — pgAdmin, DBeaver, your IDE's database panel — wraps the same catalog queries psql runs. Once the meta-commands are in your fingers, psql is faster than any of them.

\l                  list databases
\c mydb             connect to a database
\dn                 list schemas
\dt                 list tables in the current schema
\dt app.*           list tables in schema "app"
\d  orders          describe the orders table (columns, indexes, FKs)
\d+ orders          same, plus storage details and comments
\di                 list indexes
\df                 list functions
\df+ my_func        function definition, including the body
\du                 list roles
\dp orders          show privileges on the orders table
\timing             toggle showing query duration
\x                  toggle expanded output (vertical)
\e                  open the last query in $EDITOR
\i file.sql         execute file.sql in the current session
\copy ...           client-side COPY (works without server filesystem access)
\?                  list every meta-command

\d table is the one you'll run most. Columns with types and nullability, indexes (including which one is the PK), foreign keys, check constraints, triggers, partitioned children — the full shape of the table on one screen. If you only ever learn one meta-command, learn that one.

\copy is worth knowing because people reach for SQL COPY and hit permission errors that sound much worse than they are. COPY FROM '/path/to/file.csv' requires the server to read that path. On managed Postgres it doesn't work; on self-hosted, the file has to live on the database host. \copy does the equivalent client-side: the file is read from your machine and streamed to the server. Use \copy unless you have a specific reason not to.

When the cluster is on RDS

A pattern across the afternoon: every command above worked, but you're noticing the seams of running on RDS. The bootstrap role is called postgres but isn't actually a superuser — it's rds_superuser, a controlled subset. ALTER DATABASE ... OWNER TO postgres fails with must be member of role "postgres", even though that role is the one you're connected as. Some ALTER SYSTEM settings are blocked; cluster-wide tuning goes through the parameter group in the AWS console. CREATE EXTENSION works for the provider's whitelist, a long list but not an unbounded one. SHOW data_directory returns a path you can't ls.

The mental model for managed Postgres: a standards-compliant PostgreSQL server with the platform-administration surface removed. Application-level SQL is unchanged. Anything that touches the operating system or the cluster bootstrap is the provider's job, exposed through their console or API rather than through ALTER SYSTEM and the pg_* shell utilities.

When a command fails on managed Postgres and the error mentions superuser, role membership, or filesystem access, you're not doing it wrong. You're being told that the operation lives on the platform side of the line, and that the platform has a different way to do it.

Bottom line

PostgreSQL is one of the most introspectable databases that exists. Almost everything you might need to know — who owns what, where the files live, what privileges a role has, what the column types actually are — is one psql command or one catalog query away. None of the afternoon's lessons were exotic.

Most of what teams rediscover about Postgres isn't complex. It compounds under pressure. A missing default-privilege rule, a COPY that wanted \copy, a pg_dump version mismatch, a numeric column declared as double precision two years ago — each one costs a few minutes in isolation and an afternoon when they pile up at the same time.

This is the pattern: Operational Sediment. Decisions that were trivial when made, taken in isolation, by people no longer on the team — quietly stacking until the day someone needs the database to do something specific in a hurry.

The afternoon's lessons aren't lessons. They're recall. Knowing the command, the boundary, the cloud's exclusion list — without looking — is the difference between a half-day incident and a ten-minute fix. PostgreSQL doesn't get harder over time. Inherited databases do.

Reactions