Re: PATCH: Exclude unlogged tables from base backups

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: Exclude unlogged tables from base backups
Date: 2017-12-13 01:48:05
Message-ID: 20171213014804.GH4628@tamriel.snowman.net
Lists: pgsql-hackers

Andres,

* Andres Freund (andres(at)anarazel(dot)de) wrote:
> On 2017-12-12 18:04:44 -0500, David Steele wrote:
> > If the forks are written out of order (i.e. main before init), which is
> > definitely possible, then I think worst case is some files will be backed up
> > that don't need to be. The main fork is unlikely to be very large at that
> > point so it doesn't seem like a big deal.
> >
> > I don't see this as any different than what happens during recovery. The
> > unlogged forks are cleaned / re-inited before replay starts which is the
> > same thing we are doing here.
>
> It's quite different - in the recovery case there's no other write
> activity going on. But on a normally running cluster the persistence of
> existing tables can get changed, and oids can get recycled. What
> guarantees that between the time you checked for the init fork the table
> hasn't been dropped, the oid reused and now a permanent relation is in
> its place?

We *are* actually talking about the recovery case here: this is a
backup, and WAL replay will happen after the pg_basebackup is done, the
backup has been restored somewhere, and PG has been started up again.

If the persistence is changed then the table will be written into the
WAL, no? All of the WAL generated during a backup (which is what we're
talking about here) has to be replayed after the restore is done and
before the database is considered consistent, so none of this matters,
as far as I can see, because the drop table or alter table set logged or
anything else will be in the WAL that ends up getting replayed.

If that's not correct, then isn't there a live issue here with how
backups are happening today with unlogged tables and online backups?

I don't think there is, because, as David points out, the unlogged
tables are cleaned up first and then WAL replay happens during recovery,
so the init fork will cause the relation to be overwritten, but then
later the logged 'drop table' and subsequent re-use of the relfilenode
to create a new table (or persistence change) will all be in the WAL and
will be replayed over top and will take care of this.
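
To illustrate the ordering argument, here is a toy model in Python. It is
only a sketch and not PostgreSQL code: the file names, the WAL record
shapes, and the helper functions (take_backup, reset_unlogged, replay) are
all made up for illustration. It plays out the race Andres describes (table
dropped and relfilenode reused between the init-fork check and the copy)
and then runs the restore steps in the order described above.

```python
# Toy model only: file names and WAL records are simplified stand-ins,
# not PostgreSQL's real on-disk or WAL formats.

def take_backup(files_at_check, files_at_copy):
    """Copy files, skipping the main fork of anything that had an init
    fork when the directory was first scanned (the racy check)."""
    suffix = "_init"
    unlogged = {n[:-len(suffix)] for n in files_at_check if n.endswith(suffix)}
    return {n: c for n, c in files_at_copy.items() if n not in unlogged}

def reset_unlogged(files):
    """Start of recovery: every init fork overwrites its main fork."""
    suffix = "_init"
    for name in [n for n in files if n.endswith(suffix)]:
        files[name[:-len(suffix)]] = files[name]

def replay(files, wal):
    """Apply WAL records in order on top of the restored files."""
    for op, name, content in wal:
        if op == "unlink":
            files.pop(name, None)
        else:  # "write"
            files[name] = content

# The directory scan sees unlogged relation 16384 (main + init fork).
at_check = {"16384": "unlogged rows", "16384_init": "empty"}
# Before the copy, the table is dropped and 16384 is reused by a
# permanent table.
at_copy = {"16384": "permanent rows"}
# Everything that happened during the backup is in the WAL.
wal = [
    ("unlink", "16384", None),
    ("unlink", "16384_init", None),
    ("write", "16384", "permanent rows"),
]

restored = take_backup(at_check, at_copy)  # main fork wrongly skipped
reset_unlogged(restored)                   # cleanup happens first
replay(restored, wal)                      # then WAL is replayed over top
print(restored)
```

In this model the backup does skip the permanent relation's main fork, but
the replayed WAL recreates it, so the restored state still ends up with the
permanent table's contents.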

Thanks!

Stephen
