Re: Catalog/Metadata consistency during changeset extraction from wal

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Catalog/Metadata consistency during changeset extraction from wal
Date: 2012-06-21 15:56:11
Message-ID: 201206211756.11996.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, June 21, 2012 04:05:54 PM Florian Pflug wrote:
> On Jun21, 2012, at 13:41 , Andres Freund wrote:
> > 5.)
> > The actually good idea. Yours?
>
> What about a mixure of (3b) and (4), which writes the data not to the WAL
> but to a separate logical replication log. More specifically:
>
> There's a per-backend queue of change notifications.
>
> Whenever a non-catalog tuple is modified, we queue a TUPLE_MODIFIED
> record containing (xid, databaseoid, tableoid, old xmin, old ctid, new
> ctid)
>
> Whenever a table (or something that a table depends on) is modified we
> wait until all references to that table's oid have vanished from the queue,
> then queue a DDL record containing (xid, databaseoid, tableoid, text).
> Other backend cannot concurrently add further TUPLE_MODIFIED records since
> we alreay hold an exclusive lock on the table at that point.
>
> A background process continually processes these queues. If the front of
> the queue is a TUPLE_MODIFIED record, it fetches the old and the new tuple
> based on their ctids and writes the old tuple's PK and the full new tuple
> to the logical replication log. Since table modifications always wait for
> all previously queued TUPLE_MODIFIED records referencing that table to be
> processes *before* altering the catalog, tuples can always be interpreted
> according to the current (SnapshotNow) catalog contents.
>
> Upon transaction COMMIT and ROLLBACK, we queue COMMIT and ROLLBACK records,
> which are also written to the log by the background process. The background
> process may decide to wait until a backend commits before processing that
> backend's log. In that case, rolled back transaction don't leave a trace in
> the logical replication log. Should a backend, however, issue a DDL
> statement, the background process *must* process that backend's queue
> immediately, since otherwise there's a dead lock.
>
> The background process also maintains a value in shared memory which
> contains the oldest value in any of the queue's xid or "old xmin" fields.
> VACUUM and the like must not remove tuples whose xmin is >= that value.
> Hit bits *may* be set for newest tuples though, provided that the
> background process ignores hint bits when fetching the old and new tuples.
I think thats too complicated to fly. Getting that to recover cleanly in case
of crash would mean you'd need another wal.

I think if it comes to that going for 1) is more realistic...

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-06-21 15:58:19 Re: Reseting undo/redo logs
Previous Message D'Arcy Cain 2012-06-21 15:46:54 COMMUTATOR doesn't seem to work