Re: changeset generation v5-01 - Patches & git tree

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: changeset generation v5-01 - Patches & git tree
Date: 2013-06-28 19:47:47
Message-ID: 20130628194747.GB11516@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-06-28 12:26:52 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On the other hand, I can't entirely shake the feeling that adding the
> > information into WAL would be more reliable.
>
> That feeling has been nagging at me too. I can't demonstrate that
> there's a problem when an ALTER TABLE is in process of rewriting a table
> into a new relfilenode number, but I don't have a warm fuzzy feeling
> about the reliability of reverse lookups for this.

I am pretty sure the mapping thing works, but it indeed requires some
complexity. And it's harder to debug because when you want to understand
what's going on the relfilenodes involved aren't in the catalog anymore.

> At the very least
> it's going to require some hard-to-verify restriction about how we
> can't start doing changeset reconstruction in the middle of a
> transaction that's doing DDL.

Currently changeset extraction needs to wait (and does so) till it found
a point where it has seen the start of all in-progress transactions. All
transaction that *commit* after the last partiall observed in-progress
transaction finished can be decoded.
To make that point visible for external tools to synchronize -
e.g. pg_dump - it exports the snapshot of exactly the moment when that
last in-progress transaction committed.

So, from what I gather there's a slight leaning towards *not* storing
the relation's oid in the WAL. Which means the concerns about the
uniqueness issues with the syscaches need to be addressed. So far I know
of three solutions:
1) develop a custom caching/mapping module
2) Make sure InvalidOid's (the only possible duplicate) can't end up the
syscache by adding a hook that prevents that on the catcache level
3) Make sure that there can't be any duplicates by storing the oid of
the relation in a mapped relations relfilenode

Opinions?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2013-06-28 19:58:43 Re: Add more regression tests for ASYNC
Previous Message Josh Berkus 2013-06-28 19:42:47 Re: Add more regression tests for ASYNC