Re: Catalog/Metadata consistency during changeset extraction from wal

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Catalog/Metadata consistency during changeset extraction from wal
Date: 2012-06-22 19:30:59
Message-ID: 201206222130.59842.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, June 22, 2012 03:22:03 PM Andres Freund wrote:
> On Thursday, June 21, 2012 05:40:08 PM Andres Freund wrote:
> > On Thursday, June 21, 2012 03:56:54 PM Florian Pflug wrote:
> > > On Jun21, 2012, at 13:41 , Andres Freund wrote:
> > > > 3b)
> > > > Ensure that enough information in the catalog remains by fudging the
> > > > xmin horizon. Then reassemble an appropriate snapshot to read the
> > > > catalog as the tuple in question has seen it.
> > >
> > > The ComboCID machinery makes that quite a bit harder, I fear. If a
> > > tuple is updated multiple times by the same transaction, you cannot
> > > decide whether a tuple was visible in a certain snapshot unless you
> > > have access to the updating backend's ComboCID hash.
> >
> > Thats a very good point. Not sure how I forgot that.
> >
> > It think it might be possible to reconstruct a sensible combocid mapping
> > from the walstream. Let me think about it for a while...
>
> I have a very, very preliminary thing which seems to work somewhat. I just
> log (cmin, cmax) additionally for every modified catalog tuple into the
> wal and so far that seems to be enough.
> Do you happen to have suggestions for other problematic things to look into
> before I put more time into it?
Im continuing to play around with this. The tricky bit so far is
subtransaction handling in transactions which modify the catalog (+ possible
tables which are marked as being required for decoding like pg_enum
equivalent).

Would somebody fundamentally object to one the following things:
1.
replace

#define IsMVCCSnapshot(snapshot) \
((snapshot)->satisfies == HeapTupleSatisfiesMVCC)

with something like

#define IsMVCCSnapshot(snapshot) \
((snapshot)->satisfies == HeapTupleSatisfiesMVCC || (snapshot)->satisfies ==
HeapTupleSatisfiesMVCCDuringDecode)

The define is only used sparingly and none of the code path looks so hot that
this could make a difference.

2.
Set SnapshotNowData.satisfies to HeapTupleSatisfiesNowDuringRecovery while
reading the catalog for decoding.

Its possible to go on without both but the faking up of data gets quite a bit
more complex.

The problem making replacement of SnapshotNow.satisfies useful is that there is
no convenient way to represent subtransactions of the current transaction
which already have committed according to the TransactionLog but aren't yet
visible at the current lsn because they only started afterwards. Its
relatively easy to fake this in an mvcc snapshot but way harder for
SnapshotNow because you cannot mark transactions as in-progress.

Thanks,

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-06-22 19:36:44 Re: random failing builds on spoonbill - backends not exiting...
Previous Message Stefan Kaltenbrunner 2012-06-22 19:09:56 Re: random failing builds on spoonbill - backends not exiting...