Re: Minimal logical decoding on standbys

From: Andres Freund <andres(at)anarazel(dot)de>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: Minimal logical decoding on standbys
Date: 2023-03-31 19:53:24
Message-ID: 20230331195324.okg6qhyabqjss4za@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-03-30 21:33:00 -0700, Andres Freund wrote:
> > diff --git a/src/include/access/gistxlog.h b/src/include/access/gistxlog.h
> > index 2ce9366277..93fb9d438a 100644
> > --- a/src/include/access/gistxlog.h
> > +++ b/src/include/access/gistxlog.h
> > @@ -51,11 +51,14 @@ typedef struct gistxlogDelete
> > {
> > TransactionId snapshotConflictHorizon;
> > uint16 ntodelete; /* number of deleted offsets */
> > + bool isCatalogRel; /* to handle recovery conflict during logical
> > + * decoding on standby */
> >
> > - /* TODELETE OFFSET NUMBER ARRAY FOLLOWS */
> > + /* TODELETE OFFSET NUMBERS */
> > + OffsetNumber offsets[FLEXIBLE_ARRAY_MEMBER];
> > } gistxlogDelete;
> >
> > -#define SizeOfGistxlogDelete (offsetof(gistxlogDelete, ntodelete) + sizeof(uint16))
> > +#define SizeOfGistxlogDelete offsetof(gistxlogDelete, offsets)

>
> I don't think the changes are quite sufficient:
>
> for gist:
>
> > @@ -672,11 +668,12 @@ gistXLogUpdate(Buffer buffer,
> > */
> > XLogRecPtr
> > gistXLogDelete(Buffer buffer, OffsetNumber *todelete, int ntodelete,
> > - TransactionId snapshotConflictHorizon)
> > + TransactionId snapshotConflictHorizon, Relation heaprel)
> > {
> > gistxlogDelete xlrec;
> > XLogRecPtr recptr;
> >
> > + xlrec.isCatalogRel = RelationIsAccessibleInLogicalDecoding(heaprel);
> > xlrec.snapshotConflictHorizon = snapshotConflictHorizon;
> > xlrec.ntodelete = ntodelete;
>
> Note that gistXLogDelete() continues to register data with two different
> XLogRegisterData() calls. This will append data without any padding:
>
> XLogRegisterData((char *) &xlrec, SizeOfGistxlogDelete);
>
> /*
> * We need the target-offsets array whether or not we store the whole
> * buffer, to allow us to find the snapshotConflictHorizon on a standby
> * server.
> */
> XLogRegisterData((char *) todelete, ntodelete * sizeof(OffsetNumber));
>
>
> But replay now uses the new offset member:
>
> > @@ -177,6 +177,7 @@ gistRedoDeleteRecord(XLogReaderState *record)
> > gistxlogDelete *xldata = (gistxlogDelete *) XLogRecGetData(record);
> > Buffer buffer;
> > Page page;
> > + OffsetNumber *toDelete = xldata->offsets;
> >
> > /*
> > * If we have any conflict processing to do, it must happen before we
>
>
> That doesn't look right. If there's any padding before offsets, we'll afaict
> read completely bogus data?
>
> As it turns out, there is padding:
>
> struct gistxlogDelete {
> TransactionId snapshotConflictHorizon; /* 0 4 */
> uint16 ntodelete; /* 4 2 */
> _Bool isCatalogRel; /* 6 1 */
>
> /* XXX 1 byte hole, try to pack */
>
> OffsetNumber offsets[]; /* 8 0 */
>
> /* size: 8, cachelines: 1, members: 4 */
> /* sum members: 7, holes: 1, sum holes: 1 */
> /* last cacheline: 8 bytes */
> };
>
>
> I am frankly baffled how this works at all, this should just about immediately
> crash?
>
>
> Oh, I see. We apparently don't reach the gist deletion code in the tests:
> https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#674
> https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#174
>
> And indeed, if I add an abort() into , it's not reached.
>
> And it's not because tests use a temp table, the caller is also unreachable:
> https://coverage.postgresql.org/src/backend/access/gist/gist.c.gcov.html#1643

After writing a minimal test to reach it, it turns out to actually work - I
missed that SizeOfGistxlogDelete now includes the padding, where commonly that
pattern tries to *exclude* trailing padding. Sorry for the noise on this one
:(

I am writing two minimal test cases to reach this code for hash and
gist. Not to commit as part of this, but to be able to verify that it
works. I'll post them in the separate thread I started about the lack of
regression test coverage in the area.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-03-31 20:00:08 Re: Non-superuser subscription owners
Previous Message Melanie Plageman 2023-03-31 19:44:58 Re: Track IO times in pg_stat_io