Re: Minimal logical decoding on standbys

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minimal logical decoding on standbys
Date: 2019-06-14 11:24:19
Message-ID: CAJ3gD9dv+5zim7zvzRgQ8+PRSwDqbWfx0PSHr3h40L1Facw+jA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 22 May 2019 at 15:05, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>
> On Tue, 9 Apr 2019 at 22:23, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> >
> > On Sat, 6 Apr 2019 at 04:45, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> > > > index 006446b..5785d2f 100644
> > > > --- a/src/backend/replication/slot.c
> > > > +++ b/src/backend/replication/slot.c
> > > > @@ -1064,6 +1064,85 @@ ReplicationSlotReserveWal(void)
> > > > }
> > > > }
> > > >
> > > > +void
> > > > +ResolveRecoveryConflictWithSlots(Oid dboid, TransactionId xid)
> > > > +{
> > > > + int i;
> > > > + bool found_conflict = false;
> > > > +
> > > > + if (max_replication_slots <= 0)
> > > > + return;
> > > > +
> > > > +restart:
> > > > + if (found_conflict)
> > > > + {
> > > > + CHECK_FOR_INTERRUPTS();
> > > > + /*
> > > > + * Wait awhile for them to die so that we avoid flooding an
> > > > + * unresponsive backend when system is heavily loaded.
> > > > + */
> > > > + pg_usleep(100000);
> > > > + found_conflict = false;
> > > > + }
> > > > +
> > > > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> > > > + for (i = 0; i < max_replication_slots; i++)
> > > > + {
> > > > + ReplicationSlot *s;
> > > > + NameData slotname;
> > > > + TransactionId slot_xmin;
> > > > + TransactionId slot_catalog_xmin;
> > > > +
> > > > + s = &ReplicationSlotCtl->replication_slots[i];
> > > > +
> > > > + /* cannot change while ReplicationSlotCtlLock is held */
> > > > + if (!s->in_use)
> > > > + continue;
> > > > +
> > > > + /* not our database, skip */
> > > > + if (s->data.database != InvalidOid && s->data.database != dboid)
> > > > + continue;
> > > > +
> > > > + SpinLockAcquire(&s->mutex);
> > > > + slotname = s->data.name;
> > > > + slot_xmin = s->data.xmin;
> > > > + slot_catalog_xmin = s->data.catalog_xmin;
> > > > + SpinLockRelease(&s->mutex);
> > > > +
> > > > + if (TransactionIdIsValid(slot_xmin) && TransactionIdPrecedesOrEquals(slot_xmin, xid))
> > > > + {
> > > > + found_conflict = true;
> > > > +
> > > > + ereport(WARNING,
> > > > + (errmsg("slot %s w/ xmin %u conflicts with removed xid %u",
> > > > + NameStr(slotname), slot_xmin, xid)));
> > > > + }
> > > > +
> > > > + if (TransactionIdIsValid(slot_catalog_xmin) && TransactionIdPrecedesOrEquals(slot_catalog_xmin, xid))
> > > > + {
> > > > + found_conflict = true;
> > > > +
> > > > + ereport(WARNING,
> > > > + (errmsg("slot %s w/ catalog xmin %u conflicts with removed xid %u",
> > > > + NameStr(slotname), slot_catalog_xmin, xid)));
> > > > + }
> > > > +
> > > > +
> > > > + if (found_conflict)
> > > > + {
> > > > + elog(WARNING, "Dropping conflicting slot %s", s->data.name.data);
> > > > + LWLockRelease(ReplicationSlotControlLock); /* avoid deadlock */
> > > > + ReplicationSlotDropPtr(s);
> > > > +
> > > > + /* We released the lock above; so re-scan the slots. */
> > > > + goto restart;
> > > > + }
> > > > + }
> > > >
> > > I think this should be refactored so that the two found_conflict cases
> > > set a 'reason' variable (perhaps an enum?) to the particular reason, and
> > > then only one warning should be emitted. I also think that LOG might be
> > > more appropriate than WARNING - as confusing as that is, LOG is more
> > > severe than WARNING (see docs about log_min_messages).
> >
> > What I have in mind is :
> >
> > ereport(LOG,
> > (errcode(ERRCODE_INTERNAL_ERROR),
> > errmsg("Dropping conflicting slot %s", s->data.name.data),
> > errdetail("%s, removed xid %d.", conflict_str, xid)));
> > where conflict_str is a dynamically generated string containing
> > something like : "slot xmin : 1234, slot catalog_xmin: 5678"
> > So for the user, the errdetail will look like :
> > "slot xmin: 1234, catalog_xmin: 5678, removed xid : 9012"
> > I think the user can figure out whether it was xmin or catalog_xmin or
> > both that conflicted with removed xid.
> > If we don't do this way, we may not be able to show in a single
> > message if both xmin and catalog_xmin are conflicting at the same
> > time.
> >
> > Does this message look good to you, or you had in mind something quite
> > different ?
>
> The above one is yet another point that needs to be concluded on. Till
> then I will use the above way to display the error message in the
> upcoming patch version.

Attached is v7 version that has the above changes regarding having a
single error message.

>
> --
> Thanks,
> -Amit Khandekar
> EnterpriseDB Corporation
> The Postgres Database Company

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
logical-decoding-on-standby_v7.patch application/octet-stream 50.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Khandekar 2019-06-14 11:37:05 Re: Minimal logical decoding on standbys
Previous Message 张文杰 2019-06-14 11:00:09 Check the number of potential synchronous standbys