Re: [HACKERS] Restricting maximum keep segments by repslots

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: jgdr(at)dalibo(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, sawada(dot)mshk(at)gmail(dot)com, peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date: 2020-04-28 16:29:41
Message-ID: 20200428162941.GA6196@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-Apr-28, Kyotaro Horiguchi wrote:

> At Mon, 27 Apr 2020 18:33:42 -0400, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in
> > On 2020-Apr-08, Kyotaro Horiguchi wrote:
> >
> > > At Wed, 08 Apr 2020 09:37:10 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in

> > Thanks for the fix! I propose two changes:
> >
> > 1. reword the error like this:
> >
> > ERROR: replication slot "regression_slot3" cannot be advanced
> > DETAIL: This slot has never previously reserved WAL, or has been invalidated
>
> Agreed to describe what is failed rather than the cause. However,
> logical replications slots are always "previously reserved" at
> creation.

Bah, of course. I was thinking in making the equivalent messages all
identical in all callsites, but maybe they should be different when
slots are logical. I'll go over them again.

> > 2. use the same error in one other place, to wit
> > pg_logical_slot_get_changes() and pg_replication_slot_advance(). I
> > made the DETAIL part the same in all places, but the ERROR line is
> > adjusted to what each callsite is doing.
> > I do think that this change in test_decoding is a bit unpleasant:
> >
> > -ERROR: cannot use physical replication slot for logical decoding
> > +ERROR: cannot get changes from replication slot "repl"
> >
> > The test is
> > -- check that we're detecting a streaming rep slot used for logical decoding
> > SELECT 'init' FROM pg_create_physical_replication_slot('repl');
> > SELECT data FROM pg_logical_slot_get_changes('repl', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
>
> The message may be understood as "No change has been made since
> restart_lsn". Does something like the following work?
>
> ERROR: replication slot "repl" is not usable to get changes

That wording seems okay, but my specific point for this error message is
that we were trying to use a physical slot to get logical changes; so
the fact that the slot has been invalidated is secondary and we should
complain about the *type* of slot rather than the restart_lsn.

> By the way there are some other messages that doesn't render the
> symptom but the cause.
>
> "cannot use physical replication slot for logical decoding"
> "replication slot \"%s\" was not created in this database"
>
> Don't they need the same amendment?

Maybe, but I don't want to start rewording every single message in uses
of replication slots ... I prefer to only modify the ones related to the
problem at hand.

> > > > On the other hand, physical replication doesn't break by invlidation.
> > > > [...]

> > Anyway I think the current patch can be applied as is -- and if we want
> > physical replication to have some other behavior, we can patch for that
> > afterwards.
>
> Agreed here. The false-invalidation doesn't lead to any serious
> consequences.

But does it? What happens, for example, if we have a slot used to get a
pg_basebackup, then time passes before starting to stream from it and is
invalidated? I think this "works fine" (meaning that once we try to
stream from the slot to replay at the restored base backup, we will
raise an error immediately), but I haven't tried.

The worst situation would be producing a corrupt replica. I don't think
this is possible.

The ideal behavior I think would be that pg_basebackup aborts
immediately when the slot is invalidated, to avoid wasting more time
producing a doomed backup.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Emre Hasegeli 2020-04-28 16:33:53 Re: Bogus documentation for bogus geometric operators
Previous Message Paul Carlucci 2020-04-28 16:24:23 Re: PostgreSQL CHARACTER VARYING vs CHARACTER VARYING (Length)