Re: Replication slot xmin is not reset if HS feedback is turned off while standby is shut down

From: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication slot xmin is not reset if HS feedback is turned off while standby is shut down
Date: 2016-12-21 07:40:09
Message-ID: CA+CSw_ue0SSp=WP2GB4bw4d-kZ-angu6ayFGuw9kJ7fP2jfb=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 21, 2016 at 4:14 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> Re the patch, I don't like
>
> - static bool master_has_standby_xmin = false;
> + static bool master_has_standby_xmin = true;
>
> without any comment. It's addressed in the comment changes on the
> header func, but the link isn't obvious. Maybe a oneliner to say
> "ensure we always send at least one feedback message" ?

Will fix.

> I think this part of the patch is correct and useful.
>
>
>
> I don't see the point of
>
> + *
> + * If Hot Standby is not yet active we reset the xmin value. Check this
> + * after the interval has expired to reduce number of calls.
> */
> - if (hot_standby_feedback)
> + if (hot_standby_feedback && HotStandbyActive())
>
> though. Forcing feedback to send InvalidTransactionId until hot
> standby feedback is actively running seems counterproductive; we want
> to lock in feedback as soon as possible, not wait until we're
> accepting transactions. Simon and I are in fact working on changes to
> do the opposite of what you've got here and ensure that we don't allow
> hot standby connections until we know hot_standby_feedback is in
> effect and a usable xmin is locked in. That way the master won't
> remove tuples we need as soon as we start an xact and cause a conflict
> with recovery.
>
> If there are no running xacts, GetOldestXmin() will return the slot
> xmin if any. We should definitely not be clearing that just because
> we're not accepting hot standby connections yet; we want to make very
> sure it remains in effect unless the user explicitly de-configures hot
> standby.
>
> (There's another pre-exisitng problem there, where we can start the
> walsender before slots are initialized, that I'll be addressing
> separately).
>
> If there's no slot xmin either, we'll send nextXid. That's a sensible
> starting point for hot standby feedback until we start some
> transactions.
>
> So -1 on this part of the patch, unless there's something I've misunderstood.

Currently there was no feedback sent if hot standby was not active. I
was not sure if it was safe to call GetOldestXmin() in that case.
However I did not consider cascading replica slots wanting to hold
back xmin, where resetting the parents xmin is indeed wrong. Do you
know if GetOldestXmin() is safe at this point and we can just remove
the HotStandbyActive() check? Otherwise I think the correct approach
is to move the check and return inside the hot_standby_feedback case
like this:

if (hot_standby_feedback)
{
if (!HotStandbyActive())
return;

Regards,
Ants Aasma

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2016-12-21 07:55:08 Re: Why does plpython delay composite type resolution?
Previous Message Pavel Stehule 2016-12-21 07:14:12 Re: pgsql: Simplify LWLock tranche machinery by removing array_base/array_s