Re: WIP: Failover Slots

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Failover Slots
Date: 2016-04-08 06:13:03
Message-ID: CAMsr+YGrOMCzWhOj9z0UySB_EdRb2R+qUKyFUFJ31hbBh43qBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6 April 2016 at 22:17, Andres Freund <andres(at)anarazel(dot)de> wrote:

> Quickly skimming 0001 in [4] there appear to be a number of issues:
> * LWLockHeldByMe() is only for debugging, not functional differences
> * ReplicationSlotPersistentData is now in an xlog related header
> * The code and behaviour around name conflicts of slots seems pretty
> raw, and not discussed
> * Taking spinlocks dependant on InRecovery() seems like a seriously bad
> idea
> * I doubt that the archive based switches around StartupReplicationSlots
> do what they intend. Afaics that'll not work correctly for basebackups
> taken with -X, without recovery.conf
>
>
Thanks for looking at it. Most of those are my errors. I think this is
pretty dead at least for 9.6, so I'm mostly following up in the hopes of
learning about a couple of those mistakes.

Good catch with -X without a recovery.conf. Since it wouldn't be recognised
as a promotion and wouldn't increment the timeline, copied non-failover
slots wouldn't get removed. I've never liked that logic at all anyway, I
just couldn't think of anything better...

LWLockHeldByMe() has a comment to the effect of: "This is meant as debug
support only." So that's just a dumb mistake on my part, and I should've
added "alreadyLocked" parameters. (Ugly, but works).

But why would it be a bad idea to conditionally take a code path that
acquires a spinlock based on whether RecoveryInProgress()? It's not testing
RecoveryInProgress() more than once and doing the acquire and release based
on separate tests, which would be a problem. I don't really get the problem
with:

if (!RecoveryInProgress())
{
/* first check whether there's something to write out */
SpinLockAcquire(&slot->mutex);
was_dirty = slot->dirty;
slot->just_dirtied = false;
SpinLockRelease(&slot->mutex);

/* and don't do anything if there's nothing to write */
if (!was_dirty)
return;
}

... though I think what I really should've done there is just always
dirty the slot in the redo functions.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2016-04-08 06:15:14 Re: Relation extension scalability
Previous Message Robert Haas 2016-04-08 06:08:29 Re: Relation extension scalability