Quick Links

Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

From:	Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To:	Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept
Date:	2018-06-20 17:54:29
Message-ID:	CAPpHfdvrYFnR4ubUc1Gh12E5RRA7BQx6tqbNBpw65ukUW1upgA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi!

Thank you for your work on this issue.

On Wed, Feb 28, 2018 at 5:25 PM Ivan Kartyshov
<i(dot)kartyshov(at)postgrespro(dot)ru> wrote:
> Thank you for your valuable comments. I've made a few adjustments.
>
> The main goal of my changes is to let long read-only transactions run on
> replica if hot_standby_feedback is turned on.
>
>
> Patch1 - hsfeedback_av_truncate.patch is made to stop
> ResolveRecoveryConflictWithLock occurs on replica, after autovacuum lazy
> truncates heap on master cutting some pages at the end. When
> hot_standby_feedback is on, we know that the autovacuum does not remove
> anything superfluous, which could be needed on standby, so there is no
> need to rise any ResolveRecoveryConflict*.
>
> 1) Add to xl_standby_locks and xl_smgr_truncate isautovacuum flag, which
> tells us that autovacuum generates them.
>
> 2) When autovacuum decides to trim the table (using lazy_truncate_heap),
> it takes AccessExclusiveLock and sends this lock to the replica, but
> replica should ignore AccessExclusiveLock if hot_standby_feedback=on.
>
> 3) When autovacuum truncate wal message is replayed on a replica, it
> takes ExclusiveLock on a table, so as not to interfere with read-only
> requests.

I see that you use IsAutoVacuumWorkerProcess() function to determine
if this access exclusive lock is caused by vacuum. And I see at least
two issues with that.

1) That wouldn't work for manually run vacuums. I understand stat
query cancel caused by autovacuum is probably most annoying thing.
But unnecessary partial bug fix is undesirable.
2) Access exclusive locks are logged not only immediately by the
process taken that, but also all such locks are logged in
LogStandbySnapshot(). LogStandbySnapshot() is called by checkpointer,
bgwriter and others. So, IsAutoVacuumWorkerProcess() wouldn't help in
such cases. I understand that heap truncation is typically fast. And
probability that some process will dump all the access exclusive locks
while truncation is in progress is low. But nevertheless we need to
handle that properly.

Based on this, I think we should give up with using
IsAutoVacuumWorkerProcess() to determine locks caused by vacuum.

Thinking about that more I found that adding vacuum mark as an extra
argument to LockAcquireExtended is also wrong. It would be still hard
to determine if we should log the lock in LogStandbySnapshot(). We'll
have to add that flag to shared memory table of locks. And that looks
like a kludge.

Therefore, I'd like to propose another approach: introduce new lock
level. So, AccessExclusiveLock will be split into two
AccessExclusiveLocalLock and AccessExclusiveLock. In spite of
AccessExclusiveLock, AccessExclusiveLocalLock will be not logged to
standby, and used for heap truncation.

I expect some resistance to my proposal, because fixing this "small
bug" doesn't deserve new lock level. But current behavior of
hot_standby_feedback is buggy. From user prospective,
hot_standby_feedback option is just non-worker, which causes master to
bloat without protection for standby queries from cancel. We need to
fix that, but I don't see other proper way to do that except adding
new lock level...

> Patch2 - hsfeedback_noninvalide_xmin.patch
> When walsender is initialized, its xmin in PROCARRAY is set to
> GetOldestXmin() in order to prevent autovacuum running on master from
> truncating relation and removing some pages that are required by
> replica. This might happen if master's autovacuum and replica's query
> started simultaneously. And the replica has not yet reported its xmin
> value.

I don't yet understand what is the problem here and why this patch
should solve that. As I get idea of this patch, GetOldestXmin() will
initialize xmin of walsender with lowest xmin of replication slot.
But xmin of replication slots will be anyway taken into account by
GetSnapshotData() called by vacuum.

> How to test:
> Make async replica, turn on feedback, reduce max_standby_streaming_delay
> and aggressive autovacuum.

You forgot to mention, that one should setup the replication using
replication slot. Naturally, if replication slot isn't exist, then
master shouldn't keep dead tuples for disconnected standby. Because
master doesn't know if standby will reconnect or is it gone forever.

> autovacuum = on
> autovacuum_max_workers = 1
> autovacuum_naptime = 1s
> autovacuum_vacuum_threshold = 1
> autovacuum_vacuum_cost_delay = 0
>
> Test:
> Here we will start replica and begi repeatable read transaction on
> table, then we stop replicas postmaster to prevent starting walreceiver
> worker (on master startup) and sending master it`s transaction xmin over
> hot_standby_feedback message.
> MASTER REPLICA
> start
> CREATE TABLE test AS (SELECT id, 1 AS value FROM
> generate_series(1,10000000) id);
> stop
> start

I guess you meant start of standby session here, not postmaster.
Otherwise, I don't understand how table test will reach standby.

> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> SELECT * FROM test;
> stop postmaster with gdb
> start
> DELETE FROM test WHERE id > 0;
> wait till autovacuum delete and changed xmin
> release postmaster with gdb
> --- Result on replica
> FATAL: terminating connection due to conflict with recovery
> DETAIL: User query might have needed to see row versions that must be
> removed.

I've tries it with replication slot, and just with
hsfeedback_av_truncate.patch everything is working fine: query on
standby isn't cancelled. I don't see how
hsfeedback_noninvalide_xmin.patch influence the situation. Without
replication slot, it always fails regardless any attached patches, and
that's right behavior.

BTW, I've registered this patch for July commitfest. It would be nice
to finally fix this long-term bug. Despite this is bug fix, I doubt
it should be backpatched, especially if we will accept the new lock
level. We should rather apply patch to the documentation of supported
versions, which would claim that vacuum can still cancel queries on
standbys regardless hot_standby_feedback option.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept at 2018-02-28 14:24:50 from Ivan Kartyshov

Responses

Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept at 2018-08-09 20:25:59 from Ivan Kartyshov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Daniel Gustafsson	2018-06-20 18:06:41	Re: Allow auto_explain to log to NOTICE
Previous Message	Robert Haas	2018-06-20 17:51:13	Re: PATCH: backtraces for error messages