Re: Minimal logical decoding on standbys

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minimal logical decoding on standbys
Date: 2019-04-05 23:15:53
Message-ID: 20190405231553.njjcu7igoyuk2hx5@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

Thanks for the new version of the patch. Btw, could you add Craig as a
co-author in the commit message of the next version of the patch? Don't
want to forget him.

On 2019-04-05 17:08:39 +0530, Amit Khandekar wrote:
> Regarding the test result failures, I could see that when we drop a
> logical replication slot at standby server, then the catalog_xmin of
> physical replication slot becomes NULL, whereas the test expects it to
> be equal to xmin; and that's the reason a couple of test scenarios are
> failing :
>
> ok 33 - slot on standby dropped manually
> Waiting for replication conn replica's replay_lsn to pass '0/31273E0' on master
> done
> not ok 34 - physical catalog_xmin still non-null
> not ok 35 - xmin and catalog_xmin equal after slot drop
> # Failed test 'xmin and catalog_xmin equal after slot drop'
> # at t/016_logical_decoding_on_replica.pl line 272.
> # got:
> # expected: 2584
>
> I am not sure what is expected. What actually happens is : the
> physical slot catalog_xmin remains NULL initially, but becomes
> non-NULL after the logical replication slot is created on standby.

That seems like the correct behaviour to me - why would we still have a
catalog xmin if there's no logical slot?

> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index 006446b..5785d2f 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -1064,6 +1064,85 @@ ReplicationSlotReserveWal(void)
> }
> }
>
> +void
> +ResolveRecoveryConflictWithSlots(Oid dboid, TransactionId xid)
> +{
> + int i;
> + bool found_conflict = false;
> +
> + if (max_replication_slots <= 0)
> + return;
> +
> +restart:
> + if (found_conflict)
> + {
> + CHECK_FOR_INTERRUPTS();
> + /*
> + * Wait awhile for them to die so that we avoid flooding an
> + * unresponsive backend when system is heavily loaded.
> + */
> + pg_usleep(100000);
> + found_conflict = false;
> + }
> +
> + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> + for (i = 0; i < max_replication_slots; i++)
> + {
> + ReplicationSlot *s;
> + NameData slotname;
> + TransactionId slot_xmin;
> + TransactionId slot_catalog_xmin;
> +
> + s = &ReplicationSlotCtl->replication_slots[i];
> +
> + /* cannot change while ReplicationSlotCtlLock is held */
> + if (!s->in_use)
> + continue;
> +
> + /* not our database, skip */
> + if (s->data.database != InvalidOid && s->data.database != dboid)
> + continue;
> +
> + SpinLockAcquire(&s->mutex);
> + slotname = s->data.name;
> + slot_xmin = s->data.xmin;
> + slot_catalog_xmin = s->data.catalog_xmin;
> + SpinLockRelease(&s->mutex);
> +
> + if (TransactionIdIsValid(slot_xmin) && TransactionIdPrecedesOrEquals(slot_xmin, xid))
> + {
> + found_conflict = true;
> +
> + ereport(WARNING,
> + (errmsg("slot %s w/ xmin %u conflicts with removed xid %u",
> + NameStr(slotname), slot_xmin, xid)));
> + }
> +
> + if (TransactionIdIsValid(slot_catalog_xmin) && TransactionIdPrecedesOrEquals(slot_catalog_xmin, xid))
> + {
> + found_conflict = true;
> +
> + ereport(WARNING,
> + (errmsg("slot %s w/ catalog xmin %u conflicts with removed xid %u",
> + NameStr(slotname), slot_catalog_xmin, xid)));
> + }
> +
> +
> + if (found_conflict)
> + {
> + elog(WARNING, "Dropping conflicting slot %s", s->data.name.data);
> + LWLockRelease(ReplicationSlotControlLock); /* avoid deadlock */
> + ReplicationSlotDropPtr(s);
> +
> + /* We released the lock above; so re-scan the slots. */
> + goto restart;
> + }
> + }
>
I think this should be refactored so that the two found_conflict cases
set a 'reason' variable (perhaps an enum?) to the particular reason, and
then only one warning should be emitted. I also think that LOG might be
more appropriate than WARNING - as confusing as that is, LOG is more
severe than WARNING (see docs about log_min_messages).

> @@ -0,0 +1,386 @@
> +# Demonstrate that logical decoding can follow timeline switches.
> +#
> +# Test logical decoding on a standby.
> +#
> +use strict;
> +use warnings;
> +use 5.8.0;
> +
> +use PostgresNode;
> +use TestLib;
> +use Test::More tests => 55;
> +use RecursiveCopy;
> +use File::Copy;
> +
> +my ($stdin, $stdout, $stderr, $ret, $handle, $return);
> +my $backup_name;
> +
> +# Initialize master node
> +my $node_master = get_new_node('master');
> +$node_master->init(allows_streaming => 1, has_archiving => 1);
> +$node_master->append_conf('postgresql.conf', q{
> +wal_level = 'logical'
> +max_replication_slots = 4
> +max_wal_senders = 4
> +log_min_messages = 'debug2'
> +log_error_verbosity = verbose
> +# send status rapidly so we promptly advance xmin on master
> +wal_receiver_status_interval = 1
> +# very promptly terminate conflicting backends
> +max_standby_streaming_delay = '2s'
> +});
> +$node_master->dump_info;
> +$node_master->start;
> +
> +$node_master->psql('postgres', q[CREATE DATABASE testdb]);
> +
> +$node_master->safe_psql('testdb', q[SELECT * FROM pg_create_physical_replication_slot('decoding_standby');]);
> +$backup_name = 'b1';
> +my $backup_dir = $node_master->backup_dir . "/" . $backup_name;
> +TestLib::system_or_bail('pg_basebackup', '-D', $backup_dir, '-d', $node_master->connstr('testdb'), '--slot=decoding_standby');
> +
> +sub print_phys_xmin
> +{
> + my $slot = $node_master->slot('decoding_standby');
> + return ($slot->{'xmin'}, $slot->{'catalog_xmin'});
> +}
> +
> +my ($xmin, $catalog_xmin) = print_phys_xmin();
> +# After slot creation, xmins must be null
> +is($xmin, '', "xmin null");
> +is($catalog_xmin, '', "catalog_xmin null");
> +
> +my $node_replica = get_new_node('replica');
> +$node_replica->init_from_backup(
> + $node_master, $backup_name,
> + has_streaming => 1,
> + has_restoring => 1);
> +$node_replica->append_conf('postgresql.conf',
> + q[primary_slot_name = 'decoding_standby']);
> +
> +$node_replica->start;
> +$node_master->wait_for_catchup($node_replica, 'replay', $node_master->lsn('flush'));
> +
> +# with hot_standby_feedback off, xmin and catalog_xmin must still be null
> +($xmin, $catalog_xmin) = print_phys_xmin();
> +is($xmin, '', "xmin null after replica join");
> +is($catalog_xmin, '', "catalog_xmin null after replica join");
> +
> +$node_replica->append_conf('postgresql.conf',q[
> +hot_standby_feedback = on
> +]);
> +$node_replica->restart;
> +sleep(2); # ensure walreceiver feedback sent

Can we make this more robust? E.g. by waiting till pg_stat_replication
shows the change on the primary? Because I can guarantee that this'll
fail on slow buildfarm machines (say the valgrind animals).
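Something along these lines, using the TAP framework's poll_query_until
helper instead of a fixed sleep (the exact query is a sketch - checking
that the physical slot's xmin has become non-null on the master once
feedback arrives):

```
$node_master->poll_query_until('postgres', q[
    SELECT xmin IS NOT NULL
    FROM pg_replication_slots
    WHERE slot_name = 'decoding_standby';
]) or die "timed out waiting for hot_standby_feedback to take effect";
```

That way slow machines just wait longer rather than failing outright.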

> +$node_master->wait_for_catchup($node_replica, 'replay', $node_master->lsn('flush'));
> +sleep(2); # ensure walreceiver feedback sent

Similar.

Greetings,

Andres Freund
