BF animal dikkop reported a failure in 035_standby_logical_decoding

From: "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Subject: BF animal dikkop reported a failure in 035_standby_logical_decoding
Date: 2023-05-26 07:27:01
Message-ID: OSZPR01MB6310ED3CEDB531BCEDBC6AF2FD479@OSZPR01MB6310.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi hackers,

I saw a buildfarm failure on "dikkop"[1]. 035_standby_logical_decoding.pl
failed because the slots row_removal_inactiveslot and row_removal_activeslot
were not invalidated after the vacuum.

regress_log_035_standby_logical_decoding:
```
[12:15:05.943](4.442s) not ok 22 - inactiveslot slot invalidation is logged with vacuum on pg_class
[12:15:05.945](0.003s)
[12:15:05.946](0.000s) # Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_class'
# at t/035_standby_logical_decoding.pl line 238.
[12:15:05.948](0.002s) not ok 23 - activeslot slot invalidation is logged with vacuum on pg_class
[12:15:05.949](0.001s)
[12:15:05.950](0.000s) # Failed test 'activeslot slot invalidation is logged with vacuum on pg_class'
# at t/035_standby_logical_decoding.pl line 244.
[13:38:26.977](5001.028s) # poll_query_until timed out executing this query:
# select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
[13:38:26.980](0.003s) not ok 24 - confl_active_logicalslot updated
[13:38:26.982](0.002s)
[13:38:26.982](0.000s) # Failed test 'confl_active_logicalslot updated'
# at t/035_standby_logical_decoding.pl line 251.
Timed out waiting confl_active_logicalslot to be updated at t/035_standby_logical_decoding.pl line 251.
```

035_standby_logical_decoding.pl:
```
# This should trigger the conflict
$node_primary->safe_psql(
	'testdb', qq[
  CREATE TABLE conflict_test(x integer, y text);
  DROP TABLE conflict_test;
  VACUUM pg_class;
  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal
]);

$node_primary->wait_for_replay_catchup($node_standby);

# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
```

Is it possible that the VACUUM command didn't remove any tuples, so the
conflict was never triggered? We can't confirm this because the logs don't
contain enough information. Maybe "VACUUM VERBOSE" could be used in the test
to provide more.
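
For illustration, here is a minimal, untested sketch of what that could look
like in the test. It assumes that the INFO lines emitted by VACUUM VERBOSE
arrive on psql's stderr, and that note() from Test::More is available as in
other TAP tests; the variable names $stdout and $stderr are my own.

```perl
# Sketch only: same statements as the existing test, but with VACUUM VERBOSE,
# and with the server's INFO output captured and written to the regress log.
my ($stdout, $stderr);
$node_primary->psql(
	'testdb', qq[
  CREATE TABLE conflict_test(x integer, y text);
  DROP TABLE conflict_test;
  VACUUM VERBOSE pg_class;
  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal
],
	stdout => \$stdout,
	stderr => \$stderr,
	on_error_die => 1);    # keep safe_psql's die-on-error behavior

# Assumption: the VERBOSE report (including how many tuples were removed)
# is in $stderr, so a future failure would show what VACUUM actually did.
note "VACUUM VERBOSE output:\n$stderr";
```

With that in the log, we could at least tell whether VACUUM removed the
pg_class tuples of the dropped table before the standby replayed the WAL.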

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2023-05-24%2006%3A16%3A18

Regards,
Shi Yu
