ERROR: subtransaction logged without previous top-level txn record

From: "Hsu, John" <hsuchen(at)amazon(dot)com>
To: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: ERROR: subtransaction logged without previous top-level txn record
Date: 2019-06-10 21:08:46
Message-ID: AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Hello,

I saw the previous thread but it wasn't in my inbox at the time, so I’m creating a new one sorry about that.
https://www.postgresql.org/message-id/20190516170434.masck6ehwg2kvbi2@alap3.anarazel.de

I’ve managed to reproduce the issue pretty consistently on REL9_6_STABLE on commit 959792087a10baf7f1b58408d28411109bcedb7a

OS version:
[ec2-user@ ... ~]$ uname -a
... 4.14.77-80.57.amzn2.x86_64 #1 SMP Tue Nov 6 21:18:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Postgres version:
postgres=# SELECT version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), 64-bit
(1 row)

I’m on an EC2 m5.4xlarge if that matters.

Repro steps:

1. Create the following script
[ec2-user(at)ip-172-31-18-48 ~]$ cat ~/subbench.txt
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
\set subcnt random(1, 800)
select * from pgbench(:aid, :bid, :tid, :delta, :subcnt);

2. Create the following function:

CREATE OR REPLACE FUNCTION pgbench(pAid int, pBid int, pTid int, delta int, subcnt int) returns int as $$
DECLARE
abal int;
BEGIN
FOR i in 1 .. subcnt LOOP
BEGIN
UPDATE pgbench_accounts SET abalance = abalance + delta WHERE aid = pAid;
--subcnt := subcnt;
EXCEPTION
WHEN division_by_zero THEN
subcnt := subcnt;
END;
END LOOP;
abal := abalance FROM pgbench_accounts WHERE aid = pAid;
return abal;
END; $$LANGUAGE 'plpgsql';

3. Create a few logical slots in the database

select pg_create_logical_replication_slot('test_slot_1', 'test_decoding');
select pg_create_logical_replication_slot('test_slot_2', 'test_decoding');
select pg_create_logical_replication_slot('test_slot_3', 'test_decoding');
...

4. Initialize pgbench
pgbench -i -d postgres

5. Load the data
pgbench -f subbench.txt -c 64 -j 64 -T 600 -P 1 -d postgres

6. Run pg_recvlogical with a timeout, it usually takes a few iterations (~7-9) before the error occurs

var=0
while true; do
timeout 30 pg_recvlogical -d postgres --start --slot test_slot_1 -f /dev/null;
var=$((var+1))
echo "Sleeping 5s Time: $var";
sleep 5;
done

pg_recvlogical -d postgres --start --slot test_slot_1 -f -
pg_recvlogical: unexpected termination of replication stream: ERROR: subtransaction logged without previous top-level txn record

pg_recvlogical -d postgres --start --slot test_slot_2 -f -
pg_recvlogical: unexpected termination of replication stream: ERROR: subtransaction logged without previous top-level txn record

pg_recvlogical -d postgres --start --slot test_slot_3 -f -
pg_recvlogical: unexpected termination of replication stream: ERROR: subtransaction logged without previous top-level txn record
pg_recvlogical: disconnected; waiting 5 seconds to try again

What's interesting is that the confirmed_flush_lsn are all different from test_slot_1 --> test_slot_3

postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
--------------+---------------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
test_slot_1 | test_decoding | logical | 13382 | postgres | f | | | 1848 | 0/1C5BC5A0 | 0/5488E468
test_slot_2 | test_decoding | logical | 13382 | postgres | f | | | 1848 | 0/1C5BC5A0 | 0/40E45EA0
test_slot_3 | test_decoding | logical | 13382 | postgres | f | | | 1848 | 0/3F4B6AF8 | 0/6BB3A990

Let me know if you require more info to repro.

Thanks!

John H

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2019-06-10 23:23:24 BUG #15842: Unable to run a prepared statement using the org.postgresql Java Library
Previous Message - - 2019-06-10 18:22:41 Re: Temp table handling after anti-wraparound shutdown (Was: BUG #15840)

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-06-10 21:45:16 Re: Should we warn against using too many partitions?
Previous Message Ashwin Agrawal 2019-06-10 20:48:54 Re: heapam_index_build_range_scan's anyvisible