Re: ERROR: subtransaction logged without previous top-level txn record

From:	Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>
To:	Dan Katz <dkatz(at)joor(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "Hsu\, John" <hsuchen(at)amazon(dot)com>, "pgsql-bugs\(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject:	Re: ERROR: subtransaction logged without previous top-level txn record
Date:	2020-01-30 21:22:46
Message-ID:	87ftfwwsex.fsf@ars-thinkpad
Lists:	pgsql-bugs pgsql-hackers

Hi,

Dan Katz <dkatz(at)joor(dot)com> writes:

> Arseny,
>
> I was hoping you could give me some insights about how this bug might
> appear with multiple replication slots. For example if I have two
> replication slots would you expect both slots to see the same error, even
> if they were started, consumed or the LSN was confirmed-flushed at
> different times?

Well, to encounter this you must happen to interrupt a decoding session
(e.g. by shutting down the server) when restart_lsn (the LSN from which WAL
will be read next time) is at an unfortunate position, as described in
https://www.postgresql.org/message-id/87ftjifoql.fsf%40ars-thinkpad

Generally each slot has its own restart_lsn, so if one decoding session
gets stuck on this issue, another one won't necessarily fail at the same
time. However, restart_lsn can be advanced only to certain points,
mainly xl_running_xacts records, which are logged every 15 seconds. So if
all consumers acknowledge changes fast enough, it is quite likely that
during shutdown restart_lsn will be the same for all slots -- which
means either all of them will get stuck on further decoding or none of
them will. If not, different slots might have different restart_lsn values
and probably won't fail at the same time; but encountering this issue even
once suggests that your workload (i.e. many subtransactions) makes the
chance of landing on such a problematic restart_lsn noticeable. And each
restart_lsn probably has approximately the same chance of being 'bad'
(provided the workload is even).
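
To see whether several slots currently share a restart_lsn (and so
would likely succeed or fail together), one could group them by that
value. The following is only a rough sketch: it assumes the rows were
fetched beforehand with something like
"SELECT slot_name, restart_lsn FROM pg_replication_slots", and the
function names and sample rows are made up for illustration.

```python
from collections import defaultdict


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def group_slots_by_restart_lsn(rows):
    """Group slot names by restart_lsn, ordered by ascending LSN.

    rows: iterable of (slot_name, restart_lsn_text) pairs.
    """
    groups = defaultdict(list)
    for slot_name, restart_lsn in rows:
        if restart_lsn is None:  # slot has not reserved any WAL yet
            continue
        groups[parse_lsn(restart_lsn)].append(slot_name)
    return dict(sorted(groups.items()))


# Hypothetical sample data, not taken from a real server.
rows = [
    ("slot_a", "0/16B3748"),
    ("slot_b", "0/16B3748"),
    ("slot_c", "0/16B2000"),
]

for lsn, names in group_slots_by_restart_lsn(rows).items():
    # Print the LSN back in the usual hi/lo hexadecimal form.
    print(f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}", names)
```

Slots that end up in the same group would start reading WAL from the
same position after a restart; this says nothing about whether that
position is a 'bad' one.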

We need a committer familiar with this code to look here...

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
