Re: ERROR: subtransaction logged without previous top-level txn record

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>
Cc: "Hsu, John" <hsuchen(at)amazon(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: ERROR: subtransaction logged without previous top-level txn record
Date: 2020-02-10 11:03:31
Message-ID: CAA4eK1Kcsib6UG7zFPrL-h2fvnByfKWxq8Xvzg=2hxUebYwt=g@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Sun, Feb 9, 2020 at 9:37 PM Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>
> >> 1) Decoding from existing slot (*not* initial snapshot construction)
> >> starts up, immediately picks up snapshot at restart_lsn (getting into
> >> SNAPBUILD_CONSISTENT) and in some xl_xact_assignment learns that it
> >> hadn't seen in full (no toplevel records) a transaction which it is not
> >> even going to stream -- but still dies with "subtransaction logged
> >> without...". That's my example above, and that's what people are
> >> complaining about. Here, usage of serialized snapshot and jump to
> >> SNAPBUILD_CONSISTENT is not just legit, it is essential: in order to be
> >> able to stream data since confirmed_flush_lsn, we must pick it up as we
> >> might not be able to assemble it from scratch in time. I mean, what is
> >> wrong here is not serialized snapshot usage but the check.
> >>
> >
> > I was thinking if we have some way to skip processing of
> > xl_xact_assignment for such cases, then it might be better. Say,
> > along with restart_lsn, if we have some way to find the corresponding nextXid
> > (below which we don't need to process records).
>
> I don't believe you can do that without persisting additional
> data. Basically, what we need is a list of transactions that are running at
> the point of snapshot serialization *and* already wrote something before
> it -- those we hadn't seen in full and can't decode. We have no such
> data currently. The closest thing we have is xl_running_xacts->nextXid,
> but
>
> 1) an issued xid doesn't necessarily mean the xact actually wrote something,
> so we can't just skip xl_xact_assignment for xid < nextXid, it might
> still be decoded
> 2) snapshot might be serialized not at xl_running_xacts anyway
>
> Surely this thing doesn't deserve changing persisted data format.
>

I agree that it won't be a good idea to change the persisted data
format, especially in back-branches. I don't see any fix which can
avoid this without doing major changes in the code. Apart from this,
we have to come up with a solution for point (3) discussed in the
above email [1], which again could be a change in design. I think we can
first try to proceed with the patch
0002-Stop-demanding-that-top-xact-must-be-seen-before--v2 and then we
can discuss the other patch. I can't see a way to write a test case
for this, can you think of any way?

Andres, anyone else, if you have a better idea other than changing the
code (removing the expected error) as in
0002-Stop-demanding-that-top-xact-must-be-seen-before--v2, then
please let us know. Points (1) and (3) in the email above [1] describe
valid cases in which the error check below is hit. We
have discussed this in detail, but couldn't come up with anything
better than to remove this check.

@@ -778,9 +778,6 @@ ReorderBufferAssignChild(ReorderBuffer *rb, TransactionId xid,
 	txn = ReorderBufferTXNByXid(rb, xid, true, &new_top, lsn, true);
 	subtxn = ReorderBufferTXNByXid(rb, subxid, true, &new_sub, lsn, false);
 
-	if (new_top && !new_sub)
-		elog(ERROR, "subtransaction logged without previous top-level txn record");
-
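
To make the condition in the removed check concrete, here is a toy model of it. This is only a sketch and not PostgreSQL code: ToyBuffer, toy_txn_by_xid, toy_assign_child and the xids 100 and 101 are hypothetical stand-ins, and the record ordering in toy_scenario_check_fires is an assumption based on point (1) in [1]. It shows one way `new_top && !new_sub` can become true when decoding starts from a serialized snapshot and sees a subtransaction before anything from its top-level transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

#define TOY_MAX_TXNS 16

/* Hypothetical stand-in for the reorder buffer: only the set of xids seen so far. */
typedef struct ToyBuffer
{
	ToyXid		known[TOY_MAX_TXNS];
	size_t		nknown;
} ToyBuffer;

/*
 * Look up xid and create an entry if it is absent.
 * *is_new reports whether the entry had to be created by this call.
 */
static void
toy_txn_by_xid(ToyBuffer *buf, ToyXid xid, bool *is_new)
{
	for (size_t i = 0; i < buf->nknown; i++)
	{
		if (buf->known[i] == xid)
		{
			*is_new = false;
			return;
		}
	}

	assert(buf->nknown < TOY_MAX_TXNS);
	buf->known[buf->nknown++] = xid;
	*is_new = true;
}

/*
 * Model of processing an assignment record linking subxid to xid.
 * Returns true when the condition of the removed check holds, i.e. the
 * top-level entry is new but the subtransaction entry already existed.
 */
static bool
toy_assign_child(ToyBuffer *buf, ToyXid xid, ToyXid subxid)
{
	bool		new_top;
	bool		new_sub;

	toy_txn_by_xid(buf, xid, &new_top);
	toy_txn_by_xid(buf, subxid, &new_sub);

	return new_top && !new_sub;
}

/*
 * Assumed ordering: decoding starts at restart_lsn, first sees a change made
 * by subtransaction 101, and only later sees the assignment record naming
 * top-level xid 100, for which nothing has been seen since restart_lsn.
 */
static bool
toy_scenario_check_fires(void)
{
	ToyBuffer	buf = {0};
	bool		is_new;

	toy_txn_by_xid(&buf, 101, &is_new);		/* change by the subxact */
	return toy_assign_child(&buf, 100, 101);	/* assignment record */
}
```

In this model the condition holds only for the ordering above. If both xids are first seen in the assignment record, or the top-level xid is seen first, it evaluates to false.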

[1] - https://www.postgresql.org/message-id/87zhdx76d5.fsf%40ars-thinkpad

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
