Re: ERROR: subtransaction logged without previous top-level txn record

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Hsu, John" <hsuchen(at)amazon(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: ERROR: subtransaction logged without previous top-level txn record
Date: 2020-02-03 12:54:22
Message-ID: CAA4eK1LdNmrib1jub8b=KvYUrzXW0VT4P3MVPMyiMfMY3K64dA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Mon, Feb 3, 2020 at 2:50 PM Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> wrote:
>
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>
> >> I don't see a bug here. At least in the reproduced scenario I see a
> >> false alarm, as explained above: a transaction with a skipped
> >> xl_xact_assignment won't be streamed, as it finishes before
> >> confirmed_flush_lsn.
> >>
> >
> > Does this guarantee come from the fact that we need to wait for such a
> > transaction before reaching a consistent snapshot state? If not, can
> > you explain a bit more what makes you say so?
>
> Right, see the FULL_SNAPSHOT -> SNAPBUILD_CONSISTENT transition -- it
> exists exactly for this purpose: once we have a good snapshot, we need
> to wait for all running xacts to finish, so that we see in full every
> xact we are promising to stream.
>
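
For readers following along, the waiting rule described in the quoted text can be sketched as a toy state machine. This is only an illustration, not the snapbuild.c code: the state names echo the ones used in this thread, while ToyBuilder, toy_reach_full_snapshot, toy_on_running_xacts and the wait_below_xid field are hypothetical names, and xid wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Toy states; the names echo those mentioned in the thread. */
typedef enum
{
    TOY_BUILDING_SNAPSHOT,
    TOY_FULL_SNAPSHOT,
    TOY_CONSISTENT
} ToyState;

/* Hypothetical builder: remembers which xacts it still has to wait for. */
typedef struct
{
    ToyState state;
    unsigned wait_below_xid;    /* xacts with xid < this were running when
                                 * FULL_SNAPSHOT was reached */
} ToyBuilder;

/* Enter FULL_SNAPSHOT and note the next xid to be assigned at that time. */
void
toy_reach_full_snapshot(ToyBuilder *b, unsigned next_xid)
{
    assert(b != NULL);
    b->state = TOY_FULL_SNAPSHOT;
    b->wait_below_xid = next_xid;
}

/*
 * Handle a "running xacts" report.  Once the oldest xact still running is
 * not older than wait_below_xid, every xact that was in progress when
 * FULL_SNAPSHOT was reached has finished, so the builder becomes CONSISTENT.
 */
void
toy_on_running_xacts(ToyBuilder *b, unsigned oldest_running_xid)
{
    assert(b != NULL);
    if (b->state == TOY_FULL_SNAPSHOT &&
        oldest_running_xid >= b->wait_below_xid)
        b->state = TOY_CONSISTENT;
}
```

In this model, a transaction that was already running when FULL_SNAPSHOT was reached always finishes before the builder turns CONSISTENT, which is the guarantee being discussed.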

So, doesn't this mean that it started occurring after the fix done in
commit 96b5033e11 [1]? Before that fix, we wouldn't have processed
XLOG_XACT_ASSIGNMENT records unless we were in the
SNAPBUILD_FULL_SNAPSHOT state. I am not saying the fix in that
commit is wrong; I am just trying to understand the situation here.
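
To make the before/after question concrete, here is a minimal sketch of the gating as I read it from the paragraph above and from the commit title in [1]. It is a toy model under those assumptions, not the actual decoding code: SketchState, SketchRecord and both sketch_process_* functions are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Toy snapshot-builder states, ordered from earliest to latest. */
typedef enum
{
    SKETCH_START,
    SKETCH_BUILDING_SNAPSHOT,
    SKETCH_FULL_SNAPSHOT,
    SKETCH_CONSISTENT
} SketchState;

/* Toy record kinds; only the assignment record matters here. */
typedef enum
{
    SKETCH_REC_ASSIGNMENT,
    SKETCH_REC_OTHER
} SketchRecord;

/*
 * Assumed behaviour before the fix: no transaction record, assignment
 * records included, is processed until FULL_SNAPSHOT has been reached.
 */
bool
sketch_process_before_fix(SketchState state, SketchRecord rec)
{
    (void) rec;                 /* record kind makes no difference here */
    return state >= SKETCH_FULL_SNAPSHOT;
}

/*
 * Assumed behaviour after the fix: assignment records are processed even
 * while the snapshot is still being built; other records are gated as before.
 */
bool
sketch_process_after_fix(SketchState state, SketchRecord rec)
{
    return rec == SKETCH_REC_ASSIGNMENT || state >= SKETCH_FULL_SNAPSHOT;
}
```

The only case where the two functions disagree is an assignment record seen before FULL_SNAPSHOT, which is the case the question above is about.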

>
> Well, almost. This is true as long as the initial snapshot construction
> process goes the long way of building the snapshot by itself. If it
> happens to pick up from disk a ready snapshot pickled there by another
> decoding session, it fast-paths to SNAPBUILD_CONSISTENT, which is
> technically a bug, as described in
> https://www.postgresql.org/message-id/87ftjifoql.fsf%40ars-thinkpad
>

Can't we deal with this separately? If so, let's not mix the two
discussions, as their root causes seem to be different.

[1] -
commit bac2fae05c7737530a6fe8276cd27d210d25c6ac
Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Date: 2019-09-13 16:36:28 -0300

logical decoding: process ASSIGNMENT during snapshot build

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
