Re: Assertion failure while streaming toasted data

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Assertion failure while streaming toasted data
Date: 2021-05-25 07:56:40
Message-ID: CAFiTN-vQ2drczfnbYMYY+myLK1DipaaQ0N2N0SKR0a5VLEgvhw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 25, 2021 at 12:12 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, May 25, 2021 at 12:06 PM Pavan Deolasee
> <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > While working on an output plugin that uses streaming protocol, I hit an assertion failure. Further investigations revealed a possible bug in core Postgres. This must be new to PG14 since streaming support is new to this release. I extended the test_decoding regression test to demonstrate the failure. PFA
> >
> > ```
> > 2021-05-25 11:32:19.493 IST client backend[68321] pg_regress/stream STATEMENT: SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '
> > 1', 'stream-changes', '1');
> > TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line: 3476, PID: 68321)
> > ```
> >
> > From my preliminary analysis, it looks like we fail to adjust the memory accounting after streaming toasted tuples. More concretely, after `ReorderBufferProcessPartialChange()` processes the in-progress transaction, `ReorderBufferTruncateTXN()` truncates the accumulated changed in the transaction, but fails to adjust the buffer size for toast chunks. Maybe we are missing a call to `ReorderBufferToastReset()` somewhere?
> >
> > From what I see, the assertion only triggers when data is inserted via COPY (multi-insert).
> >
> > Let me know if anything else is needed to reproduce this.
>
> Thanks, I will look into this and let you know if need some help.

I have identified the cause of the issue, basically, the reason is if
we are doing a multi insert operation we don't set the toast cleanup
until we get the last tuple of the xl_multi_insert [1]. Now, with
streaming, we can process the transaction in between the multi-insert
but while doing that the "change->data.tp.clear_toast_afterwards" is
set to false in all the tuples in this stream. And due to this we will
not clean up the toast.

One simple fix could be that we can just clean the toast memory if we
are processing in the streaming mode (as shown in the attached patch).
But maybe that is not the best-optimized solution, ideally, we can
save the toast until we process the last tuple of multi-insert in the
current stream, but I think that's not an easy thing to identify.

[1]
/*
* Reset toast reassembly state only after the last row in the last
* xl_multi_insert_tuple record emitted by one heap_multi_insert()
* call.
*/
if (xlrec->flags & XLH_INSERT_LAST_IN_MULTI &&
(i + 1) == xlrec->ntuples)
change->data.tp.clear_toast_afterwards = true;
else
change->data.tp.clear_toast_afterwards = false;

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
streaming_toast_bug_fix.patch text/x-patch 650 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2021-05-25 08:13:27 RE: locking [user] catalog tables vs 2pc vs logical rep
Previous Message tsunakawa.takay@fujitsu.com 2021-05-25 07:40:12 RE: Parallel Inserts in CREATE TABLE AS