Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2017-12-23 22:23:57
Message-ID: a98691f0d50701efc492e41b2e102eca@xs4all.nl
Lists: pgsql-hackers

On 2017-12-23 21:06, Tomas Vondra wrote:
> On 12/23/2017 03:03 PM, Erikjan Rijkers wrote:
>> On 2017-12-23 05:57, Tomas Vondra wrote:
>>> Hi all,
>>>
>>> Attached is a patch series that implements two features to the logical
>>> replication - ability to define a memory limit for the reorderbuffer
>>> (responsible for building the decoded transactions), and ability to
>>> stream large in-progress transactions (exceeding the memory limit).
>>>
>>
>> logical replication of 2 instances is OK but 3 and up fail with:
>>
>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File:
>> "reorderbuffer.c", Line: 1773)
>>
>> I can cobble up a script but I hope you have enough from the assertion
>> to see what's going wrong...
>
> The assertion says that the iterator produces changes in order that does
> not correlate with LSN. But I have a hard time understanding how that
> could happen, particularly because according to the line number this
> happens in ReorderBufferCommit(), i.e. the current (non-streaming)
> case.
>
> So instructions to reproduce the issue would be very helpful.
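
For readers following along, the invariant behind that assertion can be
illustrated with a toy model. This is only a sketch under assumptions, not
the reorderbuffer.c code: the function name merge_changes_by_lsn, the
(lsn, payload) tuples, and the heap-based merge are all hypothetical
stand-ins for the iterator that merges per-(sub)transaction change lists.

```python
import heapq


def merge_changes_by_lsn(streams):
    """Merge per-(sub)transaction change lists into one sequence by LSN.

    Hypothetical model: each stream is a list of (lsn, payload) tuples
    that is expected to be sorted by lsn already.
    """
    # Seed the heap with the first change of every non-empty stream.
    heap = [(stream[0][0], idx, 0) for idx, stream in enumerate(streams) if stream]
    heapq.heapify(heap)

    merged = []
    last_lsn = None
    while heap:
        lsn, idx, pos = heapq.heappop(heap)
        # Toy analogue of the failed check: every change handed out by the
        # iterator must have a strictly larger LSN than the previous one.
        assert last_lsn is None or last_lsn < lsn, (
            f"out of order: last_lsn={last_lsn}, change lsn={lsn}"
        )
        last_lsn = lsn
        merged.append(streams[idx][pos])
        # Queue the next change from the same stream, if there is one.
        if pos + 1 < len(streams[idx]):
            heapq.heappush(heap, (streams[idx][pos + 1][0], idx, pos + 1))
    return merged
```

In this model the check can only trip when one of the input lists is itself
not sorted by LSN (or two changes share an LSN), which is why an
out-of-order result from the merge is surprising.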

Using:

0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
0006-Add-support-for-streaming-to-built-in-replication-v2.patch

As you expected, the problem is the same with these new patches.

I have now tested more, and seen that it does not always fail. I estimate
that it fails here about 3 times out of 4. But the laptop I'm using at the
moment is old and slow, which may well be a factor, as we've seen before [1].

Attached is the bash script that I put together. I tested with
NUM_INSTANCES=2, which succeeds, and NUM_INSTANCES=3, which fails
often. The same script run against HEAD never seems to fail (I tried a
few dozen times).

thanks,

Erik Rijkers

[1]
https://www.postgresql.org/message-id/3897361c7010c4ac03f358173adbcd60%40xs4all.nl

Attachment Content-Type Size
test.sh text/x-shellscript 7.3 KB
