Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2017-12-24 00:42:01
Message-ID: a0e1f9ba-45bd-37a6-823b-0864fb2e4b22@2ndquadrant.com
Lists: pgsql-hackers

On 12/23/2017 11:23 PM, Erik Rijkers wrote:
> On 2017-12-23 21:06, Tomas Vondra wrote:
>> On 12/23/2017 03:03 PM, Erikjan Rijkers wrote:
>>> On 2017-12-23 05:57, Tomas Vondra wrote:
>>>> Hi all,
>>>>
>>>> Attached is a patch series that adds two features to logical
>>>> replication - the ability to define a memory limit for the reorderbuffer
>>>> (responsible for building the decoded transactions), and the ability to
>>>> stream large in-progress transactions (exceeding the memory limit).
>>>>
>>>
>>> logical replication of 2 instances is OK but 3 and up fail with:
>>>
>>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File:
>>> "reorderbuffer.c", Line: 1773)
>>>
>>> I can cobble up a script but I hope you have enough from the assertion
>>> to see what's going wrong...
>>
>> The assertion says that the iterator produces changes in an order that
>> does not correlate with LSN. But I have a hard time understanding how that
>> could happen, particularly because according to the line number this
>> happens in ReorderBufferCommit(), i.e. the current (non-streaming) case.
>>
>> So instructions to reproduce the issue would be very helpful.
>
> Using:
>
> 0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
> 0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
> 0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
> 0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
> 0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
> 0006-Add-support-for-streaming-to-built-in-replication-v2.patch
>
> As you expected the problem is the same with these new patches.
>
> I have now tested more, and seen that it does not always fail.  I guess
> that here it fails 3 times out of 4.  But the laptop I'm using at the
> moment is old and slow -- that may well be a factor, as we've seen
> before [1].
>
> Attached is the bash script that I put together.  I tested with
> NUM_INSTANCES=2, which yields success, and NUM_INSTANCES=3, which fails
> often.  This same program run with HEAD never seems to fail (I tried a
> few dozen times).
>

Thanks. Unfortunately I still can't reproduce the issue. I even tried
running it under valgrind, to see if there are any memory access issues
(which should also slow it down significantly).
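
For anyone following along, here is a simplified sketch of the invariant
the quoted assertion checks. It is not the reorderbuffer.c code: the
names iter_changes_in_lsn_order and replay, and the (lsn, description)
tuples, are made up for illustration. The assumption is that each
(sub)transaction keeps its changes sorted by LSN, and that the iterator
merges those lists through a heap so the combined stream comes out in
strictly increasing LSN order.

```python
import heapq
from typing import Iterator, List, Tuple

# (lsn, description); both fields are illustrative, not PostgreSQL structs.
Change = Tuple[int, str]


def iter_changes_in_lsn_order(streams: List[List[Change]]) -> Iterator[Change]:
    """Merge per-(sub)transaction change lists into one stream.

    Each input list is assumed to be sorted by LSN already; the heap only
    decides which list supplies the next change.
    """
    heap: List[Tuple[int, int, int]] = []
    for idx, stream in enumerate(streams):
        if stream:
            # Heap entries are (lsn of the list's next change, list index, position).
            heapq.heappush(heap, (stream[0][0], idx, 0))

    while heap:
        _lsn, idx, pos = heapq.heappop(heap)
        yield streams[idx][pos]
        if pos + 1 < len(streams[idx]):
            heapq.heappush(heap, (streams[idx][pos + 1][0], idx, pos + 1))


def replay(streams: List[List[Change]]) -> List[str]:
    """Consume the merged stream, checking the same ordering invariant."""
    last_lsn = -1
    applied = []
    for lsn, what in iter_changes_in_lsn_order(streams):
        # Analogue of the failing check: LSNs must be strictly increasing.
        assert last_lsn < lsn, f"changes out of LSN order: {last_lsn} >= {lsn}"
        last_lsn = lsn
        applied.append(what)
    return applied
```

In this sketch the assertion can only fire if one of the input lists is
itself out of LSN order (or two changes share an LSN), since the heap
always returns the smallest pending LSN. That is why a failure in the
non-streaming path is surprising.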

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
