Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2017-12-24 09:00:00
Message-ID: 84b7076830fbedc155670b859926e99e@xs4all.nl
Lists: pgsql-hackers

>>>>
>>>> logical replication of 2 instances is OK but 3 and up fail with:
>>>>
>>>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File:
>>>> "reorderbuffer.c", Line: 1773)
>>>>
>>>> I can cobble up a script but I hope you have enough from the
>>>> assertion
>>>> to see what's going wrong...
>>>
>>> The assertion says that the iterator produces changes in order that
>>> does
>>> not correlate with LSN. But I have a hard time understanding how that
>>> could happen, particularly because according to the line number this
>>> happens in ReorderBufferCommit(), i.e. the current (non-streaming)
>>> case.
>>>
>>> So instructions to reproduce the issue would be very helpful.
>>
>> Using:
>>
>> 0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
>> 0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
>> 0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
>> 0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
>> 0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
>> 0006-Add-support-for-streaming-to-built-in-replication-v2.patch
>>
>> As you expected the problem is the same with these new patches.
>>
>> I have now tested more, and seen that it does not always fail.  I guess
>> that it fails here 3 times out of 4.  But the laptop I'm using at the
>> moment is old and slow -- it may well be a factor as we've seen before [1].
>>
>> Attached is the bash that I put together.  I tested with
>> NUM_INSTANCES=2, which yields success, and NUM_INSTANCES=3, which
>> fails
>> often.  This same program run with HEAD never seems to fail (I tried a
>> few dozen times).
>>
>
> Thanks. Unfortunately I still can't reproduce the issue. I even tried
> running it in valgrind, to see if there are some memory access issues
> (which should also slow it down significantly).

One wonders again if 2ndquadrant shouldn't invest in some old hardware
;)

Another Good Thing would be if there was a provision in the buildfarm to
test patches like these.

But I'm probably not the first one to suggest that; no doubt it'll be
possible someday. In the meantime I'll try to repeat this crash on
other machines (but that will be after the holidays).

Erik Rijkers
