Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2017-12-25 17:40:56
Message-ID: e9fd0b47-4fe9-0c6e-78d1-c288ac000ca7@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/24/2017 10:00 AM, Erik Rijkers wrote:
>>>>>
>>>>> logical replication of 2 instances is OK but 3 and up fail with:
>>>>>
>>>>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File:
>>>>> "reorderbuffer.c", Line: 1773)
>>>>>
>>>>> I can cobble up a script but I hope you have enough from the assertion
>>>>> to see what's going wrong...
>>>>
>>>> The assertion says that the iterator produces changes in order that
>>>> does
>>>> not correlate with LSN. But I have a hard time understanding how that
>>>> could happen, particularly because according to the line number this
>>>> happens in ReorderBufferCommit(), i.e. the current (non-streaming)
>>>> case.
>>>>
>>>> So instructions to reproduce the issue would be very helpful.
>>>
>>> Using:
>>>
>>> 0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
>>> 0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
>>> 0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
>>> 0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
>>> 0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
>>> 0006-Add-support-for-streaming-to-built-in-replication-v2.patch
>>>
>>> As you expected the problem is the same with these new patches.
>>>
>>> I have now tested more, and seen that it not always fails.  I guess that
>>> it here fails 3 times out of 4.  But the laptop I'm using at the moment
>>> is old and slow -- it may well be a factor as we've seen before [1].
>>>
>>> Attached is the bash that I put together.  I tested with
>>> NUM_INSTANCES=2, which yields success, and NUM_INSTANCES=3, which fails
>>> often.  This same program run with HEAD never seems to fail (I tried a
>>> few dozen times).
>>>
>>
>> Thanks. Unfortunately I still can't reproduce the issue. I even tried
>> running it in valgrind, to see if there are some memory access issues
>> (which should also slow it down significantly).
>
> One wonders again if 2ndquadrant shouldn't invest in some old hardware ;)
>

Well, I've done tests on various machines, including some really slow
ones, and I still haven't managed to reproduce the failures using your
script. So I don't think that would really help. But I have reproduced
it by using a custom stress test script.

Turns out the asserts are overly strict - instead of

Assert(prev_lsn < current_lsn);

it should have been

Assert(prev_lsn <= current_lsn);

because some XLOG records may contain multiple rows (e.g. MULTI_INSERT).

The attached v3 fixes this issue, and also a couple of other thinkos:

1) The AssertChangeLsnOrder assert check was somewhat broken.

2) We've been sending aborts for all subtransactions, even those not yet
streamed. So downstream got confused and fell over because of an assert.

3) The streamed transactions were written to /tmp, using filenames using
subscription OID and XID of the toplevel transaction. That's fine, as
long as there's just a single replica running - if there are more, the
filenames will clash, causing really strange failures. So move the files
to base/pgsql_tmp where regular temporary files are written. I'm not
claiming this is perfect, perhaps we need to invent another location.

FWIW I believe the relation sync cache is somewhat broken by the
streaming. I thought resetting it would be good enough, but it's more
complicated (and trickier) than that. I'm aware of it, and I'll look
into that next - but probably not before 2018.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v3.patch.gz application/gzip 6.8 KB
0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v3.patch.gz application/gzip 2.7 KB
0003-Issue-individual-invalidations-with-wal_level-log-v3.patch.gz application/gzip 4.8 KB
0004-Extend-the-output-plugin-API-with-stream-methods-v3.patch.gz application/gzip 5.3 KB
0005-Implement-streaming-mode-in-ReorderBuffer-v3.patch.gz application/gzip 10.7 KB
0006-Add-support-for-streaming-to-built-in-replication-v3.patch.gz application/gzip 14.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-12-25 21:33:53 Re: Unique indexes & constraints on partitioned tables
Previous Message Fabien COELHO 2017-12-25 16:17:33 Re: General purpose hashing func in pgbench