Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5

From: Duncan Sands <duncan(dot)sands(at)deepbluecap(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
Date: 2025-05-24 15:42:30
Message-ID: 6bc28291-b212-4a84-925a-e6e5ef2fb72c@deepbluecap.com
Lists: pgsql-bugs

Dear Hayato Kuroda, thank you so much for working on this problem. Your patch
PG17-0001-Avoid-distributing-invalidation-messages-several-tim.patch solves the
issue for me. Without it I get an invalid memory alloc request error within
about twenty minutes. With your patch, 24 hours have passed with no errors.

Best wishes, Duncan.

On 21/05/2025 13:48, Hayato Kuroda (Fujitsu) wrote:
> Dear hackers,
>
>> I think the problem here is that when we are distributing
>> invalidations to a concurrent transaction, in addition to queuing the
>> invalidations as a change, we also copy the distributed invalidations
>> along with the original transaction's invalidations via repalloc in
>> ReorderBufferAddInvalidations. So, when there are many in-progress
>> transactions, each would try to copy all its accumulated invalidations
>> to the remaining in-progress transactions. This could lead to such an
>> increase in allocation request size. However, after queuing the
>> change, we don't need to copy it along with the original transaction's
>> invalidations. This is because the copy is only required when we don't
>> process any changes in cases like ReorderBufferForget(). I have
>> analyzed all such cases, and my analysis is as follows:
>
> Based on this analysis, I created a PoC that avoids the repalloc().
> Invalidation messages distributed by SnapBuildDistributeSnapshotAndInval() are
> only queued as changes and are not added to the receiving transaction's list, so
> the repalloc can be skipped. Also, the function distributes only the messages in
> the list, so received messages are not sent again.
>
> I have created a patch for PG17 for testing purposes. Duncan, could you apply it
> and confirm whether it solves the issue?
>
> Best regards,
> Hayato Kuroda
> FUJITSU LIMITED
>
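For readers following the analysis quoted above, here is a toy model of the
growth it describes. It is only a sketch and is not PostgreSQL code: the
function simulate(), its parameters, and the assumption that transactions
commit one after another while all later ones are still in progress are all
made up for illustration. Each transaction starts with own_msgs invalidation
messages and, on commit, hands its list to every transaction still in
progress. With redistribute_received=True the receiver also appends the
messages to its own list (the repalloc behaviour described above); with False
they are only queued.

```python
def simulate(num_txns, own_msgs=1, redistribute_received=True):
    """Toy model: return (list sizes at commit, queued message counts).

    Hypothetical illustration only; it does not mirror the real data
    structures in reorderbuffer.c.
    """
    # inval_list[i]: messages txn i hands to others when it commits.
    inval_list = [own_msgs] * num_txns
    # queued[i]: messages txn i has received as queued changes.
    queued = [0] * num_txns
    sizes_at_commit = []

    # Transactions commit in order 0, 1, 2, ...; later ones are in progress.
    for committer in range(num_txns):
        sending = inval_list[committer]
        sizes_at_commit.append(sending)
        for other in range(committer + 1, num_txns):
            # Received messages are always queued as a change.
            queued[other] += sending
            if redistribute_received:
                # Old behaviour: also copied into the receiver's own list,
                # so they get distributed again when the receiver commits.
                inval_list[other] += sending
    return sizes_at_commit, queued


if __name__ == "__main__":
    print(simulate(5, redistribute_received=True)[0])   # [1, 2, 4, 8, 16]
    print(simulate(5, redistribute_received=False)[0])  # [1, 1, 1, 1, 1]
```

In this model the list doubles with every commit when received messages are
copied and redistributed, so a few dozen overlapping transactions are enough
to produce a very large allocation request. When received messages are only
queued, each list stays at its original size and the queued count grows
linearly.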
