| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Duncan Sands <duncan(dot)sands(at)deepbluecap(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5 |
| Date: | 2025-06-05 18:49:42 |
| Message-ID: | CAD21AoBaiMiAMLF-daEyB43hLbWA6fMmWWToGDMyp9V3kp149w@mail.gmail.com |
| Lists: | pgsql-bugs |
On Wed, Jun 4, 2025 at 11:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jun 5, 2025 at 3:19 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 3, 2025 at 11:48 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > On Wed, 4 Jun 2025 at 01:14, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Thank you for updating the patch. I have some comments and questions:
> >
> > In ReorderBufferAbort():
> >
> >     /*
> >      * We might have decoded changes for this transaction that could load
> >      * the cache as per the current transaction's view (consider DDL's
> >      * happened in this transaction). We don't want the decoding of future
> >      * transactions to use those cache entries so execute invalidations.
> >      */
> >     if (txn->ninvalidations > 0)
> >         ReorderBufferImmediateInvalidation(rb, txn->ninvalidations,
> >                                            txn->invalidations);
> >
> > I think that if the txn->invalidations_distributed is overflowed, we
> > would miss executing the txn->invalidations here. Probably the same is
> > true for ReorderBufferForget() and ReorderBufferInvalidate().
> >
>
> This is because of the following check "if
> (!rbtxn_inval_overflowed(txn))" in function
> ReorderBufferAddInvalidations(). What is the need of such a check in
> this function? We don't need to execute distributed invalidations in
> cases like ReorderBufferForget() when we haven't decoded any changes.
>
> > ---
> > I'd like to make it clear again which case we need to execute
> > txn->invalidations as well as txn->invalidations_distributed (like in
> > ReorderBufferProcessTXN()) and which case we need to execute only
> > txn->invalidations (like in ReorderBufferForget() and
> > ReorderBufferAbort()). I think it might be worth putting some comments
> > about overall strategy somewhere.
> >
> > ---
> > BTW for back branches, a simple fix without ABI breakage would be to
> > introduce the RBTXN_INVAL_OVERFLOWED flag to limit the size of
> > txn->invalidations. That is, we accumulate inval messages both coming
> > from the current transaction and distributed by other transactions but
> > once the size reaches the threshold we invalidate all caches. Is it
> > worth considering for back branches?
> >
>
> It should work and is worth considering. The main concern would be
> that it will hit sooner than we expect in the field, seeing the recent
> reports. So, such a change has the potential to degrade the
> performance. I feel that the number of people impacted due to
> performance would be more than the number of people impacted due to
> such an ABI change (adding the new members at the end of
> ReorderBufferTXN).

That's a fair point. I initially assumed that DDLs were not executed
often in practice, but analyzing this bug has made me realize this
assumption was misguided.

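For reference, the back-branch idea quoted above (cap the accumulated
invalidation messages and fall back to invalidating all caches once the
cap is reached) could look roughly like the standalone sketch below. It
does not use any PostgreSQL headers. ToyTxn, ToyInvalMsg, TOY_MAX_INVALS
and both functions are hypothetical names for illustration only; the
"overflowed" field merely stands in for the proposed
RBTXN_INVAL_OVERFLOWED flag, and the cap value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap; not a value taken from the thread or from PostgreSQL. */
#define TOY_MAX_INVALS 8

typedef struct ToyInvalMsg
{
    int         id;
} ToyInvalMsg;

typedef struct ToyTxn
{
    ToyInvalMsg *invals;        /* accumulated messages, own and distributed */
    size_t      ninvals;
    bool        overflowed;     /* stands in for RBTXN_INVAL_OVERFLOWED */
} ToyTxn;

/*
 * Append n messages. Once the cap would be exceeded, drop the array and
 * remember only that everything must be invalidated later.
 */
static void
toy_add_invalidations(ToyTxn *txn, size_t n, const ToyInvalMsg *msgs)
{
    ToyInvalMsg *grown;

    assert(txn != NULL);

    if (txn->overflowed || n == 0)
        return;                 /* nothing to track any more, or nothing to add */

    if (n > TOY_MAX_INVALS - txn->ninvals)
    {
        free(txn->invals);
        txn->invals = NULL;
        txn->ninvals = 0;
        txn->overflowed = true;
        return;
    }

    grown = realloc(txn->invals, (txn->ninvals + n) * sizeof(ToyInvalMsg));
    if (grown == NULL)
        abort();
    memcpy(grown + txn->ninvals, msgs, n * sizeof(ToyInvalMsg));
    txn->invals = grown;
    txn->ninvals += n;
}

/*
 * Returns the number of individual messages to execute, or -1 when the
 * caller should invalidate all caches instead.
 */
static long
toy_execute_invalidations(const ToyTxn *txn)
{
    assert(txn != NULL);

    if (txn->overflowed)
        return -1;
    return (long) txn->ninvals;
}
```

The trade-off raised in the quoted reply shows up directly here: the
smaller the cap, the sooner a transaction switches to the full-reset
path, which bounds memory but costs more cache rebuilds.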
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com