| From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
|---|---|
| To: | Noah Misch <noah(at)leadboat(dot)com> |
| Cc: | Nitin Motiani <nitinmotiani(at)google(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Inval reliability, especially for inplace updates |
| Date: | 2026-01-21 16:59:51 |
| Message-ID: | CAHgHdKsQYHz4XDOtUwNSn+JRv1kyatLiNYLHu4Y9V_xR4Y5kpA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Oct 23, 2024 at 7:54 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> With the releases wrapping in 2.5 weeks, I'm ambivalent about pushing this
> before the release or after. Pushing before means fewer occurrences of
> corruption, but pushing after gives more bake time to discover these
> changes
> were defective. It's hard to predict which helps users more, on a
> risk-adjusted basis. I'm leaning toward pushing this week. Opinions?
>
> On Sun, Oct 20, 2024 at 06:41:37PM +0530, Nitin Motiani wrote:
> > I tested the patch locally and it works. And I have no other question
> > regarding the structure. So this patch looks good to me to commit.
>
> Thanks. While resolving a back-branch merge conflict, I found
> AtEOXact_Inval() and AtEOSubXact_Inval() were skipping inplaceInvalInfo
> tasks
> if transInvalInfo==NULL. If one PreInplace_Inval() failed, the session's
> next
> inplace update would get an assertion failure. Non-assert builds left
> inplaceInvalInfo pointing to freed memory, but I didn't find a reachable
> malfunction. Separately, the xact.c comment edit wasn't reflecting that v4
> brought back the transactional inval. v6 fixes those. Regarding the
> back-branch alternatives to the WAL format change:
>
> On Tue, Jun 18, 2024 at 08:23:49AM -0700, Noah Misch wrote:
> > On Mon, Jun 17, 2024 at 06:57:30PM -0700, Andres Freund wrote:
> > > On 2024-06-17 16:58:54 -0700, Noah Misch wrote:
> > > > On Sat, Jun 15, 2024 at 03:37:18PM -0700, Noah Misch wrote:
> > > > > - heap_xlog_inplace() could set the shared-inval-queue overflow
> signal on
> > > > > every backend. This is more wasteful, but inplace updates might
> be rare
> > > > > enough (~once per VACUUM) to make it tolerable.
> > >
> > > We already set that surprisingly frequently, as
> > > a) The size of the sinval queue is small
> > > b) If a backend is busy, it does not process catchup interrupts
> > > (i.e. executing queries, waiting for a lock prevents processing)
> > > c) There's no deduplication of invals, we often end up sending the
> same inval
> > > over and over.
> > >
> > > So I suspect this might not be too bad, compared to the current
> badness.
> >
> > That is good.
>
> I benchmarked that by hacking 027_stream_regress.pl to run "pgbench
> --no-vacuum --client=4 -T 30 -b select-only" on the standby, concurrent
> with
> the primary running the regression tests. Standby clients acted on sinval
> resetState ~16k times, and pgbench tps decreased 4.5%. That doesn't
> necessarily mean the real-life cost would be unacceptable, but it was
> enough
> of a decrease that I switched to the next choice:
>
> > We might be able to do the overflow signal once at end of
> > recovery, like RelationCacheInitFileRemove() does for the init file.
> That's
> > mildly harder to reason about, but it would be cheaper. Hmmm.
>
> I'm attaching the branch-specific patches for that and for the main fix.
> Other notes from back-patching:
>
> - All branches change the ABI of PrepareToInvalidateCacheTuple(), a
> function
> catcache.c exports for the benefit of inval.c. No PGXN extension calls
> that, and I can't think of a use case in extensions.
>
Unfortunately, I can think of four. I have four Table Access Methods that
I now need to fork to be compatible with 18.0 and 18.1 on the one hand, and
18.2 onward on the other.
I'm sorry I didn't follow this thread before it got pushed.
Is there a reason for doing this change in back branches? The thread is
pretty long, and I'm struggling to find a security or stability
justification for the ABI break, but perhaps there is one.
>
> - Due to v15 commit 3aafc03, the patch for v14 differs to an unusual degree
> from the patch for v15+v16. The difference is mostly mechanical, though.
>
> Thanks,
> nm
>
--
*Mark Dilger*
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2026-01-21 17:21:55 | Re: [PATCH] Provide support for trailing commas |
| Previous Message | Peter Eisentraut | 2026-01-21 16:59:29 | Re: [PATCH] Provide support for trailing commas |