Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Xuneng Zhou <xunengzhou(at)gmail(dot)com>, Neil Chen <carpenter(dot)nail(dot)cz(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Subject: Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array
Date: 2026-01-08 23:58:55
Message-ID: CAD21AoA4ELkCCezPtF5Hn6W8yfagXf4LNJZqtBdDBe1+9GG0eA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 8, 2026 at 8:05 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Xuneng Zhou <xunengzhou(at)gmail(dot)com> writes:
> > On Thu, Jan 8, 2026 at 4:49 PM Neil Chen <carpenter(dot)nail(dot)cz(at)gmail(dot)com> wrote:
> >> I’ve given this patch a cursory review, and it looks good overall.
> >> Considering the impact of this change, should we add a regression test for it, or provide a unit test to ensure the correctness of the modification?
> >> I think it would be more receptive to merging such a "subtle performance optimization" only when its correctness is fully guaranteed.
>
> > Thanks for your review. I think we can add a tap test for it.
>
> What makes you think this code isn't adequately tested already?
> The coverage report at
>
> https://coverage.postgresql.org/src/backend/replication/logical/snapbuild.c.gcov.html
>
> shows SnapBuildPurgeOlderTxn as pretty fully exercised.
>
> A more interesting question is whether it's worth bothering with
> at all. This code only runs when we receive a running-xacts WAL
> record, which is infrequent. So it seems pretty likely to me that
> nobody could ever detect any performance difference from saving
> a palloc/pfree here. Perhaps the patch is worth applying on the
> grounds that it makes the function shorter and clearer, but that's
> pretty much in the eye of the beholder.

The 0001 patch optimizes the management of catalog-modifying XIDs
during the decoding of COMMIT records. Instead of doing qsort() for
every snapshot building, the patch maintains committed.xip in sorted
order.

While this accelerates logical decoding in scenarios with frequent DDL
execution, I share Tom's concern: the added complexity seems
disproportionate for such an infrequent use case. I am also not sure
that the optimization regarding the CONSISTENT state is necessary.

Furthermore, the patch introduces a palloc()/pfree() overhead for
every COMMIT record to replace qsort(). As indicated by the
results[1], this appears to cause a slight regression in common
workloads dominated by DML.

Regards,

[1] https://www.postgresql.org/message-id/CABPTF7USnHTLLzg%2BKsNTb0dCONM%3DcrM5tdH5HUZr4kNOq7tqAw%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Manni Wood 2026-01-09 00:01:59 Re: Speed up COPY FROM text/CSV parsing using SIMD
Previous Message Michael Paquier 2026-01-08 23:49:04 Re: IO wait events for COPY FROM/TO PROGRAM or file