Re: [PATCH] Better Performance for PostgreSQL with large INSERTs

From: Filip Janus <fjanus(at)redhat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Philipp Marek <philipp(at)marek(dot)priv(dot)at>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Better Performance for PostgreSQL with large INSERTs
Date: 2025-11-26 14:02:58
Message-ID: CAFjYY+JTULmdQUJcBK-hDGRSWZy0e+RFHrCy3vp9J1pCQ18+Ew@mail.gmail.com
Lists: pgsql-hackers

-Filip-

On Tue, Oct 7, 2025 at 16:54, Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2025-10-07 15:03:29 +0200, Philipp Marek wrote:
> > > Have you tried to verify that this doesn't cause performance regressions
> > > in other workloads? pq_recvbuf() has this code:
> > >
> > ...
> > >
> > > I do seem to recall that just increasing the buffer size substantially
> > > led to more time being spent inside that memmove() (likely due to
> > > exceeding L1/L2).
> >
> >
> > Do you have any pointers to discussions or other data about that?
> >
> >
> > My (quick) analysis was that clients that send one request,
> > wait for an answer, then send the next request wouldn't run that code
> > as there's nothing behind the individual requests that could be moved.
> >
> >
> > But yes, Pipeline Mode[1] might/would be affected.
> >
> > The interesting question is how much data can userspace copy before
> > that means more load than doing a userspace-kernel-userspace round trip.
> > (I guess that moving 64kB or 128kB should be quicker, especially since
> > the various CPU mitigations.)
>
> I unfortunately don't remember the details of where I saw it
> happening. Unfortunately I suspect it'll depend a lot on hardware and
> operating system details (like the security mitigations you mention) when
> it matters too.
>
>
> > As long as there are complete requests in the buffer the memmove()
> > could be avoided; only the initial part of the first incomplete request
> > might need moving to the beginning.
>
> Right. I'd be inclined to say that this ought to be addressed as part of
> this patch; that way we can be pretty sure it's not going to cause
> regressions.
>

I tried to benchmark the memmove() call, but I was unable to reach that
part of the code. Investigating further, I realized that the memmove()
call is probably dead code: pq_recvbuf() is called when
PqRecvPointer >= PqRecvLength, while the memmove() inside it runs only if
PqRecvLength > PqRecvPointer. The two conditions contradict each other, so
the memmove() branch should never be taken.
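
To illustrate the contradiction, here is a minimal standalone model. It is
not the actual pqcomm.c code: the names model_recvbuf_compact(),
model_caller_refill() and count_memmove_over_all_states(), the 64-byte
buffer, and the memmove_calls counter are made up for this sketch. It also
assumes that every caller guards the refill with
PqRecvPointer >= PqRecvLength, which is what I observed but have not
verified exhaustively.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define RECV_BUFFER_SIZE 64		/* small stand-in for PQ_RECV_BUFFER_SIZE */

/* Simplified stand-ins for the receive-buffer state (assumed layout). */
static char recv_buffer[RECV_BUFFER_SIZE];
static int	recv_pointer = 0;	/* next byte to read, like PqRecvPointer */
static int	recv_length = 0;	/* end of valid data, like PqRecvLength */
static int	memmove_calls = 0;	/* instrumentation only, hypothetical */

/* Model of the compaction step performed before refilling the buffer. */
static void
model_recvbuf_compact(void)
{
	if (recv_pointer > 0)
	{
		if (recv_length > recv_pointer)
		{
			/* unread data remains: left-justify it in the buffer */
			memmove(recv_buffer, recv_buffer + recv_pointer,
					recv_length - recv_pointer);
			recv_length -= recv_pointer;
			recv_pointer = 0;
			memmove_calls++;
		}
		else
			recv_length = recv_pointer = 0;
	}
}

/* Model of a caller that refills only once the buffer is fully consumed. */
static bool
model_caller_refill(void)
{
	if (recv_pointer >= recv_length)
	{
		model_recvbuf_compact();
		return true;
	}
	return false;
}

/*
 * Try every valid (pointer, length) state and count how often the
 * memmove() branch is reached through the guarded caller.
 */
static int
count_memmove_over_all_states(void)
{
	memmove_calls = 0;
	for (int len = 0; len <= RECV_BUFFER_SIZE; len++)
	{
		for (int ptr = 0; ptr <= len; ptr++)
		{
			recv_pointer = ptr;
			recv_length = len;
			model_caller_refill();
		}
	}
	return memmove_calls;
}
```

In this model, count_memmove_over_all_states() never reaches the memmove()
branch, because the caller's guard and the branch condition are mutually
exclusive. Calling model_recvbuf_compact() directly with unread data left
in the buffer does reach it, so the branch would only matter for a caller
that skips the guard.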

> > The documentation says
> >
> > > Pipelining is less useful, and more complex,
> > > when a single pipeline contains multiple transactions
> > > (see Section 32.5.1.3).
> >
> > are there any benchmarks/usage statistics for pipeline mode?
>
> You can write benchmarks for it using pgbench's pipeline support, with a
> custom script.
>
> Greetings,
>
> Andres Freund
>

I am also proposing a new GUC variable for setting PQ_RECV_BUFFER_SIZE in
the first patch. The second patch removes the dead code.

Filip

Attachment                                           Content-Type              Size
0002-Remove-dead-memmove-code-from-pq_recvbuf.patch  application/octet-stream  1.3 KB
0001-Add-configurable-receive-buffer-size.patch      application/octet-stream  5.6 KB
