From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Vladlen Popolitov <v(dot)popolitov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Junwang Zhao <zhjwpku(at)gmail(dot)com> |
Subject: | Re: Some ExecSeqScan optimizations |
Date: | 2025-07-10 20:55:33 |
Message-ID: | ww3z3uhppkzt4gv46gmbarksfcaf2zvfexdie5ofa3gcqagfkh@prplcocjxakw |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> On Thu, Jul 10, 2025 at 8:34 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2025-01-22 10:07:51 +0900, Amit Langote wrote:
> > > On Fri, Jan 17, 2025 at 2:05 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > > Here's v5 with a few commit message tweaks.
> > > >
> > > > Barring objections, I would like to push this early next week.
> > >
> > > Pushed yesterday. Thank you all.
> >
> > While looking at a profile I recently noticed that ExecSeqScanWithQual() had a
> > runtime branch to test whether qual is NULL, which seemed a bit silly. I think
> > we should use pg_assume(), which I just added to avoid a compiler warning, to
> > improve the code generation here.
>
> +1. I think this might be what David was getting at in his first
> message in this thread.
Indeed.
> > The performance gain unsurprisingly isn't significant (but seems repeatably
> > measureable), but it does cut out a fair bit of unnecessary code.
> >
> > andres(at)awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
> > text data bss dec hex filename
> > 3330 0 0 3330 d02 executor_nodeSeqscan.c.assume.o
> > 3834 0 0 3834 efa executor_nodeSeqscan.c.o
> >
> > A 13% reduction in actual code size isn't bad for such a small change, imo.
>
> Yeah, that seems worthwhile. I had been a bit concerned about code
> size growth from having four variant functions with at least some
> duplication, so this is a nice offset.
I'm rather surprised by just how much the size reduces...
I built nodeSeqscan.c with -ffunction-sections and looked at the size with
size --format=sysv:
Before:
.text.SeqRecheck 6 0
.rodata.str1.8 135 0
.text.unlikely.SeqNext 53 0
.text.SeqNext 178 0
.text.ExecSeqScanEPQ 20 0
.text.ExecSeqScanWithProject 289 0
.text.unlikely.ExecSeqScanWithQual 53 0
.text.ExecSeqScanWithQual 441 0
.text.unlikely.ExecSeqScanWithQualProject 53 0
.text.ExecSeqScanWithQualProject 811 0
.text.unlikely.ExecSeqScan 53 0
.text.ExecSeqScan 245 0
.text.ExecInitSeqScan 287 0
.text.ExecEndSeqScan 33 0
.text.ExecReScanSeqScan 63 0
.text.ExecSeqScanEstimate 88 0
.text.ExecSeqScanInitializeDSM 114 0
.text.ExecSeqScanReInitializeDSM 34 0
.text.ExecSeqScanInitializeWorker 64 0
After:
.text.SeqRecheck 6 0
.rodata.str1.8 135 0
.text.unlikely.SeqNext 53 0
.text.SeqNext 178 0
.text.ExecSeqScanEPQ 20 0
.text.ExecSeqScanWithProject 209 0
.text.unlikely.ExecSeqScanWithQual 53 0
.text.ExecSeqScanWithQual 373 0
.text.unlikely.ExecSeqScanWithQualProject 53 0
.text.ExecSeqScanWithQualProject 474 0
.text.unlikely.ExecSeqScan 53 0
.text.ExecSeqScan 245 0
.text.ExecInitSeqScan 287 0
.text.ExecEndSeqScan 33 0
.text.ExecReScanSeqScan 63 0
.text.ExecSeqScanEstimate 88 0
.text.ExecSeqScanInitializeDSM 114 0
.text.ExecSeqScanReInitializeDSM 34 0
.text.ExecSeqScanInitializeWorker 64 0
I'm rather baffled that the size of ExecSeqScanWithQualProject goes from 811
to 474, just due to those null checks being removed... But I'll take it.
> Thanks for the patch.
>
> + /*
> + * Use pg_assume() for != NULL tests to make the compiler realize no
> + * runtime check for the field is needed in ExecScanExtended().
> + */
>
> I propose changing "to make the compiler realize no runtime check" to
> "so the compiler can optimize away the runtime check", assuming that
> is what it means.
It does. I don't really see a meaningful difference between the comments?
> Also, I assume you intentionally avoided repeating the comment in all
> the variant functions.
Yea, it didn't seem helpful to do so.
> > I have a separate question as well - do we need to call ResetExprContext() if
> > we neither qual, projection nor epq? I see a small gain by avoiding that.
>
> You're referring to this block, I assume:
>
> /*
> * If we have neither a qual to check nor a projection to do, just skip
> * all the overhead and return the raw scan tuple.
> */
> if (!qual && !projInfo)
> {
> ResetExprContext(econtext);
> return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
> }
Yep.
> Yeah, I think it's fine to remove ResetExprContext() here. When I
> looked at it before, I left it in because I was unsure whether
> accessMtd() might leak memory into the per-tuple context.
It's a good question. I think I unfortunately found a problematic case,
ForeignNext().
I wonder if we instead can MemoryContextReset cheaper, by avoiding a function
call for the common case that no reset is needed. Right now we can't just
check ->isReset in an inline function, because we also delete children. I
wonder if we could define isReset so that creating a child context unsets
isReset?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2025-07-10 20:57:06 | Re: pg_dump sort priority mismatch for large objects |
Previous Message | David E. Wheeler | 2025-07-10 20:55:15 | Re: encode/decode support for base64url |