From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Vladlen Popolitov <v(dot)popolitov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Junwang Zhao <zhjwpku(at)gmail(dot)com> |
Subject: | Re: Some ExecSeqScan optimizations |
Date: | 2025-07-11 02:22:36 |
Message-ID: | CA+HiwqF9b1Kuoz1h2+9ohjX_kB9KAHcgPpA-mi+8fKv67N0_mw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jul 11, 2025 at 5:55 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> > On Thu, Jul 10, 2025 at 8:34 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > The performance gain unsurprisingly isn't significant (but seems repeatably
> > > measureable), but it does cut out a fair bit of unnecessary code.
> > >
> > > andres(at)awork3:/srv/dev/build/postgres/m-dev-optimize$ size executor_nodeSeqscan.c.*o
> > > text data bss dec hex filename
> > > 3330 0 0 3330 d02 executor_nodeSeqscan.c.assume.o
> > > 3834 0 0 3834 efa executor_nodeSeqscan.c.o
> > >
> > > A 13% reduction in actual code size isn't bad for such a small change, imo.
> >
> > Yeah, that seems worthwhile. I had been a bit concerned about code
> > size growth from having four variant functions with at least some
> > duplication, so this is a nice offset.
>
> I'm rather surprised by just how much the size reduces...
>
> I built nodeSeqscan.c with -ffunction-sections and looked at the size with
> size --format=sysv:
>
> Before:
> .text.SeqRecheck 6 0
> .rodata.str1.8 135 0
> .text.unlikely.SeqNext 53 0
> .text.SeqNext 178 0
> .text.ExecSeqScanEPQ 20 0
> .text.ExecSeqScanWithProject 289 0
> .text.unlikely.ExecSeqScanWithQual 53 0
> .text.ExecSeqScanWithQual 441 0
> .text.unlikely.ExecSeqScanWithQualProject 53 0
> .text.ExecSeqScanWithQualProject 811 0
> .text.unlikely.ExecSeqScan 53 0
> .text.ExecSeqScan 245 0
> .text.ExecInitSeqScan 287 0
> .text.ExecEndSeqScan 33 0
> .text.ExecReScanSeqScan 63 0
> .text.ExecSeqScanEstimate 88 0
> .text.ExecSeqScanInitializeDSM 114 0
> .text.ExecSeqScanReInitializeDSM 34 0
> .text.ExecSeqScanInitializeWorker 64 0
>
> After:
> .text.SeqRecheck 6 0
> .rodata.str1.8 135 0
> .text.unlikely.SeqNext 53 0
> .text.SeqNext 178 0
> .text.ExecSeqScanEPQ 20 0
> .text.ExecSeqScanWithProject 209 0
> .text.unlikely.ExecSeqScanWithQual 53 0
> .text.ExecSeqScanWithQual 373 0
> .text.unlikely.ExecSeqScanWithQualProject 53 0
> .text.ExecSeqScanWithQualProject 474 0
> .text.unlikely.ExecSeqScan 53 0
> .text.ExecSeqScan 245 0
> .text.ExecInitSeqScan 287 0
> .text.ExecEndSeqScan 33 0
> .text.ExecReScanSeqScan 63 0
> .text.ExecSeqScanEstimate 88 0
> .text.ExecSeqScanInitializeDSM 114 0
> .text.ExecSeqScanReInitializeDSM 34 0
> .text.ExecSeqScanInitializeWorker 64 0
>
>
> I'm rather baffled that the size of ExecSeqScanWithQualProject goes from 811
> to 474, just due to those null checks being removed... But I'll take it.
Wow, indeed.
> > Thanks for the patch.
> >
> > + /*
> > + * Use pg_assume() for != NULL tests to make the compiler realize no
> > + * runtime check for the field is needed in ExecScanExtended().
> > + */
> >
> > I propose changing "to make the compiler realize no runtime check" to
> > "so the compiler can optimize away the runtime check", assuming that
> > is what it means.
>
> It does. I don't really see a meaningful difference between the comments?
Maybe not. I just had to pause for a moment to be sure that was what
it actually meant when I first read it. I'm fine leaving it as is if
you prefer.
> > > I have a separate question as well - do we need to call ResetExprContext() if
> > > we neither qual, projection nor epq? I see a small gain by avoiding that.
> >
> > You're referring to this block, I assume:
> >
> > /*
> > * If we have neither a qual to check nor a projection to do, just skip
> > * all the overhead and return the raw scan tuple.
> > */
> > if (!qual && !projInfo)
> > {
> > ResetExprContext(econtext);
> > return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
> > }
>
> Yep.
>
>
> > Yeah, I think it's fine to remove ResetExprContext() here. When I
> > looked at it before, I left it in because I was unsure whether
> > accessMtd() might leak memory into the per-tuple context.
>
> It's a good question. I think I unfortunately found a problematic case,
> ForeignNext().
Ah, so we do have a culprit in the tree.
> I wonder if we instead can MemoryContextReset cheaper, by avoiding a function
> call for the common case that no reset is needed. Right now we can't just
> check ->isReset in an inline function, because we also delete children. I
> wonder if we could define isReset so that creating a child context unsets
> isReset?
Were you thinking ResetExprContext() could become something like:
#define ResetExprContext(econtext) \
do { \
if (!((econtext)->ecxt_per_tuple_memory)->isReset) \
MemoryContextReset((econtext)->ecxt_per_tuple_memory); \
} while (0)
that is, once isReset also accounts for whether any child context exists?
--
Thanks, Amit Langote
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2025-07-11 02:33:28 | Re: A recent message added to pg_upgade |
Previous Message | Tatsuo Ishii | 2025-07-11 01:47:04 | Re: PG18 protocol version |