Re: unnecessary executor overheads around seqscans

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: unnecessary executor overheads around seqscans
Date: 2026-01-24 15:36:19
Message-ID: yk5rybusljgxmsrm2xh2se77lbzq7cz6avxuwa4xjq7flh2dic@4wfcvplykhoj
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2026-01-24 19:36:08 +1300, David Rowley wrote:
> On Sat, 24 Jan 2026 at 19:21, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Sat, Jan 24, 2026 at 5:16 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > Or perhaps we could just make it so that the entire if (scandesc == NULL)
> > > branch isn't needed?
> >
> > Kind of like ExecProcNodeFirst(), what if we replace the variant
> > selection in ExecInitSeqScan() with just:
>
> I imagined moving it to ExecInitSeqScan() and just avoid doing it when
> we're doing EXPLAIN or we're doing a parallel scan. Something like the
> attached, which is giving me a 4% speedup selecting from a million row
> table with a single int column running a seqscan query with a WHERE
> clause matching no rows.

Yes, i think that'd be better. Nice!

> > > We should change ExecStoreBufferHeapTuple() to return true. Nobody uses the
> > > current return value. Alternatively we should consider just moving it to
> > > somewhere heapam.c/heapam_handler.c can see the implementations, they're the
> > > only ones that should use it anyway.
> >
> > Makes sense. Changing ExecStoreBufferHeapTuple() to return true seems
> > like the simpler option, unless I misunderstood.
>
> It's probably too late to change it now, but wouldn't it have been
> better if scan_getnextslot had been coded to return the TupleTableSlot
> rather than bool? That way you could get the sibling call in
> ExecStoreBufferHeapTuple() and in SeqNext().

TupIsNull() is actually more expensive than a bool return value, because it
needs a memory fetch to complete... Except of course if we end up doing both.

SeqNext() shouldn't need a sibling call because it ought to be inlined. But:

> I also noticed my compiler does not inline SeqNext(). Adding a
> pg_attribute_always_inline results in it getting inlined and gives a
> small speedup.

Oh,m that's not good. I think we really had assumed that it would with the 18
changes around this. It does here, but that's probably because I use -O3.

> diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
> index b8119face43..87420e60dc9 100644
> --- a/src/backend/executor/nodeSeqscan.c
> +++ b/src/backend/executor/nodeSeqscan.c
> @@ -63,17 +63,6 @@ SeqNext(SeqScanState *node)
> direction = estate->es_direction;
> slot = node->ss.ss_ScanTupleSlot;
>
> - if (scandesc == NULL)
> - {
> - /*
> - * We reach here if the scan is not parallel, or if we're serially
> - * executing a scan that was planned to be parallel.
> - */
> - scandesc = table_beginscan(node->ss.ss_currentRelation,
> - estate->es_snapshot,
> - 0, NULL);
> - node->ss.ss_currentScanDesc = scandesc;
> - }

What about the "if we're serially executing a scan that was planned to be
parallel." part? I frankly don't really know what precisely was referencing...

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2026-01-24 15:56:21 Re: unnecessary executor overheads around seqscans
Previous Message Junwang Zhao 2026-01-24 15:35:47 Re: Converting README documentation to Markdown