Re: Proposal: scan key push down to heap [WIP]

From: Andres Freund <andres(at)anarazel(dot)de>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: scan key push down to heap [WIP]
Date: 2016-10-14 06:24:59
Message-ID: 20161014062459.6mqjgvg6gft7cwnd@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2016-10-11 17:27:56 +0530, Dilip Kumar wrote:
> I would like to propose a patch for pushing down the scan key to heap.

I think that makes sense. Both because scankey eval is faster than
generic expression eval, and because it saves a lot of function calls in
heavily filtered cases.

> However in future when we try to expand the support for complex
> expressions, then we need to be very careful while selecting
> pushable expression. It should not happen that we push something very
> complex, which may cause contention with other write operation (as
> HeapKeyTest is done under page lock).

I don't think it's a good idea to do this under the content lock in any
case. But luckily I don't think we have to do so at all.

Due to pagemode - which is used by pretty much everything iterating over
heaps, and definitely seqscans - the key test already happens without
the content lock held, in heapgettup_pagemode():
/*
* ----------------
* heapgettup_pagemode - fetch next heap tuple in page-at-a-time mode
*
* Same API as heapgettup, but used in page-at-a-time mode
*
* The internal logic is much the same as heapgettup's too, but there are some
* differences: *****we do not take the buffer content lock**** (that only needs to
* happen inside heapgetpage), and we iterate through just the tuples listed
* in rs_vistuples[] rather than all tuples on the page. Notice that
* lineindex is 0-based, where the corresponding loop variable lineoff in
* heapgettup is 1-based.
* ----------------
*/
static void
heapgettup_pagemode(HeapScanDesc scan,
ScanDirection dir,
int nkeys,
ScanKey key)
...
/*
* advance the scan until we find a qualifying tuple or run out of stuff
* to scan
*/
for (;;)
{
while (linesleft > 0)
{
lineoff = scan->rs_vistuples[lineindex];
lpp = PageGetItemId(dp, lineoff);
Assert(ItemIdIsNormal(lpp));
...
/*
* if current tuple qualifies, return it.
*/
if (key != NULL)
{
bool valid;

HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd),
nkeys, key, valid);
if (valid)
{
scan->rs_cindex = lineindex;
return;
}
}

> Instructions: (Cpu instructions measured with callgrind tool):

Note that callgrind's numbers aren't very meaningful in these days. CPU
pipelining and speculative execution/reordering makes them very
inaccurate.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-10-14 07:56:33 Re: proposal: session server side variables
Previous Message Michael Paquier 2016-10-14 06:09:16 Re: pg_dump, pg_dumpall and data durability