Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-02-09 10:50:52
Message-ID: CAA4eK1LhbM-cESEVF0+1=B-nD+J+A_zy9M2oD8FNkTQiFCstTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 8, 2015 at 11:03 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Sat, Feb 7, 2015 at 10:36 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> >> Well, I agree with you, but I'm not really sure what that has to do
> >> with the issue at hand. I mean, if we were to apply Amit's patch,
> >> we'd been in a situation where, for a non-parallel heap scan, heapam.c
> >> decides the order in which blocks get scanned, but for a parallel heap
> >> scan, nodeParallelSeqscan.c makes that decision.
> >
> > I think other places also decides about the order/way heapam.c has
> > to scan, example the order in which rows/pages has to traversed is
> > decided at portal/executor layer and the same is passed till heap and
> > in case of index, the scanlimits (heap_setscanlimits()) are decided
> > outside heapam.c and something similar is done for parallel seq scan.
> > In general, the scan is driven by Scandescriptor which is constructed
> > at upper level and there are some API's exposed to derive the scan.
> > If you are not happy with the current way nodeParallelSeqscan has
> > set the scan limits, we can have some form of callback which do the
> > required work and this callback can be called from heapam.c.
>
> I thought about a callback, but what's the benefit of doing that vs.
> hard-coding it in heapam.c?

Basically I want to address your concern of setting scan limit via
sequence scan node, one of the ways could be that pass a callback_function
and callback_state to heap_beginscan which will remember that information
in HeapScanDesc and then use in heap_getnext(), now callback_state will
have info about next page which will be updated by callback_function.

We can remember callback_function and callback_state information in
estate which will be set only by parallel worker which means it won't effect
non-parallel case. I think this will be helpful in future as well where we
want
particular scan or sort to use that information to behave as parallel scan
or
sort.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-02-09 10:52:41 Re: What exactly is our CRC algorithm?
Previous Message Marc Balmer 2015-02-09 10:47:48 Re: For cursors, there is FETCH and MOVE, why no TELL?