From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Seq Scan |
Date: | 2015-10-05 12:20:38 |
Message-ID: | CAA4eK1+HYqsH5dU1LtwRXP0AEb_fAxcgF2_sQmw02kQb5eHVDQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Oct 4, 2015 at 10:21 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> On Sat, Oct 3, 2015 at 11:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> >
> > On Fri, Oct 2, 2015 at 11:44 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> > > On Thu, Oct 1, 2015 at 7:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> > >> Does heap_parallelscan_nextpage really need a pscan_finished output
> > >> parameter, or can it just return InvalidBlockNumber to indicate end
of
> > >> scan? I think the latter can be done and would be cleaner.
> > >
> > > I think we can do that way as well, however we have to keep the check
> > > of page == InvalidBlockNumber at all the callers to indicate finish
> > > of scan which made me write the function as it exists in patch.
However,
> > > I don't mind changing it the way you have suggested if you feel that
would
> > > be cleaner.
> >
> > I think it would. I mean, you just end up testing the other thing
instead.
> >
>
> No issues, will change in next version of patch.
>
Changed as per suggestion.
> > >> I think that heap_parallelscan_initialize() should taken an
> > >> additional Boolean argument, allow_sync. It should decide whether to
> > >> actually perform a syncscan using the logic from initscan(), and then
> > >> it should store a phs_syncscan flag into the ParallelHeapScanDesc.
> > >> heap_beginscan_internal should set rs_syncscan based on phs_syncscan,
> > >> regardless of anything else.
> > >
> > > I think this will ensure that any future change in this area won't
break the
> > > guarantee for sync scans for parallel workers, is that the reason you
> > > prefer to implement this function in the way suggested by you?
> >
> > Basically. It seems pretty fragile the way you have it now.
> >
>
> Okay, in that case I will change it as per your suggestion.
>
Changed as per suggestion.
> > >> I think heap_parallel_rescan() is an unworkable API. When rescanning
> > >> a synchronized scan, the existing logic keeps the original
start-block
> > >> so that the scan's results are reproducible,
> > >
> > > It seems from the code comments in initscan, the reason for keeping
> > > previous startblock is to allow rewinding the cursor which doesn't
hold for
> > > parallel scan. We might or might not want to support such cases with
> > > parallel query, but even if we want to there is a way we can do with
> > > current logic (as mentioned in one of my replies below).
> >
> > You don't need to rewind a cursor; you just need to restart the scan.
> > So for example if we were on the inner side of a NestLoop, this would
> > be a real issue.
> >
>
> Sorry, but I am not able to see the exact issue you have in mind for
NestLoop,
> if we don't preserve the start block position for parallel scan.
For now, I have fixed this by not preserving the startblock incase of rescan
for parallel scan. Note that, I have created a separate patch
(parallel_seqscan_heaprescan_v1.patch) for support of rescan (for parallel
scan).
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
parallel_seqscan_heapscan_v2.patch | application/octet-stream | 16.2 KB |
parallel_seqscan_heaprescan_v1.patch | application/octet-stream | 3.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2015-10-05 13:02:41 | Use EVP API pgcrypto encryption, dropping support for OpenSSL 0.9.6 and older |
Previous Message | Stephen Frost | 2015-10-05 12:17:18 | Re: ON CONFLICT issues around whole row vars, |