Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-10-13 06:45:35
Message-ID: CAA4eK1LcHMcOcfW0a-tWXFKpFqkS-JrW5xpMQGfNPOioR3wW9g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 12, 2015 at 5:15 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> Right, it should initialize parallel scan properly even for non-synchronized
> scans. Fixed the issue in attached patch. Rebased heap rescan is
> attached as well.
>

Attached is the rebased patch for partial seqscan support. The major
change in this patch is to prohibit generation of a parallel path for a
relation if its quals contain restricted functions and/or an initplan/subplan.
Also, as the Gather node is itself not a projection-capable node, the patch
adds a Result node on top of it if the target list contains any expression.
I think this will take care of the cases where the target list contains
any parallel-unsafe expressions (like restricted functions and/or
initplans/subplans), as those won't be pushed to backend workers.
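
To illustrate the Gather/Result arrangement described above, here is a
minimal standalone sketch. It is not code from the patch: the ToyPlan
struct, the toy node kinds and both helper functions are hypothetical names
made up for this example, and only model the rule "if the top node cannot
project and the target list has an expression, put a Result on top".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy plan-node kinds; illustrative only, not PostgreSQL's node tags. */
typedef enum { TOY_SEQSCAN, TOY_GATHER, TOY_RESULT } ToyNodeKind;

typedef struct ToyPlan
{
    ToyNodeKind     kind;
    struct ToyPlan *child;      /* single input plan, NULL for a leaf */
} ToyPlan;

/* Assumption for this sketch: only Gather is unable to project. */
static bool
toy_is_projection_capable(const ToyPlan *plan)
{
    return plan->kind != TOY_GATHER;
}

/*
 * Return the node that should sit at the top of the plan.  If the target
 * list contains an expression and 'top' cannot project, fill in the
 * caller-provided 'result_storage' as a Result node above 'top' and
 * return it; otherwise return 'top' unchanged.
 */
static ToyPlan *
toy_add_result_if_needed(ToyPlan *top, bool tlist_has_expr,
                         ToyPlan *result_storage)
{
    assert(top != NULL && result_storage != NULL);

    if (!tlist_has_expr || toy_is_projection_capable(top))
        return top;

    result_storage->kind = TOY_RESULT;
    result_storage->child = top;
    return result_storage;
}
```

With this shape the expressions are evaluated in the Result node, i.e. in
the backend that runs Gather, rather than in the nodes below Gather that
run in workers.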

Other options I have considered for the target list are:
1. Assess the tlist passed to query_planner() to see if it contains any
parallel-unsafe expression; if so, don't generate any parallel path for that
subquery. Though this idea deals with the prohibition at the subquery level,
I still think it is not the best way, as the subquery could contain a join
and for some of the relations participating in the join we could have
parallel paths, but doing it this way will restrict parallel paths for all
the relations participating in the subquery.

2. To handle the join case in a subquery, we can pass the tlist passed to
query_planner() down to create_parallelscan_paths() and then check whether
any target entry contains an unsafe expression; if that expression has a Var
that belongs to the current relation, then don't allow a parallel path, else
allow it. Doing it this way, we might not be able to catch cases like the one
below, where the expression in the target list doesn't belong to any relation.

select c1, pg_restricted() from t1;
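
The difference between the two options above can be sketched with a toy
expression tree. Again, this is only an illustration and not code from the
patch: ToyExpr and all toy_* functions are hypothetical, each function node
has at most one argument to keep the example short, and relation t1 is
assumed to have relid 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_VAR, TOY_FUNC } ToyExprKind;

typedef struct ToyExpr
{
    ToyExprKind           kind;
    int                   relid;            /* TOY_VAR: owning relation */
    bool                  parallel_unsafe;  /* TOY_FUNC: restricted/unsafe */
    const struct ToyExpr *arg;              /* TOY_FUNC: optional argument */
} ToyExpr;

/* True if the expression tree contains a parallel-unsafe function. */
static bool
toy_contains_unsafe(const ToyExpr *e)
{
    if (e == NULL)
        return false;
    if (e->kind == TOY_FUNC && e->parallel_unsafe)
        return true;
    return toy_contains_unsafe(e->arg);
}

/* True if the expression tree contains a Var of the given relation. */
static bool
toy_references_rel(const ToyExpr *e, int relid)
{
    if (e == NULL)
        return false;
    if (e->kind == TOY_VAR)
        return e->relid == relid;
    return toy_references_rel(e->arg, relid);
}

/* Option 1: any unsafe target entry disables parallel paths for the
 * whole subquery. */
static bool
toy_subquery_allows_parallel(const ToyExpr *const *tlist, size_t n)
{
    assert(tlist != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (toy_contains_unsafe(tlist[i]))
            return false;
    return true;
}

/* Option 2: an unsafe target entry disables the parallel path only for
 * a relation whose Var appears in that entry. */
static bool
toy_rel_allows_parallel(const ToyExpr *const *tlist, size_t n, int relid)
{
    assert(tlist != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (toy_contains_unsafe(tlist[i]) &&
            toy_references_rel(tlist[i], relid))
            return false;
    return true;
}
```

For a target list modelled on the query above (a Var of t1 plus a
restricted function with no arguments), toy_subquery_allows_parallel()
returns false, while toy_rel_allows_parallel() for t1 returns true, which
is exactly the case option 2 fails to catch.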

We can think of other ways to handle a target list containing parallel-unsafe
expressions, if what is done in the patch is not sufficient.

We might want to support initplans/subplans and restricted function
evaluation once the required infrastructure to support the same is
in-place. I think those could be done as separate patches.

Notes -
1. This eventually needs to be rebased on top of the bug-fixes posted by
Robert for parallelism [1]. One temporary fix has been done
in ExecReScanGather() to allow rescan of the Gather node; the actual fix
will be incorporated by the bug-fix patches.

2. Ran pgindent on the changed files, so you might see some indentation
changes which are not directly related to this patch, but come from the
previous parallel seq scan work, especially in execParallel.c.

3. Apply this patch on top of the parallel heap scan patches [2].

[1] - http://www.postgresql.org/message-id/CA+TgmoapgKdy_Z0W9mHqZcGSo2t_t-4_V36DXaKim+X_fYp0oQ@mail.gmail.com
[2] - http://www.postgresql.org/message-id/CAA4eK1KCymW+-vJuAgSxf-s4K-0X3dBxDcw5Hem+qSgergxY4A@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: parallel_seqscan_partialseqscan_v20.patch (application/octet-stream, 56.7 KB)
