Re: Parallel Index Scans

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>
Subject: Re: Parallel Index Scans
Date: 2017-01-18 13:03:18
Message-ID: CAA4eK1+5J+2SJONcYPr=zxo6Ku50b9e9aTXjEAdyOHK0eSD8_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 17, 2017 at 11:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Jan 16, 2017 at 7:11 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>
> WAIT_EVENT_PARALLEL_INDEX_SCAN is in fact btree-specific. There's no
> guarantee that any other AMs the implement parallel index scans will
> use that wait event, and they might have others instead. I would make
> it a lot more specific, like WAIT_EVENT_BTREE_PAGENUMBER. (Waiting
> for the page number needed to continue a parallel btree scan to become
> available.)
>

WAIT_EVENT_BTREE_PAGENUMBER - NUMBER sounds slightly inconvenient. How
about just WAIT_EVENT_BTREE_PAGE? We can keep the description as
suggested by you?

> Why do we need separate AM support for index_parallelrescan()? Why
> can't index_rescan() cover that case?

The reason is that sometime index_rescan is called when we have to
just update runtime scankeys in index and we don't want to reset
parallel scan for that. Refer ExecReScanIndexScan() changes in patch
parallel_index_opt_exec_support_v4. Rescan is called from below place
during index scan.

ExecIndexScan(IndexScanState *node)
{
/*
* If we have runtime keys and they've not already been set up, do it now.
*/
if (node->iss_NumRuntimeKeys != 0 && !node->iss_RuntimeKeysReady)
ExecReScan((PlanState *) node);

> If the AM-specific method is
> getting the IndexScanDesc, it can see for itself whether it is a
> parallel scan.
>

I think if we want to do that way then we need to pass some additional
information related to runtime scan keys in index_rescan method and
then probably to amspecific rescan method. That sounds scary.

>
> I think it's a bad idea to add a ParallelIndexScanDesc argument to
> index_beginscan(). That function is used in lots of places, and
> somebody might think that they are allowed to actually pass a non-NULL
> value there, which they aren't: they must go through
> index_beginscan_parallel. I think that the additional argument should
> only be added to index_beginscan_internal, and
> index_beginscan_parallel should remain unchanged.
>

If we go that way then we need to set few parameters like heapRelation
and xs_snapshot in index_beginscan_parallel as we are doing in
index_beginscan. Does going that way sound better to you?

> Either that, or get
> rid of index_beginscan_parallel altogether and have everyone use
> index_beginscan directly, and put the snapshot-restore logic in that
> function.
>

I think there is value in retaining index_beginscan_parallel as that
is parallel to heap_beginscan_parallel.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-01-18 13:21:03 Re: Re: Clarifying "server starting" messaging in pg_ctl start without --wait
Previous Message Tom Lane 2017-01-18 12:38:03 Re: [COMMITTERS] pgsql: Generate fmgr prototypes automatically