Re: Parallel Index Scans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Parallel Index Scans
Date: 2016-12-21 15:38:13
Message-ID: CA+TgmoayUWCSxgH-Fg6hLSFqYh=ZbW6ukENnvx7P0WsZHjMY0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for reviewing! A few quick thoughts from me since I write a
bunch of the design for this patch.

On Wed, Dec 21, 2016 at 10:16 AM, Anastasia Lubennikova
<lubennikovaav(at)gmail(dot)com> wrote:
> 1. Can't we simply use "if (scan->parallel_scan != NULL)" instead of xs_temp_snap flag?
>
> + if (scan->xs_temp_snap)
> + UnregisterSnapshot(scan->xs_snapshot);
>
> I must say that I'm quite new with all this parallel stuff. If you give me a link,
> where to read about snapshots for parallel workers, my review will be more helpful.
> Anyway, it would be great to have more comments about it in the code.

I suspect it would be better to keep those two things formally
separate, even though they may always be the same right now.

> 2. Don't you mind to rename 'amestimateparallelscan' to let's say 'amparallelscan_spacerequired'
> or something like this? As far as I understand there is nothing to estimate, we know this size
> for sure. I guess that you've chosen this name because of 'heap_parallelscan_estimate'.
> But now it looks similar to 'amestimate' which refers to indexscan cost for optimizer.
> That leads to the next question.

"estimate" is being used this way quite widely now, in places like
ExecParallelEstimate. So if we're going to change the terminology we
should do it broadly.

> 3. Are there any changes in cost estimation? I didn't find related changes in the patch.
> Parallel scan is expected to be faster and optimizer definitely should know that.

Generally the way that's reflected in the optimized is by having the
parallel scan have a lower row count. See cost_seqscan() for an
example.

In general, you'll probably find a lot of parallels between this patch
and ee7ca559fcf404f9a3bd99da85c8f4ea9fbc2e92, which is probably a good
thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-12-21 15:42:18 Re: pg_background contrib module proposal
Previous Message Robert Haas 2016-12-21 15:31:18 Re: [COMMITTERS] pgsql: Implement table partitioning.