Re: Parallel Index Scans

From: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Parallel Index Scans
Date: 2016-12-21 15:16:52
Message-ID: 20161221151652.25844.42328.pgcf@coridan.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

Hi, thank you for the patch.
Results are very promising. Do you see any drawbacks of this feature or something that requires more testing?
I'm willing to oo a review. I hadn't do benchmarks yet, but I've read the patch and here are some
notes and questions about it.

I saw the discussion about parameters in the thread above. And I agree that we'd better concentrate
on the patch itself and add them later if necessary.

1. Can't we simply use "if (scan->parallel_scan != NULL)" instead of xs_temp_snap flag?

+ if (scan->xs_temp_snap)
+ UnregisterSnapshot(scan->xs_snapshot);

I must say that I'm quite new with all this parallel stuff. If you give me a link,
where to read about snapshots for parallel workers, my review will be more helpful.
Anyway, it would be great to have more comments about it in the code.

2. Don't you mind to rename 'amestimateparallelscan' to let's say 'amparallelscan_spacerequired'
or something like this? As far as I understand there is nothing to estimate, we know this size
for sure. I guess that you've chosen this name because of 'heap_parallelscan_estimate'.
But now it looks similar to 'amestimate' which refers to indexscan cost for optimizer.
That leads to the next question.

3. Are there any changes in cost estimation? I didn't find related changes in the patch.
Parallel scan is expected to be faster and optimizer definitely should know that.

4. + uint8 ps_pageStatus; /* state of scan, see below */
There is no desciption below. I'd make the comment more helpful:
/* state of scan. See possible flags values in nbtree.h */
And why do you call it pageStatus? What does it have to do with page?

5. Comment for _bt_parallel_seize() says:
"False indicates that we have reached the end of scan for
current scankeys and for that we return block number as P_NONE."

What is the reason to check (blkno == P_NONE) after checking (status == false)
in _bt_first() (see code below)? If comment is correct
we'll never reach _bt_parallel_done()

+ blkno = _bt_parallel_seize(scan, &status);
+ if (status == false)
+ {
+ BTScanPosInvalidate(so->currPos);
+ return false;
+ }
+ else if (blkno == P_NONE)
+ {
+ _bt_parallel_done(scan);
+ BTScanPosInvalidate(so->currPos);
+ return false;
+ }

6. To avoid code duplication, I would wrap this into the function

+ /* initialize moreLeft/moreRight appropriately for scan direction */
+ if (ScanDirectionIsForward(dir))
+ {
+ so->currPos.moreLeft = false;
+ so->currPos.moreRight = true;
+ }
+ else
+ {
+ so->currPos.moreLeft = true;
+ so->currPos.moreRight = false;
+ }
+ so->numKilled = 0; /* just paranoia */
+ so->markItemIndex = -1; /* ditto */

And after that we can also get rid of _bt_parallel_readpage() which only
bring another level of indirection to the code.

7. Just a couple of typos I've noticed:

* Below flags are used indicate the state of parallel scan.
* Below flags are used TO indicate the state of parallel scan.

* On success, release lock and pin on buffer on success.
* On success release lock and pin on buffer.

8. I didn't find a description of the feature in documentation.
Probably we need to add a paragraph to the "Parallel Query" chapter.

I will send another review of performance until the end of the week.

The new status of this patch is: Waiting on Author

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2016-12-21 15:29:54 Re: pg_background contrib module proposal
Previous Message Robert Haas 2016-12-21 15:09:51 Re: Proposal : Parallel Merge Join