Re: Parallel Index Scans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>
Subject: Re: Parallel Index Scans
Date: 2017-01-13 17:36:53
Message-ID: CA+Tgmobqyy9Sb4XqdEJzkqgNDMtSkDwY4eQ7CDH+kOB060QWGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 13, 2017 at 9:28 AM, Anastasia Lubennikova
<a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> I didn't find any case of noticeable performance degradation,
> so set status to "Ready for committer".

The very first hunk of doc changes looks like it makes the whitespace
totally wrong - surely it can't be right to have 0-space indents in C
code.

+ The <literal>index_size</> parameter indicate the size of generic parallel

indicate -> indicates
size of generic -> size of the generic

+ index-type-specific parallel information which will be stored immediatedly

Typo.

+ Initialize the parallel index scan state. It will be used to initialize
+ index-type-specific parallel information which will be stored immediatedly
+ after generic parallel information required for parallel index scans. The
+ required state information will be set in <literal>target</>.
+ </para>
+
+ <para>
+ The <function>aminitparallelscan</> and
<function>amestimateparallelscan</>
+ functions need only be provided if the access method supports
<quote>parallel</>
+ index scans. If it doesn't, the <structfield>aminitparallelscan</> and
+ <structfield>amestimateparallelscan</> fields in its
<structname>IndexAmRoutine</>
+ struct must be set to NULL.
+ </para>

Inconsistent indentation. <quote> seems like a strange choice of tag.

+ /* amestimateparallelscan is optional; assume no-op if not
provided by AM */

The fact that amestimateparallelscan is optional even when parallel
index scans are supported is undocumented. Similarly for the other
functions, which also seem to be optional but not documented as such.
The code and documentation need to match.

+ void *amtarget = (char *) ((void *) target) + offset;

Passing an unaligned pointer to the AM sounds like a recipe for
crashes on obscure platforms that can't tolerate alignment violations,
and possibly bad performance on others. I'd arrange MAXALIGN the size
of the generic structure in index_parallelscan_estimate and
index_parallelscan_initialize. Also, why pass the size of the generic
structure to the AM-specific estimate routine, anyway? It can't
legally return a smaller value, and we can add_size() just as well as
the AM-specific code. Wouldn't it make more sense for the AM-specific
code to return the amount of space that is needed for AM-specific
stuff, and let the generic code deal with the generic stuff?

+ * status - True indicates that the block number returned is valid and scan
+ * is continued or block number is invalid and scan has just begun
+ * or block number is P_NONE and scan is finished. False indicates
+ * that we have reached the end of scan for current
scankeys and for
+ * that we return block number as P_NONE.

It is hard to parse a sentence with that many "and" and "or" clauses,
especially since English, unlike C, does not have strict operator
precedence rules. Perhaps you could reword to make it more clear.
Also, does that survive pgindent with that indentation?

+ BTParallelScanDesc btscan = (BTParallelScanDesc) OffsetToPointer(
+ (void *) scan->parallel_scan,
+ scan->parallel_scan->ps_offset);

You could avoid these uncomfortable line breaks by declaring the
variable on one line and the initializing it on a separate line.

+ SpinLockAcquire(&btscan->ps_mutex);
+ pageStatus = btscan->ps_pageStatus;
+ if (so->arrayKeyCount < btscan->ps_arrayKeyCount)
+ *status = false;
+ else if (pageStatus == BTPARALLEL_DONE)
+ *status = false;
+ else if (pageStatus != BTPARALLEL_ADVANCING)
+ {
+ btscan->ps_pageStatus = BTPARALLEL_ADVANCING;
+ nextPage = btscan->ps_nextPage;
+ exit_loop = true;
+ }
+ SpinLockRelease(&btscan->ps_mutex);

IMHO, this needs comments.

+ * It updates the value of next page that allows parallel scan to move forward
+ * or backward depending on scan direction. It wakes up one of the sleeping
+ * workers.

This construction is commonly used in India but sounds awkward to
other English-speakers, or at least to me. You can either drop the
word "it" and just start with the verb "Updates the value of ..." or
you can replace the first instance of "It" with "This function".
Although actually, I think this whole comment needs rewriting. Maybe
something like "Save information about scan position and wake up next
worker to continue scan."

+ * This must be called when there are no pages left to scan. Notify end of
+ * parallel scan to all the other workers associated with this scan.

Suggest: When there are no pages left to scan, this function should be
called to notify other workers. Otherwise, they might wait forever
for the scan to advance to the next page.

+ if (status == false)

if (!status) is usually preferred for bools. (Multiple instances.)

+#define BTPARALLEL_NOT_INITIALIZED 0x01
+#define BTPARALLEL_ADVANCING 0x02
+#define BTPARALLEL_DONE 0x03
+#define BTPARALLEL_IDLE 0x04

Let's change this to an enum. We can keep the names of the members
as-is, just use typedef enum { ... } instead of #defines.

+#define OffsetToPointer(base, offset)\
+((void *)((char *)base + offset))

Blech. Aside from the bad formatting, this is an awfully generic
thing to stick into relscan.h. I'm not sure we should have it at all,
but certainly not in this file.

+/*
+ * BTParallelScanDescData contains btree specific shared information required
+ * for parallel scan.
+ */
+typedef struct BTParallelScanDescData
+{
+ BlockNumber ps_nextPage; /* latest or next page to be scanned */
+ uint8 ps_pageStatus; /* indicates whether next page is available
+ * for scan. see nbtree.h for possible states
+ * of parallel scan. */
+ int ps_arrayKeyCount; /* count indicating number of array
+ * scan keys processed by parallel
+ * scan */
+ slock_t ps_mutex; /* protects above variables */
+ ConditionVariable cv; /* used to synchronize parallel scan */
+} BTParallelScanDescData;

Why are the states declared a separate header file from the variable
that uses them? Let's put them all in the same place.

Why do all of these fields except for the last one have a ps_ prefix,
but the last one doesn't?

I assume "ps" stands for "parallel scan" but maybe "btps" would be
better since this is btree-specific.

ps_nextPage sometimes contains something other than the next page, so
maybe we should choose a different name, like ps_currentPage or
ps_scanPage.

This is not a totally complete review - there are some things I have
deeper questions about and need to examine more closely - but let's
get the simple stuff tidied up first.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-01-13 18:06:58 Re: UNDO and in-place update
Previous Message Serge Rielau 2017-01-13 17:35:00 Re: Packages: Again