Re: Parallel bitmap heap scan

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap heap scan
Date: 2017-02-08 02:37:30
Message-ID: CA+TgmoaNTPfZ9bJuL7P7CfWtKhAUccftLzSLUjgsOZe+AxYezA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 5, 2017 at 9:48 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> Here are the latest version of the patch, which address all the
> comments given by Robert.

I think it would be useful to break the remaining patch into two
parts, one that enhances the tidbitmap.c API and another that uses
that to implement Parallel Bitmap Heap Scan. In this review I'm going
to focus on the changes to tidbitmap.c and tidbitmap.h.

+extern void tbm_restore_shared_members(TIDBitmap *tbm,
+ ParallelTBMInfo * stbm);
+extern void tbm_update_shared_members(TIDBitmap *tbm,
+ ParallelTBMInfo * parallel_tbm);
+void tbm_set_parallel(TIDBitmap *tbm, void *area);
+extern Size tbm_estimate_parallel_tbminfo(Size size);
+TIDBitmap *tbm_attach(ParallelTBMInfo * parallel_tbm, void *area);
+TBMIterator *tbm_begin_shared_iterate(TIDBitmap *tbm,
+ ParallelTBMInfo * parallel_tbm, bool prefetch);
+TBMIterateResult *tbm_shared_iterate(TBMIterator *iterator);
+void tbm_init_shared_iterator(ParallelTBMInfo * tbminfo);

There are several cosmetic problems here. You may have noticed that
all function prototypes in PostgreSQL header files are explicitly
declared extern; yours should be, too. Also, there is extra
whitespace before some of the variable names here, like
"ParallelTBMInfo * tbminfo" instead of "ParallelTBMInfo *tbminfo". If
that is due to pgindent, the solution is to add the typedef names to
your local typedef list. Also, tbm_restore_shared_members doesn't
actually exist.

More broadly, this seems like an extremely complicated API. Even
ignoring the function that doesn't exist, that's still 7 different
functions just for shared iteration, which seems like a lot. I
suggest the following API:

1. Add an additional argument to tbm_create(), dsa_area *dsa. If it's
NULL, we allocate a backend-private memory for the hash entries as
normal. If it's true, we use the dsa_area to store the hash entries,
using the infrastructure added by your 0002 and revised in
c3c4f6e1740be256178cd7551d5b8a7677159b74. (You can use a flag in the
BitmapIndexScan and BitmapOr to decide whether to pass NULL or
es_query_dsa.)

2. Add a new function dsa_pointer tbm_prepare_shared_iterate(TIDBitmap
*tbm) which allocates shared iteration arrays using the DSA passed to
tbm_create() and returns a dsa_pointer to the result. Arrange this so
that if it's called more than once, each call returns a separate
iterator, so that you can call it once to get the main iterator and a
second time for the prefetch iterator, but have both of those point to
the same underlying iteration arrays.

3. Add a new function TBMSharedIterator
*tbm_attach_shared_iterate(dsa_area *dsa, dsa_pointer) which is called
once per backend and gets passed the dsa_pointer from the previous
step.

4. Add a new function TBMIterateResult
*tbm_shared_iterate(TBMSharedIterator *) to fetch the next result.

As compared with your proposal, this involves only 3 functions instead
of 7 (plus one new argument to another function), and I think it's a
lot clearer what those functions are actually doing. You don't need
tbm_estimate_parallel_tbminfo() any more because the data being passed
from one backend to another is precisely a dsa_pointer, and the bitmap
scan can just leave space for that in its own estimator function. You
don't need tbm_update_shared_members() separately from
tbm_begin_shared_iterate() separately from tbm_init_shared_iterator()
because tbm_prepare_shared_iterate() can do all that stuff. You don't
need tbm_set_parallel() because the additional argument to
tbm_create() takes care of that need.

The way you've got ParallelTBMInfo set up right now doesn't respect
the separation of concerns between nodeBitmapHeapscan.c and
tidbitmap.c properly. tidbitmap.c shouldn't know that the caller
intends on creating two iterators, one of which is for prefetching.
The design above hopefully addresses that: each call to
tbm_prepare_shared_iterate returns a dsa_pointer to a separate chunk
of shared iterator state, but tidbitmap.c does not need to know how
many times that will be called.

Apart from the above, this patch will need a rebase over today's
commits, and please make sure all functions you add have high-quality
header comments.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2017-02-08 02:50:06 Re: Idea on how to simplify comparing two sets
Previous Message Jonathan S. Katz 2017-02-08 01:52:56 Re: Press Release Draft - 2016-02-09 Cumulative Update