Re: Parallel bitmap heap scan

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap heap scan
Date: 2017-02-27 05:00:49
Message-ID: CAFiTN-t2CpLfcAu9SoRkk++MHpYe7wdfKqivVTKeTNOpdywySg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 26, 2017 at 9:26 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

Thanks for the review; I will work on these. There are a few comments on
which I need your suggestions.

> tbm_free_shared_area because the two iterators share one copy of the
> underlying iteration arrays, and the TBM code isn't smart enough to
> avoid freeing them twice. You're going to have to come up with a
> better solution to that problem; nodeBitmapHeapScan.c shouldn't know
> about the way the underlying storage details are managed. (Maybe you
> need to reference-count the iterator arrays?)

Yeah, I agree the current approach doesn't look very clean. At the moment
these arrays are plain integer arrays; maybe we can use the first slot of
the array for the reference count, or convert them to a structure that
holds a reference count plus the integer array. What do you suggest?

>
> + if (node->inited)
> + goto start_iterate;
>
> My first programming teacher told me not to use goto. I've
> occasionally violated that rule, but I need a better reason than you
> have here. It looks very easy to avoid.

Yes, this can be avoided; I was just trying to get rid of multi-level
if-nesting and ended up with the "goto".

>
> + pbms_set_parallel(outerPlanState(node));
>
> I think this should be a flag in the plan, and the planner should set
> it correctly, instead of having it be a flag in the executor that the
> executor sets. Also, the flag itself should really be called
> something that involves the word "shared" rather than "parallel",
> because the bitmap will not be created in parallel, but it will be
> shared.

Earlier, I thought it would not be a good idea to set that flag in the
BitmapIndexScan path, because the same path can be used either under a
ParallelBitmapHeapPath or under a normal BitmapHeapPath. But after giving
it some more thought, I realised we can do this in create_plan, so I will
change it.

>
> Have you checked whether this patch causes any regression in the
> non-parallel case? It adds a bunch of "if" statements, so it might.
> Hopefully not, but it needs to be carefully tested.

During initial patch development I tested this aspect as well, but never
published the results. I will rerun those tests along with my next patch
and post the numbers.
>
> pbms_is_leader() looks horrifically inefficient. Every worker will
> reacquire the spinlock for every tuple. You should only have to enter
> this spinlock-acquiring loop for the first tuple. After that, either
> you became the leader, did the initialization, and set the state to
> PBM_FINISHED, or you waited until the pre-existing leader did the
> same. You should have a backend-local flag that keeps you from
> touching the spinlock for every tuple. I wouldn't be surprised if
> fixing this nets a noticeable performance increase for this patch at
> high worker counts.

I think there is some confusion here. If you notice, the code below
avoids calling pbms_is_leader for every tuple: it is called only the
first time. After that, node->inited will be true and subsequent calls
will jump directly to start_iterate. Am I missing something?

> + if (node->inited)
> + goto start_iterate;

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
