Re: Server crashed with TRAP: FailedAssertion("!(parallel_workers > 0)" when partitionwise_aggregate true.

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Server crashed with TRAP: FailedAssertion("!(parallel_workers > 0)" when partitionwise_aggregate true.
Date: 2018-06-22 04:05:15
Message-ID: CAM2+6=Up=LtjZcrnNQhuwpOJwyVXTnyqTxy5Xg1c1=1z6JkmGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 20, 2018 at 7:11 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> On Tue, Jun 19, 2018 at 2:13 PM, Jeevan Chalke
> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> >
> >
> > In the reported testcase, parallel_workers is set to 0 for all partition
> > (child) relations. Which means partial parallel paths are not possible
> for
> > child rels. However, the parent can have partial parallel paths. Thus,
> when
> > we have a full partitionwise aggregate possible (as the group by clause
> > matches with the partition key), we end-up in a situation where we do
> create
> > a partially_grouped_rel for the parent but there won't be any
> > partially_grouped_live_children.
> >
> > The current code was calling add_paths_to_append_rel() without making
> sure
> > of any live children present or not (sorry, it was my fault). This
> causes an
> > Assertion failure in add_paths_to_append_rel() as this function assumes
> that
> > it will have non-NIL live_childrels list.
> >
>
> Thanks for the detailed explanation.
>
> > I have fixed partitionwise aggregate code which is calling
> > add_paths_to_append_rel() by checking the live children list correctly.
> And
> > for robustness, I have also added an Assert() in
> add_paths_to_append_rel().
> >
> > I have verified the callers of add_paths_to_append_rel() and except one,
> all
> > are calling it by making sure that they have a non-NIL live children
> list.
>
> I actually thought about moving if(live_childrel != NIL) test inside
> this function, but then every caller is doing something different when
> that condition occurs. So doesn't help much.
>
> > The one which is calling add_paths_to_append_rel() directly is
> > set_append_rel_pathlist(). And I think, at that place, we will never
> have an
> > empty live children list,
>
> Yes, I think so too. That's because set_append_rel_size() should have
> marked such a parent as dummy and subsequent set_rel_pathlist() would
> not create any paths for it.
>
> Here are some review comments.
>
> + /* We should end-up here only when we have few live child rels. */
>
> I think the right wording is " ... we have at least one child." I was
> actually
> thinking if we should just return from here when live_children == NIL. But
> then
> we will loose an opportunity to catch a bug earlier than set_cheapest().
>

Done.

>
> + * However, if there are no live children, then we cannot create any
> append
> + * path.
>
> I think "no live children" is kind of misleading here. We usually use that
> term
> to mean at least one non-dummy child. But in this case there is no child at
> all, so we might want to better describe this situation. May be also
> explain
> when this condition can happen.
>

Done. Tried re-phrasing it. Please check.

> + if (patype == PARTITIONWISE_AGGREGATE_FULL && grouped_live_children
> != NIL)
>
> I think for this to happen, the parent grouped relation and the underlying
> scan
> itself should be dummy. So, would an Assert be better? I think we have
> discussed this earlier, but I can not spot the mail.
>

Yep, Assert() will be better. Done.

>
> +-- Test when partition tables has parallel_workers = 0 but not the parent
>
> Better be worded as "Test when parent can produce parallel paths but not
> any of
> its children". parallel_worker = 0 is just a means to test that. Although
> the
> EXPLAIN output below doesn't really reflect that parent can have parallel
> plans. Is it possible to create a scenario to show that.
>

Added test.

>
> I see that you have posted some newer versions of this patch, but I
> think those still need to address these comments.
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>

Thanks
--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Make-sure-that-we-have-live-children-before-we-appen.patch text/x-patch 11.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nico Williams 2018-06-22 04:23:38 Threat models for DB cryptography (Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key) Management Service (KMS)
Previous Message Pavan Deolasee 2018-06-22 04:02:19 Re: PATCH: backtraces for error messages