Re: speeding up planning with partitions

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2019-01-12 08:49:39
Message-ID: CAKJS1f-FJDRGfj3c3S7EJhpVmXKCHcSw9K-ccN2hZ-3cvBn3QQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 12 Jan 2019 at 02:00, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> That's an oversight. Fixed by making add_child_rel_equivalences do this:
>
> + cur_ec->ec_relids = bms_difference(cur_ec->ec_relids,
> + parent_rel->relids);
> + cur_ec->ec_relids = bms_add_members(cur_ec->ec_relids,
> + child_rel->relids);
>
> >
> > UPDATE: I see you're likely leaving this alone since you're only doing
> > a shallow copy of the eclasses in
> > adjust_inherited_target_child_root(). It seems like a pretty bad idea
> > to do a shallow copy there.
>
> So, you're talking about this code:
>
> /*
> * Child root should get its own copy of ECs, because they'll be modified
> * to replace parent EC expressions by child expressions in
> * add_child_rel_equivalences.
> */
> subroot->eq_classes = NIL;
> foreach(lc, root->eq_classes)
> {
> EquivalenceClass *ec = lfirst(lc);
> EquivalenceClass *new_ec = makeNode(EquivalenceClass);
>
> memcpy(new_ec, ec, sizeof(EquivalenceClass));
> new_ec->ec_members = list_copy(ec->ec_members);
> subroot->eq_classes = lappend(subroot->eq_classes, new_ec);
> }
>
> Can you say what you think is wrong with this way of making a copy of the ECs?

If you shallow copy with memcpy(new_ec, ec,
sizeof(EquivalenceClass));, then fields such as ec_relids will just
point to the same memory as the parent PlannerInfo's
EquivalenceClasses, so when you do your adjustment (as above) on the
child eclasses, you'll modify memory belonging to the parent. I'd
assume you'd not want to do that since you need to keep the parent
intact at least to make copies for other child rels. You'd have
gotten away with it before since you performed a list_copy() and only
were deleting the parent's EMs with list_delete_cell() which was
modifying the copy, not the original list. Since you were missing the
alteration to ec_relids, then you might not have seen issues with the
shallow copy, but now that you are changing that field, are you not
modifying the ec_relids field that still set in the parent's
PlannerInfo? In this instance, you've sidestepped that issue by using
bms_difference() which creates a copy of the Bitmapset and modifies
that. I think you'd probably see issues if you tried to use
bms_del_members(). I think not doing the deep copy is going to give
other people making changes in this are a hard time in the future.

> > 12. The following comment makes less sense now that you've modified
> > the previous paragraph:
> >
> > + * Fortunately, the UPDATE/DELETE target can never be the nullable side of an
> > + * outer join, so it's OK to generate the plan this way.
> >
> > This text used to refer to:
> >
> > but target inheritance has to be expanded at
> > * the top. The reason is that for UPDATE, each target relation needs a
> > * different targetlist matching its own column set. Fortunately,
> > * the UPDATE/DELETE target can never be the nullable side of an outer join,
> > * so it's OK to generate the plan this way.
> >
> > you no longer describe plan as being expanded from the top rather than
> > at the bottom, which IMO is what "this way" refers to.
>
> To be honest, I never quite understood what that line really means, which
> is why I kept it around. What I do know though is that, even with the
> patched, we still generate subpaths for each child whose targetlist
> patches the child, so I think the part about "this way" that prompted
> someone to write that line still remains. Does that make sense to you?

Not quite. I thought that "OK to generate the plan this way" is
talking about expanding the children at the top of the plan, e.g.

explain (costs off) update listp set b = b + 1 from t where listp.a = t.a;
QUERY PLAN
--------------------------------------
Update on listp
Update on listp1
Update on listp2
-> Merge Join
Merge Cond: (listp1.a = t.a)
-> Sort
Sort Key: listp1.a
-> Seq Scan on listp1
-> Sort
Sort Key: t.a
-> Seq Scan on t
-> Merge Join
Merge Cond: (listp2.a = t.a)
-> Sort
Sort Key: listp2.a
-> Seq Scan on listp2
-> Sort
Sort Key: t.a
-> Seq Scan on t

i.e each non-pruned partition gets joined to the other tables
(expanded from the top of the plan tree). The other tables appear once
for each child target relation.

instead of:

postgres=# explain (costs off) select * from listp inner join t on
listp.a = t.a;
QUERY PLAN
--------------------------------------
Merge Join
Merge Cond: (t.a = listp1.a)
-> Sort
Sort Key: t.a
-> Seq Scan on t
-> Sort
Sort Key: listp1.a
-> Append
-> Seq Scan on listp1
-> Seq Scan on listp2

where the children are expanded from the bottom (or at the leaf side
of the plan tree), since we always plant our trees up-side-down, in
computer science.

In my view, since you removed the part about where the children are
expanded then it does not quite make sense anymore.

Of course, I might not be reading the comment as it was intended.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-01-12 10:42:38 Re: Delay locking partitions during query execution
Previous Message Dean Rasheed 2019-01-12 07:49:22 Re: [HACKERS] PATCH: multivariate histograms and MCV lists