Re: v12.0: ERROR: could not find pathkey item to sort

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: David Rowley <drowley(at)postgresql(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: v12.0: ERROR: could not find pathkey item to sort
Date: 2019-11-05 16:51:20
Message-ID: 10628.1572972680@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> On Sun, Nov 3, 2019 at 4:43 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Also, I thought we should try to put more conditions on whether we
>> invoke add_child_join_rel_equivalences at all. In the attached I did
>> if ((enable_partitionwise_join || enable_partitionwise_aggregate) &&
>> (joinrel->has_eclass_joins ||
>> has_useful_pathkeys(root, parent_joinrel)))
>> but I wonder if we could do more, like checking to see if the parent
>> joinrel is partitioned. Alternatively, maybe it's unnecessary because
>> we won't try to build child joinrels unless these conditions are true?

> Actually, I think we can assert in add_child_rel_equivalences() that
> enable_partitionwise_join is true. Then checking
> enable_partitionwise_aggregate is unnecessary.

After tracing the call sites back a bit further, I agree that we won't
be here in the first place unless enable_partitionwise_join is true,
so the extra tests I proposed are unnecessary. I took them out again.

> I have been thinking maybe add_child_rel_equivalences() doesn't need
> to translate EC members that reference multiple appendrels, because
> there top_parent_relids is always a singleton set, whereas em_relids
> of such expressions is not? Those half-translated expressions are
> useless, only adding to the overhead of scanning ec_members. I'm
> thinking that we should apply this diff:
> - if (bms_overlap(cur_em->em_relids, top_parent_relids))
> + if (bms_is_subset(cur_em->em_relids, top_parent_relids))

Meh, I'm not really convinced. The case where this would be relevant
is an EC generated from something like "WHERE (a.x + b.y) = c.z"
where "a" is partitioned. It's possible that we'd never have a use
for a sort key corresponding to "a_child.x + b.y", but I think that's
not obvious, and probably short-sighted. Anyway such EC members are
pretty rare in the first place, so we're not going to win much
performance by trying to optimize them.

Anyway, I've pushed the fix for Justin's problem to v12 and HEAD.
The problem with poor planning of multiway joins that you mentioned
in the other thread remains open, but I imagine the patches you
posted there are going to need rebasing over this commit, so I
set that CF entry to Waiting On Author.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-11-05 17:00:42 Re: d25ea01275 and partitionwise join
Previous Message rtorre 2019-11-05 16:49:27 Re: [Proposal] Arbitrary queries in postgres_fdw