| From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
|---|---|
| To: | Richard Guo <guofenglinux(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, arne(dot)roland(at)malkut(dot)net, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> |
| Subject: | Re: apply_scanjoin_target_to_paths and partitionwise join |
| Date: | 2025-10-29 12:36:05 |
| Message-ID: | CA+TgmobVFOfUiei9BDu4P8oNtmNP3kgc8vAOACs+jtApKorsqA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Oct 29, 2025 at 5:21 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> I don't think the rewrite of unique-ification requires any adjustment
> to this patch. I ran Q1 on v18, which does not include the
> unique-ification changes, and here is what I observed: without
> Ashutosh's patch, it performs a full partitionwise join; with the
> patch, it performs one join partitionwise and the other
> non-partitionwise. The costs of the unpatched versus patched versions
> on v18 are 2286.11 and 1420.40, respectively, indicating that
> Ashutosh's patch reduces the cost by a large amount. This matches
> your observation exactly. I think this suggests that we can rule out
> the interference from the unique-ification changes.
This testing methodology makes some sense to me, but it seems here you
have tested Q1 here, which was the good case, rather than Q3, which
was the troubling one.
> However, that reasoning is valid only when all of the existing paths
> are Appends of Scans or Joins. It does not hold for a partitioned
> join relation, which can have paths that are Joins of Appends.
> Therefore, I think there's something wrong with the current logic, and
> we may need to do something about it.
I agree.
> Maybe we could address this by discarding all existing partitionwise
> paths and relying on apply_scanjoin_target_to_paths() to rebuild these
> Append paths after applying the target to all partitions?
I'm quite afraid of just deleting items out of the pathlist, because
the pathlist has to satisfy a complicated set of invariants and it is
unclear to me that we wouldn't end up violating some of them. I think
we could mitigate the danger if we re-added the old paths we want to
keep. That is, set some variable to the pathlist, set the pathlist to
NIL, and then iterate over the saved pathlist and call add_path() on
each path we wish to keep. That would produce effects similar to
modeling the final scan/join target as a separate RelOptInfo, which is
what this code is trying to do without actually creating a separate
RelOptInfo. Maybe that's overly cautious and would always produce the
same results as deleting from the path list, but I'm not sure.
Deleting from the pathlist is hypothetically allowed (c.f. comments
for set_join_pathlist_hook) but I wonder whether anyone has actually
been able to make it work in real life.
> Another concern I have, though I'm not entirely sure, is about
> comparing the costs between a partitionwise join path and a
> non-partitionwise join path. It seems to me that their costs are
> computed in very different ways, so I'm not sure whether the costs are
> truly comparable. So I suspect that, with the patch, there may be
> cases where a lower estimated cost does not necessarily translate to
> shorter execution time. However, I'm not sure what to do about this.
I think that's a general risk of using a cost model to choose plans,
and in that sense I believe it is something we neither can nor should
try to fix, here or in general. What I find more concerning about this
case than, in certain cases, the costs of a partititionwise path and a
non-partitionwise path are extremely close together for very
understandable reasons. Hence, the choice will be unpredictable, which
I fear might create problems for users. I'm not sure what to do about,
though. It's tempting to think that we don't need to consider both a
MergeJoin-of-Appends and an Append-of-MergeJoins, which seems to be
the case where you get almost-identical costs, but that's actually
only true when there's no sorting happening under the merge-joins.
Even if that were no issue, it's unclear how we could reasonably avoid
considering both of those possibilities given the code structure.
--
Robert Haas
EDB: http://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Robert Haas | 2025-10-29 12:47:45 | Re: apply_scanjoin_target_to_paths and partitionwise join |
| Previous Message | Joel Jacobson | 2025-10-29 12:30:12 | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue |