| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | James Hunter <james(dot)hunter(dot)pg(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: should we have a fast-path planning for OLTP starjoins? |
| Date: | 2025-11-08 20:14:54 |
| Message-ID: | 47930f36-6373-45aa-8d59-9f7b990a2df9@vondra.me |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 9/23/25 21:46, Tom Lane wrote:
> [ sorry for ridiculously slow response to this ]
>
> Tomas Vondra <tomas(at)vondra(dot)me> writes:
>> Here's a patch trying to do it more like this - by manipulating the
>> lists describing the join problems, before it's passed the the actual
>> join search algorithm (which is where the PoC patch did this).
>> I wonder if this is roughly the place where you imagined this would be
>> done, or if you envision some other issue with this approach.
>
> Cool. This is proof-of-concept that manipulating the joinlist can
> do what we need done. So we can move on to what heuristics we need
> to use.
>
Thanks. Good to hear this seems like a reasonable place.
>> I initially tried to manipulate the joinlist much earlier - pretty much
>> right at the end of deconstruct_jointree. But that turned out to be way
>> too early. To identify dimensions etc. we need to check stuff about
>> foreign keys, join clauses, ... and that's not available that early.
>
>> So I think this needs to happen much later in query_planner(), and the
>> patch does it right before the make_one_rel() call. Maybe that's too
>> late, but it needs to happen after match_foreign_keys_to_quals(), as it
>> relies on some of the FK info built by that call. Maybe we could call
>> match_foreign_keys_to_quals() earlier, but I don't quite see any
>> benefits of doing that ...
>
> I don't have a problem with doing it where you did it, but the comment
> should explain the placement. What you do have in the comment mostly
> belongs with the code, too; it's not the business of the caller. So
> in planmain.c something like
>
> + /*
> + * Try to simplify the join search problem for starjoin-like joins.
> + * This step relies on info about FK relationships, so we can't do it
> + * till after match_foreign_keys_to_quals().
> + */
>
> would be more appropriate IMO.
I agree. I've adopted your wording, and moved the original comment to
starjoin_adjust_joins (with some changes).
> I'd be slightly inclined to put the GUC test there, too:
>
> + if (enable_starjoin_join_search)
> + joinlist = starjoin_adjust_joins(root, joinlist);
>
I'm not sure I like this very mcuh. No other call in query_planner()
does it like that. Most don't have such GUC, but at least
remove_useless_self_joins does, and it still doesn't check it here.
>
> I agree that you need to worry about join order restrictions,
> and that it's not immediately clear how to do that. join_is_legal
> would work if we could call it, but the problem is that at this
> stage we'll only have RelOptInfos for base rels not join rels.
> If we have a joinlist (A B C D) and we are considering treating
> C as a dimension table, then the questions we have to ask are:
> (a) is it okay to build the (A B D) join without C?
> (b) is it okay to join (A B D) to C?
>
> In this simple case, I think (b) must be true if (a) is, but
> I'm not quite sure that that's so in more complex cases with
> multiple candidates for dimension tables. In any case,
> join_is_legal won't help us if we don't have join RelOptInfos.
>
> I'm inclined to start by using has_join_restriction: if that
> says "false" for a candidate dimension table, it should be safe
> to postpone the join to the dimension table. We might be able
> to refine that later.
>
Thanks. I agree has_join_restriction seems like a good start, I'll give
that a try in the next patch version.
When thinking about this, I realized the has_join_restriction() is only
ever used in join_search_one_level(), i.e. when dealing with each small
join order problem. Doesn't this mean the deconstructed jointree must
already consider the restrictions in some way? I don't see any explicit
mentions of such join order restrictions in deconstruct_recurse. It must
not violate any ordering restrictions by splitting the joins in a
"wrong" way, right? If I set join_collapse_limit=1 it still needs to
satisfy all the rules.
I was wondering if maybe we could piggy-back on that, somehow. But maybe
that's not very practical, and has_join_restriction() is the way to go.
It's been a while since I looked at this patch, so it's possible I
already concluded that wouldn't work, and forgot about it :-/
>> The second example (create-2/select-2) is quite different, in that it's
>> nor a starjoin schema. It still joins one "main" table with multiple
>> "dimensions", but the FKs go in the other direction (to a single column
>> in the main table). But it has a very similar bottleneck - the order of
>> the joins is expensive, but it often does not matter very much, because
>> the query matches just one row anyway. And even if it returns more rows,
>> isn't the join order determined just by the selectivity of each join?
>> Maybe the starjoin optimization could be made to work for this type too?
>
> Yeah, I'm slightly itchy about relying on FKs in this heuristic at
> all; it doesn't seem like quite the right thing. I think we do want
> one side of the join to be joining to a PK or at least a unique index,
> but I'm not sure we need to insist on there being an FK relationship.
>
True, requiring the FK may be unnecessary in this case. We do need to
guarantee the cardinality does not change, but a UNIQUE + LEFT JOIN
seems to be enough for that.
BTW the v3 lost the patch/ directory. I assume that wasn't intentional,
so I added it back into v4.
> A couple of minor coding notes:
>
> * There's no point in doing anything (except recursing) if the joinlist
> contains fewer than 3 items, and maybe as a further heuristic
> this shouldn't kick in till later yet, like 5 or so items.
> When there are just a few items, the possibility of missing the
> best plan seems to outweigh the minimal savings in plan time.
>
> * Joinlists never contain anything but RangeTblRefs and sub-lists.
> See make_rel_from_joinlist.
>
> * Your reconstructed joinlist is overly complicated. Instead of
>
> + newlist = list_make2(newlist, list_make1(lfirst(lc)));
>
> you could just do
>
> + newlist = list_make2(newlist, lfirst(lc));
>
> because a single-element subproblem is useless.
>
> I notice that the patch doesn't apply cleanly anymore because of
> the introduction of guc_parameters.dat. So here's a v3 that
> rebases over that, and I took the liberty of fixing the joinlist
> construction as above, but I didn't do anything else.
>
Thanks. I agree with all of these comments, and updated v4 accordingly.
cfbot started complaining about guc_parameters.dat and a couple more
things, so v4 fixes that too.
regards
--
Tomas Vondra
| Attachment | Content-Type | Size |
|---|---|---|
| v4-0001-Simplify-planning-of-starjoin-queries.patch | text/x-patch | 25.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Joshua Shanks | 2025-11-08 20:21:24 | [PATCH] libpq: Wrap out-of-memory error messages with libpq_gettext() |
| Previous Message | Tom Lane | 2025-11-08 19:26:14 | Re: IO in wrong state on riscv64 |