| From: | Bruce Momjian <bruce(at)momjian(dot)us> |
|---|---|
| To: | Tomas Vondra <tomas(at)vondra(dot)me> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: should we have a fast-path planning for OLTP starjoins? |
| Date: | 2026-05-30 18:18:35 |
| Message-ID: | ahsp-_c8MPWPT4t_@momjian.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, May 29, 2026 at 12:11:32AM +0200, Tomas Vondra wrote:
> But re-reading the old thread, this doesn't seem to be why it got stuck.
> We already can identify dimensions joined on foreign keys, and that
> seems like a good start.
>
> IIRC the thing that worried me was that just sticking the joins at the
> end is pretty heavy-handed. It can easily end up making the plan worse,
> if one of the other joins increases the cardinality. Would that be
> common? Probably not, but it seems unnecessarily risky.
Right.
> Ideally, we'd do join that reduce cardinality first (with the regular DP
> join search), then join all the dimensions, and finally do all joins
> that expand cardinality (again, using the regular DP). But the earlier
> patches worked by adjusting the join tree in query_planner(), i.e. way
> before we get to calculate join cardinalities.
Yes, I remember discussing that.
> It works like this:
>
> 1) query_planner()
>
> Determine if the query includes a starjoin (or multiple), and remember
> the relids included in the starjoin cluster. Pick a "canonical" join
> order for each starjoin cluster we found (e.g. with dimensions in relid
> order).
>
> 2) standard_join_search()/join_search_one_level()
>
> When constructing the join rels (e.g. in make_join_rel or right before
> it's called), check that the new rel would violate the canonical order.
> If it would, refuse to create it, just like we do for various other join
> restrictions.
This is how you avoid the factorial explosion of plan options, right?
> The new join restriction is that if the join result includes a subset of
> the starjoin cluster, then it has to include the fact + prefix of the
> list of dimensions (which is the canonical join order).
Sorry, I got lost here. What is "prefix?" I looked at the patch and
also could not understand it.
> Note: It should be possible to make the restriction even more strict, if
> needed (e.g. to effectively join all dimensions at once, with no other
> joins in between).
The patch is quite small.
> Attached are a couple charts from a test with 1-15 dimensions (scripts
> attached too). I was wondering how geqo affects this, so I tried with
> geqo=on/off, and with join_collapse_limit=1/8/16.
>
> With join_collapse_limit=1 there's no difference between any of the runs
> (master, patches with on/off). Here's an example of results:
>
> dims master(1) master sj/off sj/on master sj/off sj/on
> -------------------------------------------------------------------
> 1 49485 48797 48966 49118 99% 99% 99%
> 3 26886 22003 21319 24322 82% 79% 90%
> 5 17759 7923 7634 15434 45% 43% 87%
> 7 13110 2122 2071 11290 16% 16% 86%
> 9 10390 462 445 8709 4% 4% 84%
> 11 7781 87 86 6488 1% 1% 83%
> 13 5948 14 14 5749 0% 0% 97%
> 15 5237 1 1 4227 0% 0% 81%
Impressive.
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com
Do not let urgent matters crowd out time for investment in the future.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2026-05-30 18:21:10 | Re: Proposal: Conflict log history table for Logical Replication |
| Previous Message | Tomas Vondra | 2026-05-30 18:14:33 | Re: hashjoins vs. Bloom filters (yet again) |