Re: Implement missing join selectivity estimation for range types

From: SCHOEMANS Maxime <maxime(dot)schoemans(at)ulb(dot)be>
To: Haibo Yan <tristan(dot)yim(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Damir Belyalov <dam(dot)bel07(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SAKR Mahmoud <mahmoud(dot)sakr(at)ulb(dot)be>, Diogo Repas <diogo(dot)repas(at)gmail(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Implement missing join selectivity estimation for range types
Date: 2026-04-15 15:13:34
Message-ID: DB8P190MB0731EA1B8381F48AA15B9437F0222@DB8P190MB0731.EURP190.PROD.OUTLOOK.COM
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Haibo,

Thank you for the review.

> One small note: the localized bool join_is_reversed; in my version was
> intentional. I left it that way because get_join_variables() wants a
> storage location, and I preferred to keep that use local and explicit
> rather than trying to reshape things around it.

Fair point. I moved it out of the bare block because it looked unusual,
but I can change it back if you prefer.

> For patch 2, I am less convinced, especially for &&.
> [...]
> for multiranges there is a third possibility: neither side is entirely
> left nor entirely right, and yet they still do not overlap because of
> an internal gap.

This is a valid concern, but it is an existing limitation of multirange
statistics, not something we are introducing. The existing restriction
selectivity code in multirangetypes_selfuncs.c already uses the same
NOT(<<) AND NOT(>>) decomposition for && on multiranges. And
multirange_typanalyze explicitly says:

/* Treat multiranges like a big range without gaps. */

The statistics only store the outermost bounds, so the gap information
is already lost before our estimator sees it. The multirange GiST
opclass does the same (stores the bounding range). Our join estimator
is just consistent with how multiranges are handled elsewhere.

The alternative is falling back to the 0.005 default, which will almost
certainly be worse. Would a comment explaining the limitation be enough?

> For patch 3, I agree with the motivation [...] But I am not convinced
> that exporting those helpers via selfuncs.h is the right boundary.
> My preference would be something tighter: [...] a backend-private
> internal header just for the range-family selfuncs code

Good point about the visibility. I'll move the declarations to a
separate backend-private header in the next version.

Regards,
Maxime

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2026-04-15 15:15:05 submissions for real-time patch evaluation panel
Previous Message Ashutosh Bapat 2026-04-15 15:08:07 Re: Typo fixes in Graph table files