Re: Is there value in having optimizer stats for joins/foreignkeys?

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org, hs(at)cybertec(dot)at, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Is there value in having optimizer stats for joins/foreignkeys?
Date: 2026-02-01 16:39:38
Message-ID: e9165403-7b0e-4c01-96d0-b73512ce359d@vondra.me
Lists: pgsql-hackers

On 1/31/26 12:18, Andrei Lepikhov wrote:
> On 29/1/26 06:04, Alexandra Wang wrote:
>> Hi hackers,
>>
>> As promised in my previous email, I'm sharing a proof-of-concept patch
>> exploring join statistics for correlated columns across relations.
>> This is a POC at this point, but I hope the performance numbers below
>> give a better idea of both the potential usefulness of join statistics
>> and the complexity of implementing them.
> I wonder why you chose the JOIN operator only?
>
> It seems to me that any relational operator produces relational output
> that can be treated as a table. The extended statistics code may be
> adapted to such relations.
> I think it may be a VIEW that you can declare (manually or
> automatically) and allow Postgres to build statistics on this 'virtual'
> table. So, the main focus may shift to the question: how to provably
> match a query subtree to a specific statistic.
>

Because for each "supported" operator we need to know two things:

(1) how to sample it efficiently

(2) how to apply it in selectivity estimation

We can't add support for everything at once, and for some operators we
may not even know the answers to (1) and/or (2).
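
To make (1) and (2) concrete for the simplest case, a foreign-key join,
here is a toy Python sketch. It is only an illustration under stated
assumptions, not what the POC patch or PostgreSQL does: the tables, the
column names, and both helper functions are hypothetical. It samples the
join by sampling the referencing side and probing the referenced side by
key, then compares the sample-based estimate with the one obtained by
assuming the two columns are independent.

```python
import random


def sample_fk_join(child_rows, parent_by_key, fk, sample_size, rng):
    """(1) Sample the join without executing it.

    Sample the referencing (child) side, then probe the referenced
    (parent) side by key. This yields a uniform sample of the join only
    because each child row matches at most one parent row.
    """
    picked = rng.sample(child_rows, min(sample_size, len(child_rows)))
    return [
        {**row, **parent_by_key[row[fk]]}
        for row in picked
        if row[fk] in parent_by_key
    ]


def estimate_selectivity(sample, predicate):
    """(2) Apply the sample: fraction of sampled join rows that match."""
    if not sample:
        return None
    return sum(1 for row in sample if predicate(row)) / len(sample)


# Hypothetical data: the order currency is fully determined by the
# customer's region, so the two columns are correlated across tables.
customers = {
    cid: {"customer_id": cid, "region": "EU" if cid < 50 else "US"}
    for cid in range(100)
}
orders = [
    {
        "order_id": i,
        "customer_id": i % 100,
        "currency": "EUR" if i % 100 < 50 else "USD",
    }
    for i in range(10_000)
]

rng = random.Random(0)
sample = sample_fk_join(orders, customers, "customer_id", 1_000, rng)
joint = estimate_selectivity(
    sample, lambda r: r["region"] == "EU" and r["currency"] == "EUR"
)

# Estimate under the independence assumption: multiply the per-table
# selectivities of the two clauses.
sel_region = sum(c["region"] == "EU" for c in customers.values()) / len(customers)
sel_currency = sum(o["currency"] == "EUR" for o in orders) / len(orders)
independent = sel_region * sel_currency

print(f"independence assumption: {independent:.2f}")
print(f"join sample estimate:    {joint:.2f}")
```

In this toy data the true selectivity of the combined predicate is 0.5,
while multiplying the per-table selectivities gives 0.25. The sketch
works only because the sampling strategy for this particular operator is
known up front, which is the point of (1).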

We can't just store an opaque VIEW and build the stats by executing it
and sampling the results. The whole premise of extended stats is that
people define them to fix incorrect estimates. But with incorrect
estimates the plan chosen for the VIEW itself may be terrible, and the
query may never complete.

regards

--
Tomas Vondra
