Re: using extended statistics to improve join estimates

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: using extended statistics to improve join estimates
Date: 2025-06-09 18:38:27
Message-ID: 2c54b453-dc12-4717-997f-b51370215139@tantorlabs.com
Lists: pgsql-hackers

Hi hackers

Thank you for your work.

Let me start my review from the top, specifically with the function
clauselist_selectivity_ext() in clausesel.c:

1. About the clauses == NULL check: in my opinion, it should be kept.
This issue has already been discussed previously [0], and I think it's
better to keep the safety check.

2. I noticed that the patch applies extended statistics to OR clauses as
well. There's an example from regression tests illustrating this:

Before applying ext stats:
SELECT * FROM check_estimated_rows('select * from join_test_1 j1 join
join_test_2 j2 on ((j1.a + 1 = j2.a + 1) or (j1.b = j2.b))');
 estimated | actual
-----------+--------
    104500 | 100000

After applying ext stats:
SELECT * FROM check_estimated_rows('select * from join_test_1 j1 join
join_test_2 j2 on ((j1.a + 1 = j2.a + 1) or (j1.b = j2.b))');
 estimated | actual
-----------+--------
    190000 | 100000
(1 row)

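With extended statistics applied, the estimate moves further from the
actual row count (190000 instead of 104500, against 100000 actual rows).
I agree that, at least for now, we should focus solely on AND clauses.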
To do that, we should impose the same restriction in
clauselist_selectivity_or() as we already do in
clauselist_selectivity_ext().
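
For context, here is a minimal sketch of how two selectivities are combined
for an OR when the clauses are treated as independent. As far as I know this
is the combination clauselist_selectivity_or() uses when no extended
statistics apply. The helper name or_selectivity() is hypothetical and is not
code from PostgreSQL or from the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not PostgreSQL code): selectivity of "A OR B"
 * assuming A and B are independent, i.e. P(A) + P(B) - P(A) * P(B).
 */
static double
or_selectivity(double s1, double s2)
{
    /* Selectivities are probabilities, so both must lie in [0, 1]. */
    assert(s1 >= 0.0 && s1 <= 1.0);
    assert(s2 >= 0.0 && s2 <= 1.0);

    return s1 + s2 - s1 * s2;
}
```

For example, or_selectivity(0.1, 0.2) gives about 0.28, slightly less than
the plain sum because rows matching both clauses are counted only once.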

What do you think? Or should we handle OR clauses as well?

[0]:
https://www.postgresql.org/message-id/flat/016e33b7-2830-4300-bc89-e7ce9e613bad%40tantorlabs.com

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
