Re: Postgres Optimizer ignores information about foreign key relationship, severely misestimating number of returned rows in join

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: "Ehrenreich, Sigrid" <Ehrenreich(at)consist(dot)de>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres Optimizer ignores information about foreign key relationship, severely misestimating number of returned rows in join
Date: 2020-10-26 20:34:13
Message-ID: 20201026203413.GA9241@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Oct 26, 2020 at 03:58:05PM +0000, Ehrenreich, Sigrid wrote:
> Hi Performance Guys,
>
> I hope you can help me. I am joining two tables, that have a foreign key relationship. So I expect the optimizer to estimate the number of the resulting rows to be the same as the number of the returned rows of one of the tables. But the estimate is way too low.
>
> I have built a test case, where the problem is easily to be seen.

I reproduced the problem on v14dev.

Note the different estimates between these:

postgres=# explain analyze SELECT * FROM fact INNER JOIN dim USING (low_card,anydata1,anydata2) WHERE fact.low_card=2;
Hash Join (cost=161.58..358.85 rows=112 width=12) (actual time=8.707..15.717 rows=3289 loops=1)

postgres=# explain analyze SELECT * FROM fact INNER JOIN dim USING (low_card,anydata1,anydata2) WHERE fact.low_card BETWEEN 2 AND 2;
Hash Join (cost=324.71..555.61 rows=3289 width=12) (actual time=15.966..23.394 rows=3289 loops=1)

I think because low_card has an equality comparison in addition to the equijoin,
it's being disqualified from the planner's mechanism to consider FKs in join
selectivity.
https://doxygen.postgresql.org/costsize_8c_source.html#l05024

I don't know enough about this to help more than that.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message David Rowley 2020-10-26 22:25:45 Re: Postgres Optimizer ignores information about foreign key relationship, severly misestimating number of returned rows in join
Previous Message Philip Semanchuk 2020-10-26 18:55:46 Re: Understanding bad estimate (related to FKs?)