Quick Links

Re: Query plan prefers hash join when nested loop is much faster

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	iulian dragos <iulian(dot)dragos(at)databricks(dot)com>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Query plan prefers hash join when nested loop is much faster
Date:	2020-08-25 10:36:30
Message-ID:	CAApHDvpgtNhKQVPuNCrisKuCB+33BtNK_K2Cn0Cst2muJJPshg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Tue, 25 Aug 2020 at 22:10, iulian dragos
<iulian(dot)dragos(at)databricks(dot)com> wrote:
> Thanks for the tip! Indeed, `n_distinct` isn't right. I found it in pg_stats set at 131736.0, but the actual number is much higher: 210104361. I tried to set it manually, but the plan is still the same (both the actual number and a percentage, -0.4, as you suggested):

You'll need to run ANALYZE on the table after doing the ALTER TABLE to
change the n_distinct. The ANALYZE writes the value to pg_statistic.
ALTER TABLE only takes it as far as pg_attribute's attoptions.
ANALYZE reads that column to see if the n_distinct estimate should be
overwritten before writing out pg_statistic

Just remember if you're hardcoding a positive value that it'll stay
fixed until you change it. If the table is likely to grow, then you
might want to reconsider using a positive value and consider using a
negative value as mentioned in the doc link.

David

In response to

Re: Query plan prefers hash join when nested loop is much faster at 2020-08-25 10:10:05 from iulian dragos

Responses

Re: Query plan prefers hash join when nested loop is much faster at 2020-08-25 12:01:18 from iulian dragos

Browse pgsql-general by date

	From	Date	Subject
Next Message	Peter J. Holzer	2020-08-25 11:24:00	Re: Most effective and fast way to load few Tbyte of data from flat files into postgresql
Previous Message	Peter Eisentraut	2020-08-25 10:28:58	Re: pgbouncer bug?