Re: Query plan prefers hash join when nested loop is much faster

From:	iulian dragos <iulian(dot)dragos(at)databricks(dot)com>
To:	Michael Lewis <mlewis(at)entrata(dot)com>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Query plan prefers hash join when nested loop is much faster
Date:	2020-08-24 15:00:15
Message-ID:	CAMNsu3kWGR1LaGW+xvRUKmfH4rR6ifurY2H18E-WYU-SuzYjpA@mail.gmail.com
Lists:	pgsql-general

On Mon, Aug 24, 2020 at 4:21 PM iulian dragos <iulian(dot)dragos(at)databricks(dot)com>
wrote:

> Hi Michael,
>
> Thanks for the answer. It's an RDS instance using SSD storage and the
> default `random_page_cost` set to 4.0. I don't expect a lot of repetitive
> queries here, so I think caching may not be extremely useful. I wonder if
> the selectivity of the query is wrongly estimated (out of 500 million rows,
> only a few thousand are returned).
>
> I tried lowering the `random_page_cost` to 1.2 and it didn't make a
> difference in the query plan.
>

I experimented a bit more with different values for this setting. The only
way I could make it use the index was to use a value strictly less than
`seq_page_cost` (0.8 for instance). That doesn't sound right, though.
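
One way to reason about this observation is a toy cost comparison. The sketch below is not PostgreSQL's actual cost model and is not a diagnosis of this query: it keeps only page I/O and per-row CPU terms, uses the documented defaults for `seq_page_cost` (1.0) and `cpu_tuple_cost` (0.01), takes the 500 million row count from the thread, and assumes a hypothetical page count and hypothetical row estimates. Under those assumptions, if the planner expects roughly as many random page fetches as the table has pages, the index path only wins once `random_page_cost` drops below `seq_page_cost`, which would be consistent with a row-count overestimate rather than with the page cost settings themselves.

```python
# Toy cost comparison. This is NOT PostgreSQL's real cost model; it keeps
# only the two terms needed for the argument: page I/O and per-row CPU.

SEQ_PAGE_COST = 1.0    # PostgreSQL default for seq_page_cost
CPU_TUPLE_COST = 0.01  # PostgreSQL default for cpu_tuple_cost

TABLE_ROWS = 500_000_000  # row count mentioned in the thread
TABLE_PAGES = 5_000_000   # hypothetical: roughly 100 rows per page


def seq_scan_cost(table_pages: int, table_rows: int) -> float:
    """Read every page sequentially and process every row."""
    return table_pages * SEQ_PAGE_COST + table_rows * CPU_TUPLE_COST


def index_path_cost(est_fetches: int, random_page_cost: float) -> float:
    """Pessimistic assumption: one random page read per estimated row."""
    return est_fetches * random_page_cost + est_fetches * CPU_TUPLE_COST


def break_even_random_page_cost(
    table_pages: int, table_rows: int, est_fetches: int
) -> float:
    """random_page_cost below which the index path becomes the cheaper one."""
    total_seq = seq_scan_cost(table_pages, table_rows)
    return (total_seq - est_fetches * CPU_TUPLE_COST) / est_fetches


if __name__ == "__main__":
    # Hypothetical planner estimates: an accurate one and a large overestimate.
    for est in (5_000, 10_000_000):
        limit = break_even_random_page_cost(TABLE_PAGES, TABLE_ROWS, est)
        print(f"estimated fetches={est}: index path wins below {limit:.2f}")
```

Comparing the estimated and actual row counts in the `EXPLAIN ANALYZE` output is the usual way to check whether an overestimate of this kind is really happening.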

`effective_cache_size` is also set fairly high (32 GB) for an instance with
64 GB of RAM (db.m5.4xlarge).

iulian

In response to

Re: Query plan prefers hash join when nested loop is much faster at 2020-08-24 14:21:43 from iulian dragos
