Re: Query plan prefers hash join when nested loop is much faster

From: iulian dragos <iulian(dot)dragos(at)databricks(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query plan prefers hash join when nested loop is much faster
Date: 2020-08-24 15:00:15
Message-ID: CAMNsu3kWGR1LaGW+xvRUKmfH4rR6ifurY2H18E-WYU-SuzYjpA@mail.gmail.com
Lists: pgsql-general

On Mon, Aug 24, 2020 at 4:21 PM iulian dragos <iulian(dot)dragos(at)databricks(dot)com>
wrote:

> Hi Michael,
>
> Thanks for the answer. It's an RDS instance using SSD storage and the
> default `random_page_cost` set to 4.0. I don't expect a lot of repetitive
> queries here, so I think caching may not be extremely useful. I wonder if
> the selectivity of the query is wrongly estimated (out of 500 million rows,
> only a few thousand are returned).
>
> I tried lowering the `random_page_cost` to 1.2 and it didn't make a
> difference in the query plan.
>

I experimented a bit more with different values for `random_page_cost`. The
only way I could get the planner to use the index was to set it to a value
strictly less than `seq_page_cost` (0.8, for instance). That doesn't sound
right, though: a random page read should not be cheaper than a sequential one.
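
For reference, this kind of experiment can be run per transaction without
touching the instance-wide parameter group. The following is only a sketch:
the tables `big_table` and `small_table` and their columns are hypothetical
stand-ins for the real query, which is not shown in this message. It compares
the plan under a lowered `random_page_cost` with the plan obtained when hash
joins are disabled for the transaction.

```sql
-- Hypothetical schema: big_table(id, small_id, payload), small_table(id, flag).
-- Substitute the real query; only the SET LOCAL / EXPLAIN pattern matters here.
BEGIN;

-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL random_page_cost = 1.2;

EXPLAIN (ANALYZE, BUFFERS)
SELECT b.id, b.payload
FROM big_table b
JOIN small_table s ON s.id = b.small_id
WHERE s.flag = true;

-- Discourage hash joins so the planner shows its costing for the alternative,
-- which makes it possible to compare estimated and actual row counts.
SET LOCAL enable_hashjoin = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT b.id, b.payload
FROM big_table b
JOIN small_table s ON s.id = b.small_id
WHERE s.flag = true;

ROLLBACK;
```

Comparing the estimated `rows=` with the actual row counts in the two outputs
should show whether the problem is the cost settings or a bad selectivity
estimate.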

`effective_cache_size` is also set fairly high (32 GB) for an instance with
64 GB of RAM (db.m5.4xlarge).
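
A quick way to confirm which values the session is actually using, and where
they come from, is to query `pg_settings`. This is a minimal sketch that uses
only the standard catalog view; the parameter names are the ones discussed in
this thread.

```sql
-- Show the effective planner cost settings and their origin
-- (default, configuration file, session, and so on).
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('seq_page_cost', 'random_page_cost', 'effective_cache_size');

-- Human-readable value of a single parameter.
SHOW effective_cache_size;
```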

iulian
