Re: Query plan prefers hash join when nested loop is much faster

From: iulian dragos <iulian(dot)dragos(at)databricks(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query plan prefers hash join when nested loop is much faster
Date: 2020-08-24 14:21:43
Message-ID: CAMNsu3mGt350XkwXUNhJO0LKuQ96X22NFqjapg=SRY+vo3H3gg@mail.gmail.com
Lists: pgsql-general

Hi Michael,

Thanks for the answer. It's an RDS instance using SSD storage, with
`random_page_cost` at its default of 4.0. I don't expect many repeated
queries here, so I doubt caching will help much. I wonder if the
selectivity of the query is being estimated wrongly (out of 500 million rows,
only a few thousand are returned).

I tried lowering `random_page_cost` to 1.2, but the query plan did not
change.
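
For anyone wanting to repeat this kind of check, here is a minimal sketch of
how it could be done within a single session. The table name `test_result`
and column `module_result_id` are assumptions inferred from the index name
`test_result_module_result_id_idx` quoted below, and the literal `42` is a
placeholder; the real query from the thread is not reproduced here.

```sql
-- Session-local settings only; nothing here persists after disconnect.
SET random_page_cost = 1.2;
SHOW random_page_cost;

-- Compare the planner's estimated row count with the actual one.
-- A large gap between "rows=" (estimate) and "actual ... rows=" points
-- to a selectivity misestimate rather than a page-cost problem.
-- NOTE: table, column, and value are hypothetical stand-ins.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM test_result
WHERE module_result_id = 42;

-- For diagnosis only: discourage hash joins so the nested-loop plan
-- can be timed against the original one with the same EXPLAIN.
SET enable_hashjoin = off;

-- Restore the defaults for the rest of the session.
RESET enable_hashjoin;
RESET random_page_cost;
```

If the estimated and actual row counts differ by orders of magnitude,
refreshing statistics with `ANALYZE` on the table is usually a more
promising first step than tuning cost parameters further.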

iulian

On Fri, Aug 21, 2020 at 6:30 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:

> Your system is preferring sequential scan to
> using test_result_module_result_id_idx in this case. What type of storage
> do you use, what type of cache hits do you expect, and what do you have
> random_page_cost set to? That comes to mind as a significant factor in
> choosing index scans based on costs.
>
