Re: Why PG uses nested-loop join when no indexes are available?

From: David Grelaud <dgrelaud(at)ideolys(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why PG uses nested-loop join when no indexes are available?
Date: 2016-01-20 19:45:54
Message-ID: CABKm3pjNsFDsgUjPsExiXN_---Dcfj-3M4Z5Q9OwaxiFgK3kLQ@mail.gmail.com
Lists: pgsql-general

Thank you both for your help.

We will test your patch, but we first need to understand the code a bit
better in order to follow your discussion.
Actually, your patch helps us find where to start in the code ;).

> The planner is never going to get it right 100% of the time.

Yes, I agree.
In production environments, even if PostgreSQL chooses such a bad plan 1%
of the time, it is enough to make clients angry. My goal is to eradicate
the risk of choosing a nested loop in certain cases, which freezes
PostgreSQL for many minutes, whereas a hash join or something else takes
only 2 seconds to complete. The performance difference is huge.
I mean, even if the plan is not the best one 100% of the time, it should at
least choose a "risk-free" plan, without these "bad" nested loops. It may
be easier said than done, but we want to try.
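The asymmetry we are worried about can be sketched with some back-of-the-envelope arithmetic. This is an invented illustration, not PostgreSQL's actual cost model: it only shows why an underestimated row count hurts a nested loop far more than a hash join.

```python
# Illustrative operation counts only (numbers are made up, this is
# not PostgreSQL's real cost model).

def nested_loop_ops(outer_rows, inner_rows):
    # Without a usable index, every outer row rescans the whole inner side.
    return outer_rows * inner_rows

def hash_join_ops(outer_rows, inner_rows):
    # Build the hash table once, then probe it once per outer row.
    return outer_rows + inner_rows

inner = 1_000_000
estimated_outer = 1        # what the planner believed
actual_outer = 100_000     # what the data really contained

# At the estimate, the nested loop looks cheap:
print(nested_loop_ops(estimated_outer, inner))   # 1000000
# At the actual row count, it is 100000x more work:
print(nested_loop_ops(actual_outer, inner))      # 100000000000
# The hash join barely notices the same estimation error:
print(hash_join_ops(actual_outer, inner))        # 1100000
```

This is what "risk-free" means above: the hash join's cost grows additively with a bad estimate, while the non-parameterised nested loop's cost grows multiplicatively.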

Regards,

*David Grelaud*

2016-01-15 2:16 GMT+01:00 David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>:

> On 15 January 2016 at 04:00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>> > Perhaps separating out enable_nestloop so that it only disables
>> > non-parameterised nested loops, and add another GUC for parameterised
>> > nested loops would be a good thing to do. Likely setting
>> enable_nestloop to
>> > off in production would be a slightly easier decision to make, if that
>> was
>> > the case.
>> > It looks pretty simple to do this, so I hacked it up, and attached it
>> here.
>> > There's no doc changes and I'm not that interested in fighting for this
>> > change, it's more just an idea for consideration.
>>
>> I'm not terribly excited by this idea either. If making such a change
>> actually makes things better for someone consistently, I'd argue that
>> the problem is a mistaken cost estimate elsewhere, and we'd be better off
>> to find and fix the real problem. (There have already been discussions
>> of only believing single-row rowcount estimates when they're provably
>> true, which might help if we can figure out how to do it cheaply enough.)
>>
>
> Actually, it's not very hard to hit a bad underestimate at all. All you
> need is a join on two columns which are correlated. Since PostgreSQL
> multiplies the estimated selectivities, the row count is going to come out
> too low. This also tricks the planner into thinking that this is a good
> join to perform early, since (it thinks that) it does not produce many rows
> at all. You only need one more join after that to mistakenly choose a
> nested loop join and hit the issue.
>
> FWIW TPC-H Q9 has this exact trip hazard with the partsupp table, which is
> the exact reason why this patch was born:
> https://commitfest.postgresql.org/7/210/
>
> I also think that the attitude that we can *always* fix the costs and
> estimates is not the right one. The planner is never going to get it right
> 100% of the time. If we ever think we can build such a planner then someone
> needs to come along and direct us back into the real world.
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
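To make the underestimate described above concrete, here is a small sketch of how multiplying per-column selectivities (an independence assumption) goes wrong when the two columns are correlated. All numbers are invented for illustration.

```python
# Invented example: 1% of rows match the predicate on column a,
# 1% match on column b, and (because the columns are correlated)
# they are the SAME rows.

n_rows = 100_000
rows_matching_a = 1_000          # 1% of rows
rows_matching_b = 1_000          # 1% of rows, identical to those above

sel_a = rows_matching_a / n_rows  # 0.01
sel_b = rows_matching_b / n_rows  # 0.01

# Independence assumption: multiply the selectivities.
estimated = n_rows * sel_a * sel_b   # ~10 rows

# Perfect correlation: the rows matching a ARE the rows matching b.
actual = rows_matching_a             # 1000 rows

print(round(estimated), actual)      # a 100x underestimate
```

A join that is estimated to produce ~10 rows but actually produces 1000 looks like an ideal early nested-loop candidate to the planner, which is exactly the trap in TPC-H Q9.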
