Re: 9.5 regression with unwanted nested loop left join

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Sabino Mullane <greg(at)endpoint(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: 9.5 regression with unwanted nested loop left join
Date: 2016-05-24 15:19:04
Message-ID: 31696.1464103144@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Greg Sabino Mullane <greg(at)endpoint(dot)com> writes:
> We are trying to upgrade a client app to Postgres 9.5, but are running
> into some performance regression issues (even though the curent db is 8.x!).
> One in particular that is puzzling me involves a query that keeps slipping
> into a nested loop left join, rather than a much preferred hash join.

The core of the problem seems to be here:

> -> Hash Join (C=20929.52..44799.53 R=1 W=4) (AT=626.631..1009.945 R=124 L=1)
> Hash Cond: ((gs.id = gregtest_status.id) AND (gs.update_time = (min(gregtest_status.update_time))))

which is implementing the gs/gs2 join here:

> SELECT gs.id, gs.status
> FROM gregtest_status gs
> JOIN (
> SELECT id, min(update_time) AS update_time
> FROM gregtest_status
> WHERE update_time >= '2015-01-01'::date
> GROUP BY id
> ) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time

I tried this back to 8.4, and all versions predict just one row returned
from the gs/gs2 join, whereas in reality you get 124 rows, ie, all rows
produced by gs2 have matches in gs.

The plan 9.0 and up produce is not just a nestloop above this, but a
nestloop with inner materialization step. It would clearly not be
sensible to run this hashjoin many times, but materializing its output
gets rid of the cost-of-data problem --- and then, if you believe there's
only gonna be one row out, a simple nestloop looks cheaper than building
a hash table, at least up to a fairly large number of rows on the other
side.

8.4 avoids this trap only because it doesn't consider injecting a
materialize there.

So a brute-force fix to restore the pre-9.0 behavior would be
"set enable_material = off". But really the problem is that it's
unobvious that all rows in the gs2 output would have matches in gs.
I wonder if there's a way to reformulate the query to make that better.
Do you even need the update_time check?

Assuming that the application's already been optimized for pre-9.0
Postgres, turning off enable_material might not be a disastrous long
term solution for it, though certainly it'd be better if you can move
away from that eventually.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message parihaaraka 2016-05-24 15:45:43 Re: pg_upgrade error regarding hstore operator
Previous Message Greg Sabino Mullane 2016-05-24 14:33:58 9.5 regression with unwanted nested loop left join