Re: Regression from 9.4-9.6

From: Jim Nasby <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Regression from 9.4-9.6
Date: 2017-10-08 20:02:42
Message-ID: e0a4a8e6-d2e6-c6a8-dbb0-92131302bcbb@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 10/8/17 2:34 PM, Tom Lane wrote:
> Jim Nasby <jim(at)nasby(dot)net> writes:
>> I've got a query that's regressed from 9.4 to 9.6. I suspect it has
>> something to do with the work done around bad plans from single-row
>> estimates.
>
> Why has this indexscan's cost estimate changed so much?
>
>>> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
>
>>> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
>
> I think the reason it's discarding the preferable plan is that, with this
> huge increment in the estimated cost getting added to both alternatives,
> the two nestloop plans have fuzzily the same total cost, and it's picking
> the one you don't want on the basis of some secondary criterion.

Great question... the only thing that sticks out is the coalesce(). Let
me see if an analyze with a higher stats target changes anything. FWIW,
the 9.6 database is copied from the 9.4 one once a week and then
pg_upgraded. I'm pretty sure an ANALYZE is part of that process.

9.4:
> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..50807.51 rows=1 width=36) (actual time=979.381..3207.777 rows=508 loops=1)
> Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND (filed_departuretime <= '2017-07-30 04:59:59'::timestamp without time zone))
> Filter: (((view_www IS NULL) OR (view_www IS TRUE)) AND (sch_block_out IS NOT NULL) AND (diverted IS NOT TRUE) AND (true_cancel IS NOT TRUE) AND (sch_block_out >= '2017-07-23 05:00:00'::timestamp without time zone) AND (sch_block_out <= '2017-07-24 04:59:59'::timestamp without time zone) AND (COALESCE(actualarrivaltime, cancellation) >= actualdeparturetime) AND ((act_block_out - sch_block_out) >= '00:15:00'::interval) AND (((SubPlan 2))::text = 'KORD'::text))
> Rows Removed by Filter: 2696593
> SubPlan 2
> -> Index Scan using bd_airport_pkey on bd_airport bd_airport_1 (cost=0.56..4.58 rows=1 width=20) (actual time=0.003..0.003 rows=1 loops=21652)
> Index Cond: (id = bdata_forks.origin_id)

9.6:
> -> Index Scan using bdata_filed_departuretime on bdata_forks (cost=0.57..14894236.06 rows=1 width=36) (actual time=892.664..3025.653 rows=508 loops=1)
> Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND (filed_departuretime <= '2017-07-30 04:59:59'::timestamp without time zone))
> Filter: (((view_www IS NULL) OR (view_www IS TRUE)) AND (sch_block_out IS NOT NULL) AND (diverted IS NOT TRUE) AND (true_cancel IS NOT TRUE) AND (sch_block_out >= '2017-07-23 05:00:00'::timestamp without time zone) AND (sch_block_out <= '2017-07-24 04:59:59'::timestamp without time zone) AND (COALESCE(actualarrivaltime, cancellation) >= actualdeparturetime) AND ((act_block_out - sch_block_out) >= '00:15:00'::interval) AND (((SubPlan 2))::text = 'KORD'::text))
> Rows Removed by Filter: 2696592
> SubPlan 2
> -> Index Scan using bd_airport_pkey on bd_airport bd_airport_1 (cost=0.56..4.58 rows=1 width=20) (actual time=0.004..0.004 rows=1 loops=21652)
> Index Cond: (id = bdata_forks.origin_id)

--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim Nasby 2017-10-08 20:33:07 Re: Regression from 9.4-9.6
Previous Message Tom Lane 2017-10-08 19:34:51 Re: Regression from 9.4-9.6