From: | Linos <info(at)linos(dot)es> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: performance regression in 9.2/9.3 |
Date: | 2014-06-05 21:09:46 |
Message-ID: | 5390DC9A.3040500@linos.es |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 05/06/14 19:39, Tom Lane wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Thu, Jun 5, 2014 at 9:54 AM, Linos <info(at)linos(dot)es> wrote:
>>> What I don't understand is why the statistics have this bad information, all my tests are done on a database just restored and analyzed. Can I do something to improve the quality of my database statistics and let the planner do better choices? Maybe increase the statistics target of the columns involved?
>> By that I meant row count estimates coming out of the joins are way
>> off. This is pushing the planner into making bad choices. The most
>> pervasive problem I see is that the row count estimate boils down to
>> '1' at some juncture causing the server to favor nestloop/index scan
>> when something like a hash join would likely be more appropriate.
> There's some fairly wacko stuff going on in this example, like why
> is the inner HashAggregate costed so much higher by 9.3 than 8.4,
> when the inputs are basically the same? And why does 9.3 fail to
> suppress the SubqueryScan on "ven", when 8.4 does get rid of it?
> And why is the final output rows estimate so much higher in 9.3?
> That one is actually higher than the product of the two nestloop
> inputs, which looks like possibly a bug.
>
> I think what's happening is that 9.3 is picking what it knows to be a less
> than optimal join method so that it can sort the output by means of the
> ordered scan "Index Scan using referencia_key on modelo mo", and thereby
> avoid an explicit sort of what it thinks would be 42512461 rows. With a
> closer-to-reality estimate there, it would have gone for a plan more
> similar to 8.4's, ie, hash joins and then an explicit sort.
>
> There is a lot going on in this plan that we haven't been told about; for
> instance at least one of the query's tables seems to actually be a view,
> and some other ones appear to be inheritance trees with partitioning
> constraints, and I'm suspicious that some of the aggregates might be
> user-defined functions with higher than normal costs.
>
> I'd like to see a self-contained test case, by which I mean full details
> about the table/view schemas; it's not clear whether the actual data
> is very important here.
>
> regards, tom lane
Query 2 doesn't use any view and you can find the schema here:
http://pastebin.com/Nkv7FwRr
Query 1 use 5 views: ticket_cabecera, ticket_linea, reserva_cabecera, reserva_linea and tarifa_proveedor_modelo_precio, I have factored out the four first with the same result as before, you can find the new query and the new plan here:
http://pastebin.com/7u2Dkyxp
http://explain.depesz.com/s/2V9d
Actually the execution time is worse than before.
About the last view if I change join from tarifa_proveedor_modelo_precio to tarifa_modelo_precio (a table with nearly the same structure as the view) the query is executed much faster, but I get a similar time changing the (MIN(cab.time_stamp_recepcion)::DATE = ....) to (WHERE cab.time_stamp_recepcion::date = ....) in the "ent" subquery that never was a view.
Anyway I included tarifa_modelo_precio to the query1 schema file for reference and you can find the plan using tarifa_modelo_precio instead of the view tarifa_proveedor_modelo_precio here:
http://explain.depesz.com/s/4gV
query1 schema file:
http://pastebin.com/JpqM87dr
Regards,
Miguel Angel.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-06-05 22:39:13 | Re: SP-GiST bug. |
Previous Message | Petr Jelinek | 2014-06-05 19:53:17 | Re: slotname vs slot_name |