Re: Slow query with planner row strange estimation

From: phb07 <phb07(at)apra(dot)asso(dot)fr>
To: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, damien hostin <damien(dot)hostin(at)axege(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query with planner row strange estimation
Date: 2010-07-12 20:33:17
Message-ID: 4C3B7C0D.6070404@apra.asso.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Dimitri a écrit :
> It's probably one of the cases when having HINTS in PostgreSQL may be
> very helpful..
>
> SELECT /*+ enable_nestloop=off */ ... FROM ...
>
> will just fix this query without impacting other queries and without
> adding any additional instructions into the application code..
>
> So, why there is a such resistance to implement hints withing SQL
> queries in PG?..
>
> Rgds,
> -Dimitri
>
>
+1.
Another typical case when it would be helpful is with setting the
cursor_tuple_fraction GUC variable for a specific statement, without
being obliged to issue 2 SET statements, one before the SELECT and the
other after.

> On 7/9/10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> On Fri, Jul 9, 2010 at 6:13 AM, damien hostin <damien(dot)hostin(at)axege(dot)com>
>> wrote:
>>
>>>> Have you tried running ANALYZE on the production server?
>>>>
>>>> You might also want to try ALTER TABLE ... SET STATISTICS to a large
>>>> value on some of the join columns involved in the query.
>>>>
>>> Hello,
>>>
>>> Before comparing the test case on the two machines, I run analyse on the
>>> whole and look at pg_stats table to see if change occurs for the columns.
>>> but on the production server the stats never became as good as on the
>>> desktop computer. I set statistic at 10000 on column used by the join, run
>>> analyse which take a 3000000 row sample then look at the stats. The stats
>>> are not as good as on the desktop. Row number is nearly the same but only
>>> 1
>>> or 2 values are found.
>>>
>>> The data are not balanced the same way on the two computer :
>>> - Desktop is 12000 rows with 6000 implicated in the query (50%),
>>> - "Production" (actually a dev/test server) is 6 million rows with 6000
>>> implicated in the query (0,1%).
>>> Columns used in the query are nullable, and in the 5994000 other rows that
>>> are not implicated in the query these columns are null.
>>>
>>> I don't know if the statistic target is a % or a number of value to
>>> obtain,
>>>
>> It's a number of values to obtain.
>>
>>
>>> but event set at max (10000), it didn't managed to collect good stats (for
>>> this particular query).
>>>
>> I think there's a cutoff where it won't collect values unless they
>> occur significantly more often than the average frequency. I wonder
>> if that might be biting you here: without the actual values in the MCV
>> table, the join selectivity estimates probably aren't too good.
>>
>>
>>> As I don't know what more to do, my conclusion is that the data need to be
>>> better balanced to allow the analyse gather better stats. But if there is
>>> a
>>> way to improve the stats/query with this ugly balanced data, I'm open to
>>> it
>>> !
>>>
>>> I hope that in real production, data will never be loaded this way. If
>>> this
>>> appened we will maybe set enable_nestloop to off, but I don't think it's a
>>> good solution, other query have a chance to get slower.
>>>
>> Yeah, that usually works out poorly.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise Postgres Company
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>>
>>
Regards.
Philippe Beaudoin.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2010-07-13 03:01:08 Re: partition queries hitting all partitions even though check key is specified
Previous Message Bruce Momjian 2010-07-12 16:04:22 Re: Need help in performance tuning.