Skip site navigation (1) Skip section navigation (2)

Re: Question about difference in performance of 2 queries

From: Sean Shanny <shannyconsulting(at)earthlink(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Question about difference in performance of 2 queries
Date: 2003-12-29 19:52:38
Message-ID: 3FF08606.2080204@earthlink.net (view raw or flat)
Thread:
Lists: pgsql-performance
Tom,

Thanks.  I will make the changes you suggest concerning the indexes.  I 
am finding partial indexes to be very handy.  :-)

I canceled the explain analyze on the other query as we have found the 
problem and who knows how long it would take to complete.

Thanks again.

--sean

Tom Lane wrote:

>Sean Shanny <shannyconsulting(at)earthlink(dot)net> writes:
>  
>
>>Here is the pg_stats data.  The explain analyze queries are still running.
>>    
>>
>
>  
>
>>select * from pg_stats where tablename = 'f_pageviews' and attname = 
>>'content_key';
>> schemaname |  tablename  |   attname   | null_frac | avg_width | 
>>n_distinct | most_common_vals |   most_common_freqs   
>>|                                  
>>histogram_bounds                                   | correlation
>>------------+-------------+-------------+-----------+-----------+------------+------------------+-----------------------+-------------------------------------------------------------------------------------+-------------
>> public     | f_pageviews | content_key |         0 |         4 |        
>>983 | {-1,1528483}     | {0.749333,0.00166667} | 
>>    
>>
>
>Oh-ho, I see the problem: about 75% of your table has content_key = -1.
>
>Why is that a problem, you ask?  Well, the planner realizes that
>"content_key > -1" is a pretty good restriction condition (better than
>the date condition, apparently) and so it tries to use that as the index
>scan condition.  The problem is that in 7.4 and before, the btree index
>code implements a "> -1" scan starting boundary by finding the first -1
>and then advancing to the first key that's not -1.  So you end up
>scanning through 75% of the index before anything useful happens :-(
>
>I just fixed this poor behavior in CVS tip a couple weeks ago:
>http://archives.postgresql.org/pgsql-committers/2003-12/msg00220.php
>but the patch seems too large and unproven to risk back-patching into
>7.4.*.
>
>If you expect that a pretty large fraction of your data will always have
>dummy content_key, it'd probably be worth changing the index to not
>index -1's at all --- that is, make it a partial index with the
>condition "WHERE content_key > -1".  Another workaround is to leave the
>index as-is but phrase the query WHERE condition as "content_key >= 0"
>instead of "> -1".
>
>			regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faqs/FAQ.html
>
>  
>


In response to

pgsql-performance by date

Next:From: Vivek KheraDate: 2003-12-29 20:33:57
Subject: deferred foreign keys
Previous:From: Tom LaneDate: 2003-12-29 19:39:11
Subject: Re: Question about difference in performance of 2 queries

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group