Skip site navigation (1) Skip section navigation (2)

Re: Strange explain on partitioned tables

From: Philippe Rimbault <primbault(at)edd(dot)fr>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Strange explain on partitioned tables
Date: 2010-07-22 14:10:51
Message-ID: 4C48516B.90103@edd.fr (view raw or flat)
Thread:
Lists: pgsql-performance
Greg,

First : thank you for you help.



On 22/07/2010 15:32, Greg Smith wrote:
> Philippe Rimbault wrote:
>> I have one thousand inherited tables like this one (with a different 
>> check constraint on each) :
>
> The PostgreSQL partitioning system is aimed to support perhaps a 
> hundred inherited tables.  You can expect to get poor performance on 
> queries if you create 1000 of them.  That's not the cause of your 
> current problem, just pointing out there's a larger design problem 
> here you'll probably have to fix one day.

Right now, there is only 6 inherited tables, but for performance issue, 
where are testing solutions on more partionned systeme (all work fine 
except for query with order by).

>
>> EXPLAIN ANALYZE
>>     select
>>         documents.id,
>>         documents.num,
>>         sources.name,
>>         l.name
>>     from
>>         documents,
>>         locations l,
>>         sources
>>     where
>>         documents.id_source = 113 and
>>         documents.id_location=l.id and
>>         documents.id_source=sources.id
>>     order by
>>         documents.id desc
>>     limit 5;
>
> Please don't put your EXPLAIN plans surrounded in " marks; it makes it 
> harder to copy out of your message to analyze them with tools.  I put 
> this bad one into http://explain.depesz.com/s/XD and it notes that the 
> "public.documents.id_location = l.id" search is underestimating the 
> number of rows by a factor of 8.7.  You might get a better plan if you 
> can get better table statistics on that column.  Did you run ANALYZE 
> since the partitioning was done?  If not, that could be making this 
> worse.  You might increase the amount of table statistics on this 
> specific column too, not sure what would help without knowing exactly 
> what's in there.
>
> Another thing you can try is suggest the optimizer not use a hash join 
> here and see if it does the right thing instead; be a useful bit of 
> feedback to see what that plan turns out to be.  Just put "set 
> enable_hashjoin=off;" before the rest of the query, it will only 
> impact that session.
>

Sorry for the output of the EXPLAIN ...

VACUUM ANALYZE have been done just before test of query.

I think that the optimizer overestimates "public.documents.id_location = 
l.id" because it plan on the primary table and not the child ...
I've change statistics to 1000 for documents.id_location and result is 
the same.

I've tested "set enable_hashjoin=off;" and the result is worst (sorry 
i'm still using 9.0b3) :

Limit  (cost=197755.49..197755.50 rows=5 width=23) (actual 
time=4187.148..4187.150 rows=5 loops=1)
   ->  Sort  (cost=197755.49..200390.32 rows=1053932 width=23) (actual 
time=4187.146..4187.147 rows=5 loops=1)
         Sort Key: public.documents.id
         Sort Method:  top-N heapsort  Memory: 17kB
         ->  Nested Loop  (cost=151258.55..180250.06 rows=1053932 
width=23) (actual time=1862.214..3769.611 rows=1053929 loops=1)
               ->  Index Scan using pk_sources on sources  
(cost=0.00..8.27 rows=1 width=8) (actual time=0.007..0.013 rows=1 loops=1)
                     Index Cond: (id = 113)
               ->  Merge Join  (cost=151258.55..169702.47 rows=1053932 
width=21) (actual time=1862.204..3360.555 rows=1053929 loops=1)
                     Merge Cond: (l.id = public.documents.id_location)
                     ->  Sort  (cost=1.75..1.81 rows=23 width=10) 
(actual time=0.028..0.036 rows=21 loops=1)
                           Sort Key: l.id
                           Sort Method:  quicksort  Memory: 17kB
                           ->  Seq Scan on locations l  (cost=0.00..1.23 
rows=23 width=10) (actual time=0.002..0.009 rows=23 loops=1)
                     ->  Materialize  (cost=151256.80..156526.46 
rows=1053932 width=13) (actual time=1862.162..2841.302 rows=1053929 loops=1)
                           ->  Sort  (cost=151256.80..153891.63 
rows=1053932 width=13) (actual time=1862.154..2290.881 rows=1053929 loops=1)
                                 Sort Key: public.documents.id_location
                                 Sort Method:  external merge  Disk: 24496kB
                                 ->  Append  (cost=0.00..27810.36 
rows=1053932 width=13) (actual time=0.003..838.644 rows=1053929 loops=1)
                                       ->  Seq Scan on documents  
(cost=0.00..18.25 rows=3 width=39) (actual time=0.000..0.000 rows=0 loops=1)
                                             Filter: (id_source = 113)
                                       ->  Seq Scan on documents_mond 
documents  (cost=0.00..27792.11 rows=1053929 width=13) (actual 
time=0.002..502.345 rows=1053929 loops=1)
                                             Filter: (id_source = 113)
Total runtime: 4198.703 ms



In response to

pgsql-performance by date

Next:From: Craig JamesDate: 2010-07-22 16:57:06
Subject: Re: Using more tha one index per table
Previous:From: Greg SmithDate: 2010-07-22 13:32:36
Subject: Re: Strange explain on partitioned tables

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group