Skip site navigation (1) Skip section navigation (2)

Re: query optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Kellerer <spam_eater(at)gmx(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query optimization
Date: 2012-04-26 20:08:55
Message-ID: 13222.1335470935@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-performance
Thomas Kellerer <spam_eater(at)gmx(dot)net> writes:
> Tom Lane wrote on 26.04.2012 21:17:
>> Um ... did you analyze all the tables, or just some of them?  I get
>> sub-millisecond runtimes if all four tables have been analyzed, but it
>> does seem to pick lousy plans if, say, only a and b have been analyzed.

> Here it's similar to Richard's experience:
> Before analyzing the four tables, the first statement yields this plan:
> [ merge joins ]
> This continues to stay the plan for about 10-15 repetitions, then it turns to this plan
> [ hash joins ]

Hmm.  I see it liking the merge-join plan (with minor variations) with
or without analyze data, but if just some of the tables have been
analyzed, it goes for the hash plan which is a good deal slower.  The
cost estimates aren't that far apart though.  In any case, the only
reason the merge join is so fast is that the data is perfectly ordered
in each table; on a less contrived example, it could well be a lot
slower.

> And the second one yields this one here (Regardless of analyze or not):

Yeah, the trick there is that it's valid to re-order the joins, since
they're both left joins.

In git HEAD I get something like this:

regression=# explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 4;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..17.18 rows=1 width=16) (actual time=0.024..0.026 rows=1 loops=1)
   ->  Index Scan using a_idx1 on a  (cost=0.00..8.38 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
         Index Cond: (id = 4)
   ->  Nested Loop  (cost=0.00..8.80 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1)
         ->  Index Only Scan using b_idx1 on b  (cost=0.00..8.38 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1)
               Index Cond: (id = a.b)
               Heap Fetches: 1
         ->  Index Only Scan using c_idx1 on c  (cost=0.00..0.41 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1)
               Index Cond: (id = b.id)
               Heap Fetches: 1
 Total runtime: 0.080 ms
(11 rows)

but 9.1 and older are not smart enough to do it like that when they
can't re-order the joins.

			regards, tom lane

In response to

Responses

pgsql-performance by date

Next:From: Kevin GrittnerDate: 2012-04-26 20:11:07
Subject: Re: auto-vacuum vs. full table update
Previous:From: Steve CrawfordDate: 2012-04-26 20:00:19
Subject: Re: auto-vacuum vs. full table update

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group