Skip site navigation (1) Skip section navigation (2)

Re: Limit + group + join

From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Tobias Brox <tobias(at)nordicbet(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Limit + group + join
Date: 2005-08-26 03:01:01
Message-ID: 430E85ED.60109@paradise.net.nz (view raw or flat)
Thread:
Lists: pgsql-performance
Tobias,
Interesting example:

The 'desc' seems to be the guy triggering the sort, e.g:

explain select c.id from c join b on c_id=c.id  group by c.id order by 
c.id limit 5;
                                        QUERY PLAN 

-----------------------------------------------------------------------------------------
  Limit  (cost=0.00..0.28 rows=5 width=4)
    ->  Group  (cost=0.00..4476.00 rows=80000 width=4)
          ->  Merge Join  (cost=0.00..4276.00 rows=80000 width=4)
                Merge Cond: ("outer".id = "inner".c_id)
                ->  Index Scan using c_pkey on c  (cost=0.00..1518.00 
rows=80000 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..1558.00 
rows=80000 width=4)
(6 rows)

Whereas with it back in again:

explain select c.id from c join b on c_id=c.id  group by c.id order by 
c.id desc limit 5;
                                       QUERY PLAN 

--------------------------------------------------------------------------------------
  Limit  (cost=10741.08..10741.11 rows=5 width=4)
    ->  Group  (cost=10741.08..11141.08 rows=80000 width=4)
          ->  Sort  (cost=10741.08..10941.08 rows=80000 width=4)
                Sort Key: c.id
                ->  Hash Join  (cost=1393.00..4226.00 rows=80000 width=4)
                      Hash Cond: ("outer".c_id = "inner".id)
                      ->  Seq Scan on b  (cost=0.00..1233.00 rows=80000 
width=4)
                      ->  Hash  (cost=1193.00..1193.00 rows=80000 width=4)
                            ->  Seq Scan on c  (cost=0.00..1193.00 
rows=80000 width=4)
(9 rows)


However being a bit brutal:

set enable_mergejoin=false;
set enable_hashjoin=false;

explain select c.id from c join b on c_id=c.id  group by c.id order by 
c.id desc limit 5;
                                             QUERY PLAN 

--------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..15.24 rows=5 width=4)
    ->  Group  (cost=0.00..243798.00 rows=80000 width=4)
          ->  Nested Loop  (cost=0.00..243598.00 rows=80000 width=4)
                ->  Index Scan Backward using c_pkey on c 
(cost=0.00..1518.00 rows=80000 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..3.01 
rows=1 width=4)
                      Index Cond: (b.c_id = "outer".id)
(6 rows)

What is interesting is why this plan is being rejected...

Cheers

Mark

Tobias Brox wrote:
> Consider this setup - which is a gross simplification of parts of our
> production system ;-)
> 
>   create table c (id integer primary key);
>   create table b (id integer primary key, c_id integer);
>   create index b_on_c on b(c_id)
> 
>   insert into c (select ... lots of IDs ...);
>   insert into b (select id, id from c); /* keep it simple :-) */
>   
> Now, I'm just interessted in some few rows.  
> 
> All those gives good plans:
> 
> explain select c.id from c order by c.id limit 1;
> explain select c.id from c group by c.id order by c.id limit 1;
> explain select c.id from c join b on c_id=c.id order by c.id limit 1;
> 
> ... BUT ... combining join, group and limit makes havoc:
> 
> explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> desc limit 5;
>

In response to

Responses

pgsql-performance by date

Next:From: Chun Yit(Chronos)Date: 2005-08-26 06:08:51
Subject: postmaster memory keep going up????
Previous:From: Jeffrey W. BakerDate: 2005-08-26 02:31:20
Subject: Re: Limit + group + join

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group