Skip site navigation (1) Skip section navigation (2)

BUG #5294: Sorts on more than just the order-by clause

From: "Allen Johnson" <akjohnson78(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5294: Sorts on more than just the order-by clause
Date: 2010-01-21 16:19:51
Message-ID: 201001211619.o0LGJp5I043918@wwwmaster.postgresql.org (view raw or flat)
Thread:
Lists: pgsql-bugs
The following bug has been logged online:

Bug reference:      5294
Logged by:          Allen Johnson
Email address:      akjohnson78(at)gmail(dot)com
PostgreSQL version: 8.4.2
Operating system:   Red Hat ES 5.4
Description:        Sorts on more than just the order-by clause
Details: 

I've been porting our app from Oracle to Postgres and keeping an eye on
performance. Largely, Postgres is performing just as well or better!

I did run into an issue where we are performing a group-by on about 10
columns followed by an order-by of about 5 columns. This query was taking
twice as long as Oracle. When looking at the explain plan, Postgres seems to
be using all the columns in the group-by for sorting instead of _only_ using
what is in the order-by.

While the results are correct this seems to be causing a performance problem
since Postgres is sorting on more columns than it is being asked to. I
reworked the query to get rid of the extra sorting columns but this seems
like a hack. Below is an example query followed by the ugly hack.

Note: The execution times in this example don't mean anything because they
are running on a blank test db. On the production database there was a huge
difference in execution time in favor of the hack query. I just wanted to
illustrate that the sort keys seem incorrect.

Example Query:

select
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code,
  count(a.id)
from
  contacts c
  inner join contact_types ct on (ct.code = c.contact_type_code)
  left join attachments a on (a.contact_id = c.id)
where
  c.company_id = 1
group by
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code
order by
  ct.name, c.lname, c.fname, c.mname;

Example Explain:
                                                                       QUERY
PLAN                                                                       
----------------------------------------------------------------------------
----------------------------------------------------------------------------

 GroupAggregate  (cost=27.98..28.11 rows=3 width=1864) (actual
time=0.037..0.037 rows=0 loops=1)
   ->  Sort  (cost=27.98..27.99 rows=3 width=1864) (actual time=0.035..0.035
rows=0 loops=1)
         Sort Key: ct.name, c.lname, c.fname, c.mname, c.email, c.address1,
c.address2, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864)
(actual time=0.017..0.017 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual
time=0.014..0.014 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c 
(cost=0.00..8.27 rows=1 width=1760) (actual time=0.012..0.012 rows=0
loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on
contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text =
(c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37
rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact 
(cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.192 ms
(15 rows)

* Notice how the sort key is using many more columns than the order-by has
specified.

Hack Query:
select * from (
  select
    ct.name as ct_name, c.lname, c.fname, c.mname,
    c.email, c.address1, c.address2,
    c.city, c.state, c.zip_code,
    count(a.id)
  from
    contacts c
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)
  where
    c.company_id = 1
  group by
    ct.name, c.lname, c.fname, c.mname,
    c.email, c.address1, c.address2,
    c.city, c.state, c.zip_code
) as results
order by 
  ct_name, lname, fname, mname;

Hack Explain:
                                                                       QUERY
PLAN                                                                       
----------------------------------------------------------------------------
----------------------------------------------------------------------------

 Sort  (cost=28.13..28.14 rows=3 width=1868) (actual time=0.054..0.054
rows=0 loops=1)
   Sort Key: ct.name, c.lname, c.fname, c.mname
   Sort Method:  quicksort  Memory: 17kB
   ->  HashAggregate  (cost=28.04..28.08 rows=3 width=1864) (actual
time=0.021..0.021 rows=0 loops=1)
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864)
(actual time=0.018..0.018 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual
time=0.016..0.016 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c 
(cost=0.00..8.27 rows=1 width=1760) (actual time=0.013..0.013 rows=0
loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on
contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text =
(c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37
rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact 
(cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.259 ms
(15 rows)

Responses

pgsql-bugs by date

Next:From: Dave PageDate: 2010-01-21 16:30:37
Subject: Re: BUG #5292: Corrupted installer
Previous:From: Kevin GrittnerDate: 2010-01-21 16:15:29
Subject: Re: BUG #5292: Corrupted installer

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group