Re: BUG #5294: Sorts on more than just the order-by clause

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Allen Johnson <akjohnson78(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5294: Sorts on more than just the order-by clause
Date: 2010-01-22 14:02:57
Message-ID: 407d949e1001220602j2ffad480m91107c43284abf1e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Jan 21, 2010 at 9:27 PM, Allen Johnson <akjohnson78(at)gmail(dot)com> wrote:
> What I noticed in the production query was that ~1000ms was spent on
> sorting alone. The hack query reduced that to ~400ms. I should also
> note that there was plenty of work_mem and that the sort was not
> hitting disk.
>

The "hack" form actually seems far worse to me but I guess it depends
on the actual data layout.

Have you tried something like this which I would expect to be far faster:

select
ct.name, c.lname, c.fname, c.mname,
c.email, c.address1, c.address2,
c.city, c.state, c.zip_code,
(select count(*) from attachments where attachments.contact_id = c.id)
from
contacts c
inner join contact_types ct on (ct.code = c.contact_type_code)
where
c.company_id = 1
order by
ct.name, c.lname, c.fname, c.mname;

The question arises why Postgres can't automatically detect that this
query is equivalent. That might come when we start implementing the
"functional dependency" stuff from the standard and can determine that
the group by list uniquely identifies a row from the first join.
Currently we don't do that kind of analysis.

--
greg

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Greg Stark 2010-01-22 14:06:13 Re: BUG #5294: Sorts on more than just the order-by clause
Previous Message WildWezyr 2010-01-22 11:44:16 Re: BUG #5290: Simple loop with insert into and check to avoid duplicate values fails