Re: Performance Problem with sub-select using array

From: "Aaron Bono" <postgresql(at)aranya(dot)com>
To: "Travis Whitton" <tinymountain(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Performance Problem with sub-select using array
Date: 2006-08-29 14:28:42
Message-ID: bf05e51c0608290728t5c1c7cb8h8501aed7e359e3cb@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Make sure you do a reply to all so you include the list....

On 8/28/06, Travis Whitton <tinymountain(at)gmail(dot)com> wrote:
>
> Ok, I actually got down to business with EXPLAIN ANALYZE. My performance
> was actually suffering from the DISTINCT in my SQL query and not the
> subquery, which I guess isn't run repeatedly since it's not constrained and
> can be cached by the optimizer? Bottom line is, by replacing DISTINCT with
> DISTINCT ON all my index conditions show up in the EXPLAIN output. Best of
> all: Total runtime: 353.588 ms. Thanks for the help.
>
> Travis
>
>
> On 8/28/06, Travis Whitton <tinymountain(at)gmail(dot)com> wrote:
> >
> > I'm pretty sure you're right, which leads me to my next question. Is it
> > possible to pass a column from an outer query to a subquery? For example, is
> > there a way to do something like.
> >
> > SELECT owners.id AS owner_id, array(SELECT dogs.name WHERE owners.id =
> > owner_id) ...
> >
> > I would just do a normal inner-join, but then I get a row for each item
> > that would otherwise come back nicely packaged in the array. The overhead of
> > rearranging the data takes even more time than the subquery approach.
> >
> >
> > On 8/28/06, Aaron Bono < postgresql(at)aranya(dot)com> wrote:
> > >
> > > On 8/24/06, Travis Whitton <tinymountain(at)gmail(dot)com > wrote:
> > >
> > > > Hello all, I'm running the following query on about 6,000 records
> > > > worth of data, and it takes about 8 seconds to complete. Can anyone provide
> > > > any suggestions to improve performance? I have an index on two columns in
> > > > the transacts table (program_id, customer_id). If I specify a number for
> > > > customer.id in the sub-select, query time is reduced to about 2
> > > > seconds, which still seems like a million years for only 6,000 records, but
> > > > I'm guessing that the sub-select can't resolve the id since it's done before
> > > > the outer query, so it scans the entre recordset for every row? Transacts is
> > > > a many to many table for customers and programs. I know this query doesn't
> > > > even reference any columns from programs; however, I dynamically insert
> > > > where clauses to constrain the result set.
> > > >
> > > > SELECT distinct customers.id, first_name, last_name, address1,
> > > > contact_city, contact_state, primary_phone, email, array(select
> > > > programs.program_name from transacts, programs where customer_id =
> > > > customers.id and programs.id = transacts.program_id and
> > > > submit_status = 'success') AS partners from customers, transacts, programs
> > > > where transacts.customer_id = customers.id and transacts.program_id=
> > > > programs.id
> > > >
> > >
> > >
> > > My guess is that your problem is that you may be getting 6000 rows,
> > > but the array(select ....) is having to run once for each of record returned
> > > (so it is running 6000 times).
> > >
> > > Try an explain analyze: http://www.postgresql.org/docs/7.4/interactive/sql-explain.html
> > > - that will reveal more of where the performance problem is.
> > >
> >
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Aaron Bono 2006-08-29 14:34:35 Re: Performance Problem with sub-select using array
Previous Message Jaime Casanova 2006-08-29 12:33:42 Re: dinamic sql