Re: Help with tuning this query (with explain analyze finally)

From: John Arbash Meinel <john(at)arbash-meinel(dot)com>
To: Ken <ken(at)upfactor(dot)com>
Cc: Richard Huxton <dev(at)archonet(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Help with tuning this query (with explain analyze finally)
Date: 2005-03-04 16:56:39
Message-ID: 42289347.7040607@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers-win32 pgsql-performance

Ken wrote:

> Richard,
>
> What do you mean by summary table? Basically a cache of the query
> into a table with replicated column names of all the joins? I'd
> probably have to whipe out the table every minute and re-insert the
> data for each carrier in the system. I'm not sure how expensive this
> operation would be, but I'm guessing it would be fairly heavy-weight.
> And maintaince would be a lot harder because of the duplicated
> columns, making refactorings on the database more error-prone. Am I
> understanding your suggestion correctly? Please correct me if I am.
>
>> Can you turn the problem around? Calculate what you want for all
>> users (once every 60 seconds) and stuff those results into a summary
>> table. Then let the users query the summary table as often as they
>> like (with the understanding that the figures aren't going to update
>> any faster than once a minute)
>
It's the same idea of a materialized view, or possibly just a lazy cache.

Just try this query:

CREATE TABLE cachedview AS
select p.id as person_id, s.*, ss.*

from shipment s
inner join shipment_status ss on s.current_status_id=ss.id
inner join release_code rc on ss.release_code_id=rc.id
left outer join driver d on s.driver_id=d.id
left outer join carrier_code cc on s.carrier_code_id=cc.id
where s.carrier_code_id in (
select cc.id
from person p
inner join carrier_to_person ctp on p.id=ctp.person_id
inner join carrier c on ctp.carrier_id=c.id
inner join carrier_code cc on cc.carrier_id = c.id
)
and s.current_status_id is not null
and s.is_purged=false
and(rc.number='9' )
and(ss.date>=current_date-31 )

order by ss.date desc ;

Notice that I took out the internal p.id = blah.
Then you can do:

CREATE INDEX cachedview_person_id_idx ON cachedview(person_id);

Then from the client side, you can just run:
SELECT * from cachedview WHERE person_id = <id>;

Now, this assumes that rc.number='9' is what you always want. If that
isn't the case, you could refactor a little bit.

This unrolls all of the work, a table which should be really fast to
query. If this query takes less than 10s to generate, than just have a
service run it every 60s. I think for refreshing, it is actually faster
to drop the table and recreate it, rather than deleteing the entries.
Dropping also has the advantage that if you ever add more rows to s or
ss, then the table automatically gets the new entries.

Another possibility, is to have the "cachedview" not use "s.*, ss.*",
but instead just include whatever the primary keys are for those tables.
Then your final query becomes:

SELECT s.*, ss.* FROM cachedview cv, s, ss WHERE cv.person_id = <id>,
cv.s_id = s.<pkey>, cv.ss_id = ss.<pkey>;

Again, this should be really fast, because you should have an index on
cv.person_id and only have say 300 rows there, and then you are just
fetching a few rows from s and ss. You can also use this time to do some
of your left joins against other tables.

Does this make sense? The biggest advantage you have is your "60s"
statement. With that in hand, I think you can do a lot of caching
optimizations.

John
=:->

In response to

Responses

Browse pgsql-hackers-win32 by date

  From Date Subject
Next Message John Arbash Meinel 2005-03-04 17:07:35 Re: Help with tuning this query (with explain analyze finally)
Previous Message Ken 2005-03-04 16:36:26 Re: Help with tuning this query (with explain analyze finally)

Browse pgsql-performance by date

  From Date Subject
Next Message John Arbash Meinel 2005-03-04 17:07:35 Re: Help with tuning this query (with explain analyze finally)
Previous Message Ken 2005-03-04 16:36:26 Re: Help with tuning this query (with explain analyze finally)