Re: Help with tuning this query (with explain analyze finally)

From: John A Meinel <john(at)arbash-meinel(dot)com>
To: Ken <ken(at)upfactor(dot)com>
Cc: Richard Huxton <dev(at)archonet(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Help with tuning this query (with explain analyze finally)
Date: 2005-03-04 19:56:12
Message-ID: 4228BD5C.9020800@arbash-meinel.com
Lists: pgsql-hackers-win32 pgsql-performance

John Arbash Meinel wrote:

> Ken wrote:
>
>> Richard,
>>
>> What do you mean by summary table? Basically a cache of the query
>> into a table with replicated column names of all the joins? I'd
>> probably have to wipe out the table every minute and re-insert the
>> data for each carrier in the system. I'm not sure how expensive this
>> operation would be, but I'm guessing it would be fairly heavy-weight.
>> And maintenance would be a lot harder because of the duplicated
>> columns, making refactorings on the database more error-prone. Am I
>> understanding your suggestion correctly? Please correct me if I'm wrong.
>>
>>> Can you turn the problem around? Calculate what you want for all
>>> users (once every 60 seconds) and stuff those results into a summary
>>> table. Then let the users query the summary table as often as they
>>> like (with the understanding that the figures aren't going to update
>>> any faster than once a minute).
>>
>>
> It's the same idea as a materialized view, or possibly just a lazy cache.
>
...

> This unrolls all of the work into a table which should be really fast
> to query. If this query takes less than 10s to generate, then just have
> a service run it every 60s. I think for refreshing, it is actually
> faster to drop the table and recreate it, rather than deleting the
> entries. Dropping also has the advantage that if you ever add more rows
> to s or ss, then the table automatically gets the new entries.
>
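For concreteness, the drop-and-recreate refresh described above might
look roughly like this (just a sketch: the select list and <the big
stuff> are taken from the original query, cached_view is the name used
later in this mail, and the DROP assumes the table already exists from
a previous run):

BEGIN;
DROP TABLE cached_view;
CREATE TABLE cached_view AS
    SELECT p.id AS person_id, s.*, ss.*
    FROM <the big stuff>;   -- same joins and filters as the big query
COMMIT;

A cron job or a small daemon can run that every 60 seconds, and since
PostgreSQL DDL is transactional, the old table stays visible until the
COMMIT.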
Just as a small update. If completely regenerating the cache takes too
long, the other way to do it is to create insert and update triggers on
s and ss, such that as they change, they also update the cached_view table.

Something like

CREATE TRIGGER on_ss_ins AFTER INSERT ON ss
    FOR EACH ROW EXECUTE PROCEDURE ss_ins_to_cache();

-- where ss_ins_to_cache() essentially does:
INSERT INTO cached_view SELECT p.id AS person_id, s.*, ss.* FROM
<the big stuff> WHERE ss.id = NEW.id;

This runs the same query, but notice that the WHERE means it only allows
the new row. So this query should run fast. It is a little bit of
overhead on each of your inserts, but it should keep the cache
up-to-date. With something like this, I would have the final client
query still include the date restriction, since you accumulate older
rows into the cached view. But you can run a daily process that prunes
out everything older than 31 days, which keeps the cached_view table from
getting really large.
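Spelled out, the trigger function might look something like this in
PL/pgSQL (again only a sketch: ss_ins_to_cache and <date column> are
placeholder names, <the big stuff> stands in for the joins from the
original query, and you would want matching UPDATE triggers plus
triggers on s as well):

CREATE OR REPLACE FUNCTION ss_ins_to_cache() RETURNS trigger AS '
BEGIN
    -- Re-run the big query, restricted to the row just inserted.
    INSERT INTO cached_view
        SELECT p.id AS person_id, s.*, ss.*
        FROM <the big stuff>
        WHERE ss.id = NEW.id;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER on_ss_ins AFTER INSERT ON ss
    FOR EACH ROW EXECUTE PROCEDURE ss_ins_to_cache();

-- The daily prune mentioned above; <date column> is whatever column
-- the original query's date restriction uses.
DELETE FROM cached_view WHERE <date column> < now() - interval '31 days';

An on_ss_upd trigger could work the same way, deleting the old cached
row for NEW.id and re-inserting it.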

John
=:->
