Re: Do Views execute underlying query everytime ??

From: PFC <lists(at)boutiquenumerique(dot)com>
To: "Amit V Shah" <ashah(at)tagaudit(dot)com>, "'John A Meinel'" <john(at)arbash-meinel(dot)com>
Cc: "'newz(at)bearfruit(dot)org'" <newz(at)bearfruit(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Do Views execute underlying query everytime ??
Date: 2005-06-21 16:59:48
Message-ID: op.ssqb5ymuth1vuj@localhost
Lists: pgsql-performance


From what you say, I understand that you have a huge table like this:

( name, value, id )

And you want to compute statistics on (value) grouped by (name, id).

***************************************************

First of all, a "materialized view" doesn't exist as such in Postgres; it's
just a name for "a table automatically kept up to date by triggers".
An example would look like this:

table orders (order_id, ...)
table ordered_products (order_id, product_id, quantity, ...)

If you want to optimize this slow query:
"SELECT product_id, sum(quantity) as total_quantity_ordered
FROM ordered_products GROUP BY product_id"

You would create a cache table like this:
table ordered_products_cache (product_id, quantity)

And add triggers ON UPDATE/INSERT/DELETE on table ordered_products to
update ordered_products_cache accordingly.

Of course, in this case every time someone touches ordered_products, an
update is issued to ordered_products_cache.
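
Concretely, the triggers could look something like this (a minimal
PL/pgSQL sketch, not from the original thread; it assumes a cache row
already exists for every product_id):

CREATE OR REPLACE FUNCTION sync_ordered_products_cache() RETURNS trigger AS $$
BEGIN
    -- Subtract the old quantity when a row is updated or deleted...
    IF TG_OP IN ('UPDATE', 'DELETE') THEN
        UPDATE ordered_products_cache
           SET quantity = quantity - OLD.quantity
         WHERE product_id = OLD.product_id;
    END IF;
    -- ...and add the new quantity when a row is inserted or updated.
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        UPDATE ordered_products_cache
           SET quantity = quantity + NEW.quantity
         WHERE product_id = NEW.product_id;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ordered_products_cache_sync
AFTER INSERT OR UPDATE OR DELETE ON ordered_products
FOR EACH ROW EXECUTE PROCEDURE sync_ordered_products_cache();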

***************************************************

In your case I don't think triggers are the solution, because you do big
updates: they would mean issuing one update of your materialized view per
row in your big update, which could be slow.

In this case you might want to update the cache table in one statement
rather than issuing an awful lot of small updates.

So you have two solutions:

1- Junk it all and rebuild it from scratch (this can be faster than it
seems)
2- Put the rows to be added in a temporary table, update the cache table
using the difference between this temporary table and your big table,
then insert the rows into the big table.

The second is the fastest solution, but it requires a bit more coding (not
THAT much though); both are sketched below.
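
To make both concrete, here are hedged sketches (table and column names
follow the orders example above, not your actual schema):

-- 1- Junk it all and rebuild from scratch:
BEGIN;
TRUNCATE ordered_products_cache;
INSERT INTO ordered_products_cache (product_id, quantity)
    SELECT product_id, sum(quantity)
      FROM ordered_products
     GROUP BY product_id;
COMMIT;

-- 2- Apply the batch as a delta (assumes every product_id already has
-- a cache row):
CREATE TEMPORARY TABLE incoming (LIKE ordered_products);
-- ... load the rows to be added into incoming here ...
UPDATE ordered_products_cache c
   SET quantity = c.quantity + d.quantity
  FROM (SELECT product_id, sum(quantity) AS quantity
          FROM incoming
         GROUP BY product_id) d
 WHERE c.product_id = d.product_id;
INSERT INTO ordered_products SELECT * FROM incoming;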

***************************************************

As for the structure of your cache table, you want:

Screen 1 -
Stat1 Stat2 Stat3
Value Value Value
Value Value Value

Screen 2 -
Stat3 Stat1 Stat5
Value Value Value
Value Value Value

You have several lines, so what are they? Are they grouped by date? I'll
presume they are.

So your screens basically show a subset of:

SELECT date, name, sum(value) FROM table GROUP BY name, date

This is what you should put in your summary table.
Then index it on (date, name) and build your screens with:

SELECT * FROM summary WHERE (date BETWEEN .. AND ..) AND name IN ('Stat3',
'Stat1', 'Stat5')

That should be pretty easy; you get a list of (name, date, value) rows that
you just have to format accordingly on your screen.
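
For completeness, the summary table could be created like this (a sketch
only: the column types are guesses, and I'm calling your big table
big_table since its real name isn't given):

CREATE TABLE summary (
    date  date,
    name  text,
    value numeric
);

INSERT INTO summary
    SELECT date, name, sum(value)
      FROM big_table
     GROUP BY name, date;

CREATE INDEX summary_date_name ON summary (date, name);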
