Materialized views in PostgreSQL use the rule system like views do, but persist the results in a table-like form. The main differences between:
CREATE MATERIALIZED VIEW mymatview AS SELECT * FROM mytab;
CREATE TABLE mymatview AS SELECT * FROM mytab;
are that the materialized view cannot subsequently be directly updated and that the query used to create the materialized view is stored in exactly the same way that a view's query is stored, so that fresh data can be generated for the materialized view with:
REFRESH MATERIALIZED VIEW mymatview;
The information about a materialized view in the PostgreSQL system catalogs is exactly the same as it is for a table or view. So for the parser, a materialized view is a relation, just like a table or a view. When a materialized view is referenced in a query, the data is returned directly from the materialized view, like from a table; the rule is only used for populating the materialized view.
While access to the data stored in a materialized view is often much faster than accessing the underlying tables directly or through a view, the data is not always current; yet sometimes current data is not needed. Consider a table which records sales:
CREATE TABLE invoice ( invoice_no integer PRIMARY KEY, seller_no integer, -- ID of salesperson invoice_date date, -- date of sale invoice_amt numeric(13,2) -- amount of sale );
If people want to be able to quickly graph historical sales data, they might want to summarize, and they may not care about the incomplete data for the current date:
CREATE MATERIALIZED VIEW sales_summary AS SELECT seller_no, invoice_date, sum(invoice_amt)::numeric(13,2) as sales_amt FROM invoice WHERE invoice_date < CURRENT_DATE GROUP BY seller_no, invoice_date ORDER BY seller_no, invoice_date; CREATE UNIQUE INDEX sales_summary_seller ON sales_summary (seller_no, invoice_date);
This materialized view might be useful for displaying a graph in the dashboard created for salespeople. A job could be scheduled to update the statistics each night using this SQL statement:
REFRESH MATERIALIZED VIEW sales_summary;
Another use for a materialized view is to allow faster access to data brought across from a remote system, through a foreign data wrapper. A simple example using file_fdw is below, with timings, but since this is using cache on the local system the performance difference on a foreign data wrapper to a remote system could be greater. Setup:
CREATE EXTENSION file_fdw; CREATE SERVER local_file FOREIGN DATA WRAPPER file_fdw; CREATE FOREIGN TABLE words (word text NOT NULL) SERVER local_file OPTIONS (filename '/etc/dictionaries-common/words'); CREATE MATERIALIZED VIEW wrd AS SELECT * FROM words; CREATE UNIQUE INDEX wrd_word ON wrd (word); CREATE EXTENSION pg_trgm; CREATE INDEX wrd_trgm ON wrd USING gist (word gist_trgm_ops); VACUUM ANALYZE wrd;
Now let's spell-check a word. Using file_fdw directly:
SELECT count(*) FROM words WHERE word = 'caterpiler'; count ------- 0 (1 row)
The plan is:
Aggregate (cost=4125.19..4125.20 rows=1 width=0) (actual time=26.013..26.014 rows=1 loops=1) -> Foreign Scan on words (cost=0.00..4124.70 rows=196 width=0) (actual time=26.011..26.011 rows=0 loops=1) Filter: (word = 'caterpiler'::text) Rows Removed by Filter: 99171 Foreign File: /etc/dictionaries-common/words Foreign File Size: 938848 Total runtime: 26.081 ms
If the materialized view is used instead, the query is much faster:
Aggregate (cost=4.44..4.45 rows=1 width=0) (actual time=0.074..0.074 rows=1 loops=1) -> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.071..0.071 rows=0 loops=1) Index Cond: (word = 'caterpiler'::text) Heap Fetches: 0 Total runtime: 0.119 ms
Either way, the word is spelled wrong, so let's look for what we might have wanted. Again using file_fdw:
SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10; word --------------- cater caterpillar Caterpillar caterpillars caterpillar's Caterpillar's caterer caterer's caters catered (10 rows)
Limit (cost=2195.70..2195.72 rows=10 width=32) (actual time=218.904..218.906 rows=10 loops=1) -> Sort (cost=2195.70..2237.61 rows=16765 width=32) (actual time=218.902..218.904 rows=10 loops=1) Sort Key: ((word <-> 'caterpiler'::text)) Sort Method: top-N heapsort Memory: 25kB -> Foreign Scan on words (cost=0.00..1833.41 rows=16765 width=32) (actual time=0.046..200.965 rows=99171 loops=1) Foreign File: /etc/dictionaries-common/words Foreign File Size: 938848 Total runtime: 218.966 ms
Using the materialized view:
Limit (cost=0.28..1.02 rows=10 width=9) (actual time=24.916..25.079 rows=10 loops=1) -> Index Scan using wrd_trgm on wrd (cost=0.28..7383.70 rows=99171 width=9) (actual time=24.914..25.076 rows=10 loops=1) Order By: (word <-> 'caterpiler'::text) Total runtime: 25.884 ms
If you can tolerate periodic update of the remote data to the local database, the performance benefit can be substantial.
Please use this form to add your own comments regarding your experience with particular features of PostgreSQL, clarifications of the documentation, or hints for other users. Please note, this is not a support forum, and your IP address will be logged. If you have a question or need help, please see the faq, try a mailing list, or join us on IRC. Note that submissions containing URLs or other keywords commonly found in 'spam' comments may be silently discarded. Please contact the webmaster if you think this is happening to you in error.
Proceed to the comment form.