Given two tables:
CREATE TABLE product_price_history (
    hid bigint NOT NULL,
    hdate timestamp without time zone NOT NULL,
    id bigint NOT NULL,
    product_id bigint NOT NULL,
    .... more columns here
    CONSTRAINT pk_product_price_history PRIMARY KEY (hid)
);

CREATE INDEX idx_product_price_history_id_hdate
    ON product_price_history (id, hdate);
CREATE TABLE product_price_offer_history (
    hid bigint NOT NULL,
    product_price_id bigint NOT NULL,
    isfeatured smallint NOT NULL,
    price double precision NOT NULL,
    shipping double precision NOT NULL,
    .... some more columns here
    CONSTRAINT pk_product_price_offer_history PRIMARY KEY (hid, offerno)
);
product_price_history - table size = 23 GB, indexes size = 4 GB, row count = 87 million
product_price_offer_history - table size = 24 GB, indexes size = 7 GB, row count = 235 million
These tables store historical data for some millions of products.
The following commands are executed on them daily:
CLUSTER idx_product_price_history_id_hdate on product_price_history;
CLUSTER pk_product_price_offer_history on product_price_offer_history;
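CLUSTER physically reorders the table once, but rows written after that are
not kept in index order, so the clustering degrades between runs. One way to
check how clustered the tables currently are (a sketch, using the table and
column names from the definitions above) is the correlation statistic that
ANALYZE collects in the pg_stats view:

```sql
-- correlation close to 1.0 (or -1.0) means the physical row order still
-- matches the sort order of that column; values near 0 mean the rows
-- for consecutive key values are scattered across the table.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename IN ('product_price_history', 'product_price_offer_history')
  AND attname IN ('id', 'hid');
```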
Here is the query:

select
    date_part('epoch', min(pph.hdate)) as hdate_ticks,
    min(ppoh.price + ppoh.shipping) as price_plus_shipping
from product_price_history pph
inner join product_price_offer_history ppoh on ppoh.hid = pph.hid
where pph.id = 37632081
  and ppoh.isfeatured = 1
group by ppoh.merchantid, pph.hid, pph.hdate
order by pph.hid asc;
I think that the query plan is correct:
"GroupAggregate (cost=5553554.25..5644888.17 rows=2283348 width=50)"
" -> Sort (cost=5553554.25..5559262.62 rows=2283348 width=50)"
" Sort Key: pph.hid, ppoh.merchantid, pph.hdate"
" -> Nested Loop (cost=0.00..5312401.66 rows=2283348 width=50)"
" -> Index Scan using idx_product_price_history_id_hdate
on product_price_history pph (cost=0.00..8279.80 rows=4588 width=16)"
" Index Cond: (id = 37632081)"
" -> Index Scan using pk_product_price_offer_history on
product_price_offer_history ppoh (cost=0.00..1149.86 rows=498 width=42)"
" Index Cond: (ppoh.hid = pph.hid)"
" Filter: (ppoh.isfeatured = 1)"
So it uses two index scans on the indexes we CLUSTER the tables on.
Number of rows returned is usually between 100 and 20 000.
Here is the problem. When I first run this query for a given
identifier, it takes 100 seconds. When I run it again for the same
identifier, it returns the same rows within one second!
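One way to see whether the first run is really reading those pages from disk
is EXPLAIN (ANALYZE, BUFFERS), available since PostgreSQL 9.0. A sketch,
prepending it to the query above:

```sql
-- In the output, "shared hit" counts pages found in PostgreSQL's buffer
-- cache and "read" counts pages fetched from the OS or disk. A large
-- "read" count on the first run and mostly "hit" on the second run
-- would confirm that the 100x difference is a caching effect.
EXPLAIN (ANALYZE, BUFFERS)
SELECT date_part('epoch', min(pph.hdate)) AS hdate_ticks,
       min(ppoh.price + ppoh.shipping) AS price_plus_shipping
FROM product_price_history pph
INNER JOIN product_price_offer_history ppoh ON ppoh.hid = pph.hid
WHERE pph.id = 37632081
  AND ppoh.isfeatured = 1
GROUP BY ppoh.merchantid, pph.hid, pph.hdate
ORDER BY pph.hid ASC;
```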
The indexes are very selective: of the 235 million rows, any given id
occurs at most 20,000 times. It is a btree index, so the entries are
already stored in sorted order, and the 20,000 rows to be returned
should fit into a few database pages. Even if they are not in the
cache, PostgreSQL should be able to read the required pages within a
second.
I understand that for an index scan, PostgreSQL also needs to read the
rows from the table. But since these tables are CLUSTERed on those
specific indexes, all the data needed should fit on a few database
pages, and PostgreSQL should be able to read them within a second.
Then why does the query take 100 seconds the first time, and only one
second the second time? My reasoning is probably wrong somewhere, and I
suppose it means that the data is in fact spread across thousands of
pages on the disk. How is that possible? What am I doing wrong?
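That supposition can be checked with some back-of-envelope arithmetic. The
numbers below are assumptions for illustration: roughly 50 bytes per row,
8 KB database pages, and about 10 ms per random seek on a spinning disk:

```sql
-- If the 20,000 rows are packed contiguously, they occupy ~123 pages
-- and cost ~1 second of random I/O; if every row sits on its own page,
-- the same query needs ~20,000 seeks, i.e. on the order of 200 seconds.
SELECT ceil(20000 * 50 / 8192.0)         AS pages_if_clustered,
       ceil(20000 * 50 / 8192.0) * 0.010 AS seconds_if_clustered,
       20000 * 0.010                     AS seconds_if_scattered;
```

The scattered case lands in the same order of magnitude as the observed 100
seconds, while the clustered case matches the one-second cached run, which
is consistent with the rows no longer being physically contiguous.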
pgsql-performance by date