
Slow query on CLUSTER-ed tables

From: Laszlo Nagy <gandalf(at)shopzeus(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Daniel Fekete <dani(at)shopzeus(dot)com>
Subject: Slow query on CLUSTER-ed tables
Date: 2011-03-23 20:29:16
Message-ID: 4D8A581C.6040505@shopzeus.com
Lists: pgsql-performance
Given two tables:

CREATE TABLE product_price_history
(
   hid bigint NOT NULL,
   hdate timestamp without time zone NOT NULL,
   id bigint NOT NULL,
   product_id bigint NOT NULL,
.... more columns here
   CONSTRAINT pk_product_price_history PRIMARY KEY (hid)
);

CREATE INDEX idx_product_price_history_id_hdate
   ON product_price_history
   USING btree
   (id, hdate);


CREATE TABLE product_price_offer_history
(
   hid bigint NOT NULL,
   product_price_id bigint NOT NULL,
   isfeatured smallint NOT NULL,
   price double precision NOT NULL,
   shipping double precision NOT NULL,
     .... some more columns here
   CONSTRAINT pk_product_price_offer_history PRIMARY KEY (hid, offerno)
);

Stats:

product_price_history - table size = 23 GB, index size = 4 GB, row count = 87 million
product_price_offer_history - table size = 24 GB, index size = 7 GB, row count = 235 million


These tables store historical data for a few million products from the last year.
The following commands are executed on them daily:

CLUSTER idx_product_price_history_id_hdate on product_price_history;
CLUSTER pk_product_price_offer_history on product_price_offer_history;
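One way to verify that the daily CLUSTER is actually keeping the heap in index order is to look at the correlation statistic in pg_stats after a fresh ANALYZE. This is just a diagnostic sketch; a value close to 1.0 (or -1.0) for the leading index column means the physical row order still matches the column order:

```sql
-- Correlation between physical row order and logical column order.
-- Close to +/-1.0 means the heap is still sorted on that column;
-- values near 0 mean the CLUSTER order has decayed.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename IN ('product_price_history', 'product_price_offer_history')
  AND attname IN ('id', 'hdate', 'hid');
```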

Here is a query:

select
   date_part('epoch', min(pph.hdate) )  as hdate_ticks,
   min(ppoh.price+ppoh.shipping) as price_plus_shipping
from
   product_price_history pph
   inner join product_price_offer_history ppoh on ppoh.hid = pph.hid
where pph.id = 37632081
  and ppoh.isfeatured=1
group by ppoh.merchantid,pph.hid,pph.hdate
order by pph.hid asc
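To see where the time actually goes, and how many pages come from disk versus from shared buffers, the same query can be wrapped in EXPLAIN (ANALYZE, BUFFERS) (available in PostgreSQL 9.0 and later); the "Buffers: shared hit=... read=..." lines per plan node show the real I/O:

```sql
EXPLAIN (ANALYZE, BUFFERS)
select
   date_part('epoch', min(pph.hdate) )  as hdate_ticks,
   min(ppoh.price+ppoh.shipping) as price_plus_shipping
from
   product_price_history pph
   inner join product_price_offer_history ppoh on ppoh.hid = pph.hid
where pph.id = 37632081
  and ppoh.isfeatured=1
group by ppoh.merchantid,pph.hid,pph.hdate
order by pph.hid asc;
```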


I think that the query plan is correct:


"GroupAggregate  (cost=5553554.25..5644888.17 rows=2283348 width=50)"
"  ->  Sort  (cost=5553554.25..5559262.62 rows=2283348 width=50)"
"        Sort Key: pph.hid, ppoh.merchantid, pph.hdate"
"        ->  Nested Loop  (cost=0.00..5312401.66 rows=2283348 width=50)"
"              ->  Index Scan using idx_product_price_history_id_hdate on product_price_history pph  (cost=0.00..8279.80 rows=4588 width=16)"
"                    Index Cond: (id = 37632081)"
"              ->  Index Scan using pk_product_price_offer_history on product_price_offer_history ppoh  (cost=0.00..1149.86 rows=498 width=42)"
"                    Index Cond: (ppoh.hid = pph.hid)"
"                    Filter: (ppoh.isfeatured = 1)"

So it uses two index scans on the very indexes we CLUSTER the tables on.
The number of rows returned is usually between 100 and 20,000.


Here is the problem. When I first run this query for a given identifier,
it takes 100 seconds. When I run it again for the same identifier, it
returns the same rows within one second!

The indexes are in very good shape: out of the 235 million rows, any
given id occurs at most 20,000 times. It is a B-tree index, so the entries
are stored in sorted order, and the 20,000 rows to be returned should fit
into a few database pages. Even if they are not in the cache, PostgreSQL
should be able to read the required pages within a second.

I understand that for an index scan, PostgreSQL also needs to read the
rows from the table. But since these tables are CLUSTER-ed on those
specific indexes, all the required data should fit on a few database
pages and PostgreSQL should be able to read them within a second.

Then why does the query take 100 seconds the first time, and only one
second the second time? My reasoning may be wrong, but I suppose it means
that the data is actually spread across thousands of pages on disk.
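A rough back-of-envelope check (the per-row size is an assumption): at ~100 bytes per row, 20,000 rows packed contiguously are only ~250 pages of 8 KB, which a disk reads in well under a second; but if each row sits on a different page, 20,000 random reads at ~5-10 ms each is 100-200 seconds, which matches the observed first run. The average row density per heap page can be read from pg_class:

```sql
-- Rough row density: average tuples per 8 KB heap page,
-- based on the planner statistics in pg_class.
SELECT relname,
       relpages,
       reltuples,
       reltuples / NULLIF(relpages, 0) AS rows_per_page
FROM pg_class
WHERE relname IN ('product_price_history', 'product_price_offer_history');
```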

How is that possible? What am I doing wrong?

Thanks,

    Laszlo




