inheritance, and plans

From: Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: inheritance, and plans
Date: 2009-02-06 18:00:30
Message-ID: 2f4958ff0902061000h6551de8an83c39f2de1fffb7a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hey folks,

I have few tables, that inherit from table X.
The query I perform, tries to obtain information about changes in all
tables that inherit from X,
aside from that, I have table Y that keeps another information related
to changes, but in bit different schema.
Anyway, there is one unique id field, shared amongst them.

When I want to obtain all that information, I do:

select updateid from (
select updateid from r.skel
union all
select updateid from r.history
) as foo
where updateid > 1232634919168805;

And what amazes me, is that no matter what value I choose in where X >
, postgres will always think this is the best plan:

QUERY PLAN
-------------------------------------------------------------------------------------------------------
Subquery Scan foo (cost=0.00..167736.75 rows=978726 width=8)
Filter: (foo.updateid > 1232634919168805::bigint)
-> Append (cost=0.00..131034.54 rows=2936177 width=8)
-> Subquery Scan "*SELECT* 1" (cost=0.00..130999.94
rows=2934947 width=8)
-> Result (cost=0.00..101650.47 rows=2934947 width=8)
-> Append (cost=0.00..101650.47 rows=2934947 width=8)
-> Seq Scan on skel (cost=0.00..24.80
rows=1480 width=8)
-> Seq Scan on a skel
(cost=0.00..22028.96 rows=923596 width=8)
-> Seq Scan on b skel (cost=0.00..8.01
rows=201 width=8)
-> Seq Scan on c skel (cost=0.00..1.81
rows=81 width=8)
-> Seq Scan on d skel
(cost=0.00..22117.94 rows=923594 width=8)
-> Seq Scan on e skel (cost=0.00..6.03
rows=303 width=8)
-> Seq Scan on f skel (cost=0.00..6.02
rows=202 width=8)
-> Seq Scan on g skel (cost=0.00..1987.40
rows=85140 width=8)
-> Seq Scan on h skel (cost=0.00..1.01
rows=1 width=8)
-> Seq Scan on i skel
(cost=0.00..55454.99 rows=999999 width=8)
-> Seq Scan on j skel (cost=0.00..13.50
rows=350 width=8)
-> Seq Scan on history (cost=0.00..22.30 rows=1230 width=8)
(18 rows)

so my question is, why isn't postgres use index on some tables , and
search for the X > N individually ?

Because, yet - I tried to recreate problem, but I wasn't able. I have
this test db:

create schema r;

create sequence fooseq;
create domain r.fooint AS bigint NOT NULL default nextval('fooseq');

create table skel(aid r.fooint, cd timestamp default now() not null);

create table one( a bigserial, aid r.fooint, cd timestamp not null);
create table two( a bigserial, aid r.fooint, cd timestamp not null);
create table three( a bigserial, aid r.fooint, cd timestamp not null);
create table four( a bigserial, aid r.fooint, cd timestamp not null);
create table five( a bigserial, aid r.fooint, cd timestamp not null);

create unique index one_aid on one(aid);
create unique index two_aid on two(aid);
create unique index three_aid on three(aid);
create unique index four_aid on four(aid);
create unique index five_aid on five(aid);

create table numbers( something int default random()*666, aid_foo r.fooint);

create unique index numbers_aid on numbers(aid_foo);

insert into one(a, cd) select generate_series(1,2000000), now();
insert into two(a, cd) select generate_series(1,200000), now();
insert into three(a, cd) select generate_series(1,2200000), now();
insert into four(a, cd) select generate_series(1,2200000), now();
insert into five(a, cd) select generate_series(1,2200000), now();

insert into numbers(something) select generate_series(1,870000);

alter table one inherit skel;
alter table two inherit skel;
alter table three inherit skel;
alter table four inherit skel;
alter table five inherit skel;

But no matter how many tables I throw in ( and I got to 20 ) - it will
always do it right:
gjaskie=# explain select aid from (select aid from skel union all
select aid_foo as aid from numbers) AS foo where aid > 999000;
QUERY PLAN
---------------------------------------------------------------------------------------------
Result (cost=0.00..178034.88 rows=8661268 width=8)
-> Append (cost=0.00..178034.88 rows=8661268 width=8)
-> Seq Scan on skel (cost=0.00..32.12 rows=590 width=8)
Filter: ((aid)::bigint > 999000)
-> Index Scan using one_aid on one skel
(cost=0.00..34549.76 rows=991445 width=8)
Index Cond: ((aid)::bigint > 999000)
-> Seq Scan on two skel (cost=0.00..3774.00 rows=199980 width=8)
Filter: ((aid)::bigint > 999000)
-> Seq Scan on three skel (cost=0.00..41513.00 rows=2199780 width=8)
Filter: ((aid)::bigint > 999000)
-> Seq Scan on four skel (cost=0.00..41513.00 rows=2199780 width=8)
Filter: ((aid)::bigint > 999000)
-> Seq Scan on five skel (cost=0.00..41513.00 rows=2199780 width=8)
Filter: ((aid)::bigint > 999000)
-> Seq Scan on numbers (cost=0.00..15140.00 rows=869913 width=8)
Filter: ((aid_foo)::bigint > 999000)
(16 rows)

Time: 36.326 ms

But, if I add another union, it screws it up:

gjaskie=# explain select aid from (select aid from skel union all
select aid_foo as aid from numbers union all select 1 aid) AS foo
where aid > 999000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Subquery Scan foo (cost=0.00..374659.56 rows=3223924 width=8)
Filter: (foo.aid > 999000)
-> Append (cost=0.00..253762.42 rows=9671771 width=8)
-> Result (cost=0.00..253762.40 rows=9671770 width=8)
-> Append (cost=0.00..253762.40 rows=9671770 width=8)
-> Result (cost=0.00..144079.70 rows=8801770 width=8)
-> Append (cost=0.00..144079.70
rows=8801770 width=8)
-> Seq Scan on skel
(cost=0.00..27.70 rows=1770 width=8)
-> Seq Scan on one skel
(cost=0.00..32739.00 rows=2000000 width=8)
-> Seq Scan on two skel
(cost=0.00..3274.00 rows=200000 width=8)
-> Seq Scan on three skel
(cost=0.00..36013.00 rows=2200000 width=8)
-> Seq Scan on four skel
(cost=0.00..36013.00 rows=2200000 width=8)
-> Seq Scan on five skel
(cost=0.00..36013.00 rows=2200000 width=8)
-> Seq Scan on numbers (cost=0.00..12965.00
rows=870000 width=8)
-> Subquery Scan "*SELECT* 3" (cost=0.00..0.02 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
(16 rows)

Time: 1.502 ms

now the question is, how my test db's query:
select aid from (select aid from skel union all select aid_foo as
aid from numbers union all select 1 aid) AS foo where aid > 999000;

differ from original:
select updateid from ( select updateid from r.skel union all select
updateid from r.history ) as foo where updateid > 1232634919168805;

Oh, and the value N doesn't change the plan here either :/

tested on both 8.3 and 8.4, same results..
ideas welcomed

--
GJ

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Arjen van der Meijden 2009-02-06 18:09:40 Re: suggestions for postgresql setup on Dell 2950 , PERC6i controller
Previous Message Scott Carey 2009-02-06 17:57:41 Re: suggestions for postgresql setup on Dell 2950 , PERC6i controller