Superfluous merge/sort

From: Anuradha Ratnaweera <ARatnaweera(at)virtusa(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Superfluous merge/sort
Date: 2003-02-25 04:11:43
Message-ID: 20030225041143.GA4886@aratnaweera.virtusa.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Question in brief: does the planner/optimizer take into account the
foreign key constraints?

If the answer is "no", please stop reading here.

Here comes the details. Let me give a simple example to illustrate the
situation.

1. We have two tables t1 and t2.

create table t1 (
id integer primary key,
dummy integer
);

create table t2 (
id integer,
dummy integer
);

2. We create indexes on all the non-pkey fields.

create index t1_dummy_idx on t1(dummy);
create index t2_id_idx on t2(id);
create index t2_dummy_idx on t2(dummy);

3. We make t2(id) a foreign key of t1(id).

alter table t2 add constraint t2_fkey foreign key (id) references t1(id);

4. Populate "t1" with unique "id"s from 0 to 19999 with a dummy value.

copy "t1" from stdin;
0 654
1 86097
...
19998 93716
19999 9106
\.

5. Populate "t2" with 50000 "id"s with a normal distribution.

copy "t2" from stdin;
8017 98659
11825 5946
...
8202 35994
8436 19729
\.

Now we are ready to go ...

- First query is to find the "dummy" values with highest frequency.

=> explain select dummy from t2 group by dummy order by count(*) desc limit 10;
NOTICE: QUERY PLAN:

Limit (cost=2303.19..2303.19 rows=10 width=4)
-> Sort (cost=2303.19..2303.19 rows=5000 width=4)
-> Aggregate (cost=0.00..1996.00 rows=5000 width=4)
-> Group (cost=0.00..1871.00 rows=50000 width=4)
-> Index Scan using t2_dummy_idx on t2 (cost=0.00..1746.00 rows=50000 width=4)

EXPLAIN

- Second query is esseitially the same, but we do a merge with "t1" on
the foreign key (just for the sake of illustrating the point).

=> explain select t2.dummy from t1, t2 where t1.id = t2.id group by t2.dummy order by count(*) desc limit 10;
NOTICE: QUERY PLAN:

Limit (cost=7643.60..7643.60 rows=10 width=12)
-> Sort (cost=7643.60..7643.60 rows=5000 width=12)
-> Aggregate (cost=7086.41..7336.41 rows=5000 width=12)
-> Group (cost=7086.41..7211.41 rows=50000 width=12)
-> Sort (cost=7086.41..7086.41 rows=50000 width=12)
-> Merge Join (cost=0.00..3184.00 rows=50000 width=12)
-> Index Scan using t1_pkey on t1 (cost=0.00..638.00 rows=20000 width=4)
-> Index Scan using t2_id_idx on t2 (cost=0.00..1746.00 rows=50000 width=8)

EXPLAIN

Does this mean that the planner/optimizer doesn't take into account the
foreign key constraint?

Anuradha

--

Debian GNU/Linux (kernel 2.4.21-pre4)

Don't worry about the world coming to an end today. It's already tomorrow
in Australia.
-- Charles Schulz

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Halliwell 2003-02-25 08:03:40 Query not using the index
Previous Message Tom Lane 2003-02-24 20:50:35 Re: slow query