Re: Query planner and foreign key constraints

From: Christian Schröder <cs(at)deriva(dot)de>
To: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query planner and foreign key constraints
Date: 2009-01-04 09:44:50
Message-ID: 49608512.9000206@deriva.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Christian Schröder wrote:
> in our PostgreSQL 8.2.9 database I have these tables:
>
> create table table1 (
> key1 char(12),
> key2 integer,
> primary key (key1, key2)
> );
>
> create table table2 (
> key1 char(12),
> key2 integer,
> key3 varchar(20),
> primary key (key1, key2, key3),
> foreign key (key1, key2) references table1 (key1, key2)
> );
>
> Table1 has 896576 rows. Table2 has 1630788 rows. The statistics target
> of the columns key1 and key2 in both tables has been set to 1000. Both
> tables have been analyzed.
> When I join both tables using key1 and key2 there will be exactly
> 1630788 rows because for each row in table2 there *must* exist a row
> in table1. But the query planner doesn't think so:
>
> # explain analyze select * from table1 inner join table2 using (key1,
> key2);
> QUERY
> PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
> Merge Join (cost=0.00..94916.58 rows=39560 width=44) (actual
> time=0.103..7105.960 rows=1630788 loops=1)
> Merge Cond: ((table1.key1 = table2.key1) AND (table1.key2 =
> table2.key2))
> -> Index Scan using table1_pkey on table1 (cost=0.00..22677.65
> rows=896576 width=20) (actual time=0.038..689.059 rows=896576 loops=1)
> -> Index Scan using table2_pkey on table2 (cost=0.00..59213.16
> rows=1630788 width=44) (actual time=0.049..1108.220 rows=1630788 loops=1)
> Total runtime: 7525.492 ms
> (5 rows)
>
> You can also find the query plan at
> http://explain-analyze.info/query_plans/2648-query-plan-1371.
>
> What can I do to make the query planner realize that the join will
> have 1630788 rows? This join is part of a view which I then use in
> other joins and this wrong assumption leads to really bad performance.
I have not yet found any solution. My queries still take several minutes
to complete. :-(
No ideas at all?

Regards,
Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax: +49 551 489500-91
Hans-Böckler-Straße 2 http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Kellerer 2009-01-04 10:33:49 Re: PostgreSQL 8.4 download?
Previous Message David Wilson 2009-01-03 23:26:45 Re: Timestamp shift when importing data