nested loops in joins, ambiguous rewrite rules

From: Charles Hornberger <hornberger(at)tabloid(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: nested loops in joins, ambiguous rewrite rules
Date: 1999-01-29 11:31:13
Message-ID: 3.0.5.32.19990129033113.00b2f7c0@tabloid.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I've got a Postgres 6.4 DB that is doing, to my mind, funny things.

The main problem is that joins over multiple tables are taking a long, long
time to execute.

When I do EXPLAINs on join queries, I find that the optimizer is choosing
to do a lot of nested loops and sequential scans, instead of Merge Joins or
something that sounds more efficient.

For instance, here's a EXPLAIN SELECT on a view that joins 7 tables. All
of the join columns are indexed.

NOTICE: QUERY PLAN:

Nested Loop (cost=0.00 size=1 width=140)
-> Nested Loop (cost=0.00 size=1 width=124)
-> Nested Loop (cost=0.00 size=1 width=108)
-> Nested Loop (cost=0.00 size=1 width=92)
-> Nested Loop (cost=0.00 size=1 width=76)
-> Merge Join (cost=0.00 size=1 width=60)
-> Seq Scan (cost=0.00 size=0 width=0)
-> Sort (cost=0.00 size=0 width=0)
-> Seq Scan on article
(cost=0.00 size=0 width=32)
-> Seq Scan (cost=0.00 size=0 width=0)
-> Sort (cost=0.00 size=0 width=0)
-> Seq Scan on article_text
(cost=0.00 size=0 width=28)
-> Seq Scan on article_source (cost=0.00 size=0
width=16)
-> Seq Scan on section (cost=0.00 size=0 width=16)
-> Seq Scan on locale (cost=0.00 size=0 width=16)
-> Seq Scan on volume (cost=0.00 size=0 width=16)
-> Seq Scan on issue (cost=0.00 size=0 width=16)

EXPLAIN

This view was created with:

CREATE VIEW all_articles AS
SELECT article.article_id, article.print_publ_date,
article.print_page_no,
article_text.headline, article_text.subhead,
article_source.source_name,
section.section_name,
locale.locale_name,
volume.volume_name,
issue.issue_name
FROM article,
article_text,
article_source,
section,
locale,
volume,
issue
WHERE article.article_id = article_text.article_id
AND article.article_source_id = article_source.source_id
AND article.section_id = section.section_id
AND article.locale_id = locale.locale_id
AND article.volume_id = volume.volume_id
AND article.issue_id = issue.issue_id ;

It takes 5-7 minutes to perform a query on this view, even though there are
no records in any of these tables.

One more thing I noticed. In trying to figure out what's going wrong, I
dumped the structure of an existing DB using `pg_dump -s`. Then when I
tried to create a new DB from the dump file, I got the following error:

CREATE RULE "_RETall_articles" AS ON SELECT TO "all_articles" DO INSTEAD
SELECT "article_id", "print_publ_date", "print_page_no", "headline",
"subhead", "source_name", "section_name", "locale_name", "volume_name",
"issue_name" FROM "article", "article_text", "article_source", "section",
"locale", "volume", "issue" WHERE ((((("article_id" = "article_id") AND
("article_source_id" = "source_id")) AND ("section_id" = "section_id")) AND
("locale_id" = "locale_id")) AND ("volume_id" = "volume_id")) AND
("issue_id" = "issue_id");
ERROR: Column article_id is ambiguous

It seems that pg_dump isn't handling the SQL correctly.

Thanks in advance for any advice.

Charlie

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dustin Tenney 1999-01-29 16:16:41 Duplicate Data/RAID-1/postgres 6.4.2
Previous Message Marcus Mascari 1999-01-29 11:05:14 Re: [GENERAL] Cascading Updates