Re: Weird query plans for my queries, causing terrible performance.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Arjen van der Meijden <acm(at)tweakers(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Weird query plans for my queries, causing terrible performance.
Date: 2003-01-31 01:34:45
Message-ID: 4951.1043976885@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Arjen van der Meijden <acm(at)tweakers(dot)net> writes:
> Of coarse the query needs to be joined with the F_Users table.
> My tries were:
> ...
> SELECT F_Topics.TopicID,F_Topics.StatusID as StatusID, F_Users.UserID
> FROM F_Topics, F_Users
> WHERE (
> (StatusID IN(1) AND F_Topics.Deleted = false AND ForumID = 15)
> OR
> (F_Topics.Lastmessage > '2002-08-01 23:27:03+02' AND F_Topics.Deleted =
> false AND ForumID = 15))
> AND F_Topics.UserID = F_Users.UserID

> (and a inner join'ed version of the above)

You sure you tried the inner-join case? I did

explain SELECT F_Topics.TopicID,F_Topics.StatusID as StatusID, F_Users.UserID
FROM F_Topics JOIN F_Users USING (userid)
WHERE
(StatusID IN(1) AND F_Topics.Deleted = false AND ForumID = 15)
OR
(F_Topics.Lastmessage > '2002-08-01 23:27:03+02' AND F_Topics.Deleted =
false AND ForumID = 15)

and got the plan you wanted:

Nested Loop (cost=0.00..14.51 rows=1 width=16)
-> Index Scan using f_topics_forum_status_deleted, f_topics_forum_lastmessage_deleted on f_topics (cost=0.00..9.67 rows=1 width=12)
Index Cond: (((forumid = 15) AND (statusid = 1) AND (deleted = false)) OR ((forumid = 15) AND (lastmessage > '2002-08-01 17:27:03-04'::timestamp with time zone) AND (deleted = false)))
Filter: (((statusid = 1) AND (deleted = false) AND (forumid = 15)) OR ((lastmessage > '2002-08-01 17:27:03-04'::timestamp with time zone) AND (deleted = false) AND (forumid = 15)))
-> Index Scan using f_users_pkey on f_users (cost=0.00..4.82 rows=1 width=4)
Index Cond: ("outer".userid = f_users.userid)

Of course the cost estimates are bogus because I have no data in the
test tables, but the thing is capable of producing the plan you want.

I believe the reason the query you give above doesn't work like you want
is that the planner first converts the whole WHERE clause to CNF form
(canonical AND-of-OR layout), and then is unable to separate out the
join clause from the spaghetti-like restriction clause. The
CNF-conversion heuristic is usually a good one, but not in this case.

Writing the join clause as a JOIN clause keeps it separate from WHERE,
preventing this mistake from being made. Then the WHERE clause is
already in DNF (canonical OR-of-ANDs) form, which the planner also
likes, so it doesn't rearrange it.

It might be that we could improve the qual-rearrangement heuristics
by trying to keep join clauses separate from single-relation
restrictions. Anyone care to take a look at it? The gold is all hidden
in src/backend/optimizer/prep/prepqual.c ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jean-Christian Imbeault 2003-01-31 03:08:24 Basic SQL join question
Previous Message Joep DeVocht 2003-01-31 01:29:11 Clearing tables questions