Re: BUG #1878: Different execution plans for the same query.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexei Novakov <alexei-novakov(at)rogers(dot)com>
Cc: pgsql-bugs(at)postgreSQL(dot)org
Subject: Re: BUG #1878: Different execution plans for the same query.
Date: 2005-09-23 22:56:04
Message-ID: 7272.1127516164@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Alexei Novakov <alexei-novakov(at)rogers(dot)com> writes:
> Attaching the dumps of the tables. And here is the
> query:

> select T1.MAIN_ID, T6.LEVEL_IND, T12.MAIN_ID
> from TMP_1 T1, TMP_2 T2, TMP_2 T3, TMP_2 T4, TMP_2 T5,
> TMP_3 T6,
> TMP_2 T7, TMP_2 T8, TMP_2 T9, TMP_2 T10, TMP_2
> T11, TMP_1 T12
> where T2.ID_2 = T1.MAIN_ID
> and T2.ID_TO_ID_TYPE = 201269
> and 990765472621084681 = T2.ID_1
> and T4.ID_1 = T12.MAIN_ID
> and T4.ID_TO_ID_TYPE = 201143
> and T3.ID_TO_ID_TYPE = 201143
> and 990765472621084681 = T3.ID_1
> and T4.ID_2 = T3.ID_2
> and T7.ID_1 = T12.MAIN_ID
> and T7.ID_TO_ID_TYPE = 201224
> and T6.ID_TO_ID_TYPE = 201209
> and T6.LEVEL_IND != 0
> and T5.ID_TO_ID_TYPE = 201224
> and 990765472621084681 = T5.ID_1
> and T6.ID_1 = T5.ID_2
> and T7.ID_2 = T6.ID_2
> and T9.ID_1 = T12.MAIN_ID
> and T9.ID_TO_ID_TYPE = 201125
> and T8.ID_TO_ID_TYPE = 201125
> and 990765472621084681 = T8.ID_1
> and T9.ID_2 = T8.ID_2
> and T11.ID_1 = T12.MAIN_ID
> and T11.ID_TO_ID_TYPE = 201071
> and T10.ID_TO_ID_TYPE = 201071
> and 990765472621084681 = T10.ID_1
> and T11.ID_2 = T10.ID_2
> and T12.ID_TYPE = 301286

I looked into this and found that the problem is bad rowcount estimates,
which lead the planner to make plans that are good only by chance.
Since you have enough tables in the query to exceed geqo_threshold, the
random aspect of the behavior isn't surprising --- but increasing
geqo_threshold isn't likely to solve your problem, it'll probably just
produce repeatable not-very-good behavior :-(

The reason why the rowcount estimates are so far off is that there's a
lot of correlation that the planner doesn't recognize. For instance
you've got

and T4.ID_TO_ID_TYPE = 201143
and T3.ID_TO_ID_TYPE = 201143
and T4.ID_2 = T3.ID_2

where T3 and T4 are actually the same table and so the number of rows
matching the above conditions is much higher than a naive guess would
suggest. It's possible that the planner could do better if it had
cross-column correlation statistics, but it doesn't.

It's hard for me to tell whether any of these conditions are redundant
given your schema, but if they are I'd recommend removing the redundant
conditions --- they aren't doing much for you except fooling the
planner.

regards, tom lane

Browse pgsql-bugs by date

  From Date Subject
Next Message Vaccaro 2005-09-24 15:11:25 HELP
Previous Message Mark Stosberg 2005-09-23 21:42:02 Re: suggestion: fix 'now' -> CURRENT_TIMESTAMP