Re: Unnecessary repeat condition for a self inner join

From: "Robins Tharakan" <tharakan(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Unnecessary repeat condition for a self inner join
Date: 2008-07-12 08:05:02
Message-ID: 36af4bed0807120105l5545d6e3s27428f1f2841c0f6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> While the optimizer theoretically could deduce the extra restriction
> condition, it doesn't attempt to. It's extremely unclear that the extra
> cycles to look for such cases would be repaid on average, because cases
> like this aren't that common. The current state of affairs is that
> the system will deduce implied equality conditions, but not implied
> inequality conditions.

One good thing is that the equality conditions are taken care of. But I fail
to understand why do you believe that the second case is rare. I think the
optimizer would (in all self-join inequality conditions) tend towards a
table scan, which for a large table is a disaster. (Of course, the index
scan would help only if the result-set is small)

Besides, I did a simple test and although you are right about the optimizer
deducing implied equality conditions, this holds true only for a direct
join. In the second query, the optimizer recommends a table scan even for a
simple IN() condition.

Is that normal ?

Regards,
*Robins Tharakan*

Query 1:

SELECT n1.scheme_code
FROM nav n1
INNER JOIN nav n2 ON n1.scheme_code = n2.scheme_code
WHERE n1.scheme_code = 290

"Nested Loop (cost=0.00..16147232.47 rows=4796100 width=4)"
" -> Index Scan using nav__schemecode_date_lookup3b on nav n1
(cost=0.00..7347.91 rows=2190 width=4)"
" Index Cond: (scheme_code = 290)"
" -> Index Scan using nav__schemecode_date_lookup3b on nav n2
(cost=0.00..7347.91 rows=2190 width=4)"
" Index Cond: (290 = scheme_code)"

Query 2:

SELECT n1.scheme_code
FROM nav n1
INNER JOIN nav n2 ON n1.scheme_code = n2.scheme_code
WHERE n1.scheme_code IN (1, 2)

"Hash Join (cost=206004.00..431864.83 rows=10720451 width=4)"
" Hash Cond: (n1.scheme_code = n2.scheme_code)"
" -> Bitmap Heap Scan on nav n1 (cost=139.62..13663.13 rows=4378
width=4)"
" Recheck Cond: (scheme_code = ANY ('{1,2}'::integer[]))"
" -> Bitmap Index Scan on nav__schemecode_date_lookup3b
(cost=0.00..138.53 rows=4378 width=0)"
" Index Cond: (scheme_code = ANY ('{1,2}'::integer[]))"
" -> Hash (cost=112078.06..112078.06 rows=5395306 width=4)"
" -> Seq Scan on nav n2 (cost=0.00..112078.06 rows=5395306
width=4)"

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Dave Page 2008-07-12 08:40:37 Re: Rollback in Postgres
Previous Message Simon Riggs 2008-07-12 07:56:17 Re: Rollback in Postgres