Re: IN vs EXISTS equivalence

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IN vs EXISTS equivalence
Date: 2008-09-03 21:27:29
Message-ID: 48BEBAF1.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> If you're still interested in testing CVS HEAD's handling of EXISTS,
> I've about finished what I wanted to do with it.

It's been hectic here, but I've managed to let some stuff run in the
background using an old test case from here:

http://archives.postgresql.org/pgsql-hackers/2007-03/msg01408.php

explain analyze
SELECT "A"."adjustmentNo", "A"."tranNo", "A"."countyNo", "H"."date",
"H"."userId", "H"."time"
FROM "Adjustment" "A"
JOIN "TranHeader" "H" ON ("H"."tranId" = "A"."adjustmentNo" AND
"H"."countyNo" = "A"."countyNo" AND "H"."tranNo" = "A"."tranNo")
WHERE "H"."tranType" = 'A'
AND "A"."date" > DATE '2006-01-01'
AND "H"."countyNo" = 66
AND "A"."countyNo" = 66
AND EXISTS
(
SELECT 1 FROM "TranDetail" "D"
WHERE "D"."tranNo" = "H"."tranNo"
AND "D"."countyNo" = "H"."countyNo"
AND "D"."caseNo" LIKE '2006TR%'
)
;

On development machine using 8.3.3:

Nested Loop (cost=0.00..399190.49 rows=1 width=37) (actual
time=7184.068..3249391.592 rows=12372 loops=1)
Join Filter: (("A"."adjustmentNo")::text = ("H"."tranId")::text)
-> Seq Scan on "Adjustment" "A" (cost=0.00..5218.87 rows=247869
width=17) (actual time=9.804..1695.691 rows=248674 loops=1)
Filter: (((date)::date > '2006-01-01'::date) AND
(("countyNo")::smallint = 66))
-> Index Scan using "TranHeader_pkey" on "TranHeader" "H"
(cost=0.00..1.57 rows=1 width=37) (actual time=13.056..13.056 rows=0
loops=248674)
Index Cond: ((("H"."tranNo")::integer =
("A"."tranNo")::integer) AND (("H"."countyNo")::smallint = 66))
Filter: ((("H"."tranType")::text = 'A'::text) AND (subplan))
SubPlan
-> Index Scan using "TranDetail_TranDetCaseNo" on
"TranDetail" "D" (cost=0.00..1.29 rows=1 width=0) (actual
time=13.017..13.017 rows=0 loops=248674)
Index Cond: ((("caseNo")::text >= '2006TR'::text) AND
(("caseNo")::text < '2006TS'::text) AND (("tranNo")::integer =
($0)::integer) AND (("countyNo")::smallint = ($1)::smallint))
Filter: (("caseNo")::text ~~ '2006TR%'::text)
Total runtime: 3249404.662 ms

On the same machine, using the snapshot from this morning:

Nested Loop (cost=1963.24..38483.54 rows=1 width=37) (actual
time=372.964..986.994 rows=12372 loops=1)
Join Filter: (("H"."tranNo")::integer = ("A"."tranNo")::integer)
-> Merge Semi Join (cost=1963.24..31012.28 rows=21317 width=37)
(actual time=372.926..839.298 rows=12372 loops=1)
Merge Cond: (("H"."tranNo")::integer =
("D"."tranNo")::integer)
Join Filter: (("D"."countyNo")::smallint =
("H"."countyNo")::smallint)
-> Index Scan using "TranHeader_pkey" on "TranHeader" "H"
(cost=0.00..27848.57 rows=322517 width=37) (actual time=3.722..526.124
rows=311963 loops=1
)
Index Cond: (("countyNo")::smallint = 66)
Filter: (("tranType")::text = 'A'::text)
-> Sort (cost=1963.17..2027.08 rows=25565 width=6) (actual
time=171.512..191.688 rows=76597 loops=1)
Sort Key: "D"."tranNo"
Sort Method: quicksort Memory: 6663kB
-> Index Scan using "TranDetail_TranDetCaseNo" on
"TranDetail" "D" (cost=0.00..91.57 rows=25565 width=6) (actual
time=0.031..100.688 rows=7659
7 loops=1)
Index Cond: ((("caseNo")::text >= '2006TR'::text)
AND (("caseNo")::text < '2006TS'::text) AND (("countyNo")::smallint =
66))
Filter: (("caseNo")::text ~~ '2006TR%'::text)
-> Index Scan using "Adjustment_pkey" on "Adjustment" "A"
(cost=0.00..0.34 rows=1 width=17) (actual time=0.009..0.010 rows=1
loops=12372)
Index Cond: ((("A"."adjustmentNo")::text =
("H"."tranId")::text) AND (("A"."countyNo")::smallint = 66))
Filter: (("A".date)::date > '2006-01-01'::date)
Total runtime: 991.097 ms

The chosen plan looks very reasonable, and performs very well. Nice!

After converting the database I originally forgot to run VACUUM
ANALYZE. Even planning "blind" and doing hint-bit rewrites it picked
a plan which ran in under 10 seconds.

I'll be running other tests as I get the chance.

-Kevin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Chernow 2008-09-03 22:10:53 Re: libpq events patch
Previous Message Alvaro Herrera 2008-09-03 20:19:46 Re: [PATCH] Cleanup of GUC units code