Efficiency of EXISTS?

From: Kenneth Tilton <ktilton(at)mcna(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Efficiency of EXISTS?
Date: 2012-07-23 21:12:39
Message-ID: CAECCA8Zj3d7_tpXGd4N+=9fZxSKO+FrHBbzxhEB8Q=e33G1p9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

My mental model of the EXISTS clause must be off. This snippet appears at
the end of a series of WITH clauses I suspect are irrelevant:

with etc etc ... , cids as
> (select distinct c.id from ddr2 c
> join claim_entries ce on ce.claim_id = c.id
> where (c.assigned_ddr = 879
> or exists (select 1 from ddr_cdt dc
> where
> dc.sys_user_id = 879
> and dc.document_type = c.document_type
> -- makes it faster: and (dc.cdt_code is null or dc.cdt_code = ce.cpt_code)
> )))
>
> select count(*) from cids

If I uncomment the bit where it says "make it faster" I get decent response
and the graphical analyze display shows the expected user+doctype+cdtcode
index is being used (and nice thin lines suggesting efficient lookup).

As it is, the analyze display shows the expected user+doctype index* being
used but the lines are fat, and performance is an exponential disaster.

* I created the (to me ) redundant user+doctype index trying to get
Postgres to Do the Right Thing(tm), but I can see that was not the issue.

I presume the reason performance drops off a cliff is because there can be
9000 cdt_codes for one user+doctype, but I was hoping EXISTS would just
look to see if there was at least one row matching user+doctype and return
its decision. I have tried select *, select 1, and limit 1 on the nested
select to no avail.

Am I just doing something wrong? I am a relative noob. Is there some other
hint I can give the planner?

Thx, ken

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2012-07-23 21:52:33 Re: Efficiency of EXISTS?
Previous Message Robert Haas 2012-07-23 19:38:35 Re: High CPU Usage