Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Peter Childs <Blue(dot)Dragon(at)blueyonder(dot)co(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Date: 2003-09-29 20:09:17
Message-ID: 20030929112706.U94710@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Sun, 28 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> > On Sat, 27 Sep 2003, Tom Lane wrote:
> >> I thought of what seems to be a better design for the check query: use
> >> a LEFT JOIN and check for NULL in the righthand joined column.
>
> > Hmm, my initial testing showed that it really was a little slower
> > than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> > it fare for you compared to:
> > select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> > and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
>
> Were you testing against 7.3 or 7.4? On what kinds of tables?
>
> In 7.4 I think that the JOIN would yield as good or better a plan. The
> best possible plan for the NOT EXISTS query is effectively a nestloop
> with inner indexscan, which is great if the FK table is small and the
> PK table is large, but it sucks otherwise. The planner should choose a
> plan of this form for the LEFT JOIN given that combination of table
> sizes, and so there shouldn't be any great difference in runtime in that
> case. But in other combinations, such as large FK and small PK, other
> plan types will beat the pants off nestloop.

As an update, so far I still am getting better results with NOT EXISTS
than the left join. For a 50m row fk, 10k row pk where the rows are just
the keys, I'm getting a plan like
Merge Join
Index scan on pktable
Sort
Seqscan on fktable

which is taking about 2-4 times longer for me than the not exists
depending on sort_mem (at 4096,64000,128000).

When I lowered random_page_cost to 1, I got an indexscan on fktable, but
that hadn't seemed to finish after about 2 hours (as opposed to about
30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the
sort+seqscan version.

I want to do some more tests where there's extraneous data in both tables
and see what that does to the results.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephan Szabo 2003-09-29 20:19:22 Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Previous Message Jan Wieck 2003-09-29 20:09:05 Re: Triggers and COPY

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephan Szabo 2003-09-29 20:19:22 Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Previous Message Marc G. Fournier 2003-09-29 20:04:10 Re: 7.4 status