Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Peter Childs <Blue(dot)Dragon(at)blueyonder(dot)co(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Date: 2003-09-28 18:29:01
Message-ID: 20030928111850.B73144@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers


On Sat, 27 Sep 2003, Tom Lane wrote:

> [ continuing a discussion from mid-August ]
>
> Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> >> I assume what you have in mind is to replace
> >> validateForeignKeyConstraint() with something that does a join of the
> >> two tables via an SPI command.
>
> > It'd probably be:
> > MATCH unspecified:
> > SELECT keycolumns FROM referencing_table WHERE
> > (keycolumns) NOT IN (SELECT refcols FROM referenced_table
> > WHERE refcol1 IS NOT NULL AND ... )
> > AND keycolumn1 IS NOT NULL AND ...;
>
> > MATCH FULL: (something like, I haven't tried it)
> > SELECT keycolumns FROM referencing_table WHERE
> > ((keycolumns) NOT IN (SELECT refcols FROM referenced_table
> > WHERE refcol1 IS NOT NULL AND ...)
> > AND
> > (keycolumn1 IS NOT NULL AND ...)
> > )
> > OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)
>
> I thought of what seems to be a better design for the check query: use
> a LEFT JOIN and check for NULL in the righthand joined column. For
> example, I think a MATCH UNSPECIFIED on two columns could be tested like
> this:
>
> select f1,f2
> from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2)
> where pk.f1 is null and (fk.f1 is not null and fk.f2 is not null);
>
> and MATCH FULL is the same except
>
> where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null);
>
> MATCH PARTIAL would be harder; I think you'd need to generate a separate
> query for each subset of the columns, in which you would probe for
> unmatched rows having exactly that subset non-null. But it could be
> done.
>
> Do you see any logical error here?
>
> In some preliminary tests, the planner seems to be able to choose
> reasonable plans for this type of query even without pg_statistic data,
> as long as it knows the table sizes (which it would do after CREATE INDEX).
> So it would work reasonably well during a pg_dump script, I think.

Hmm, my initial testing showed that it really was a little slower
than a more complicated one with NOT EXISTS so I'd abandoned it. How does
it fare for you compared to:
select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

I believe the above is the appropriate not exists form for match
unspecified.

I've actually got code (that no longer cleanly applies, but...) that uses
the single query version with NOT EXISTS (which could be easily changed to
either of the other forms) and was planning to put it together for a patch
when 7.5 devel started because I figured it wasn't precisely a bug and
wouldn't get accepted for 7.4.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2003-09-28 18:55:03 Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Previous Message Dennis Gearon 2003-09-28 18:21:00 Re: Tuning/performance question.

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2003-09-28 18:55:03 Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Previous Message Alvaro Herrera 2003-09-28 18:15:53 Re: Improving REINDEX for system indexes (long)