Re: More efficient RI checks - take 2

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More efficient RI checks - take 2
Date: 2020-04-23 14:35:32
Message-ID: 23914.1587652532@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Apr 22, 2020 at 6:40 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> But it's not entirely clear to me that we know the best plan for a
>> statement-level RI action with sufficient certainty to go that way.

> Well, I guess I'd naively think we want an index scan on a plain
> table. It is barely possible that in some corner case a sequential
> scan would be faster, but could it be enough faster to save the cost
> of planning? I doubt it, but I just work here.

I think we're failing to communicate here. I agree that if the goal
is simply to re-implement what the RI triggers currently do --- that
is, retail one-row-at-a-time checks --- then we could probably dispense
with all the parser/planner/executor overhead and directly implement
an indexscan using an API at about the level genam.c provides.
(The issue of whether it's okay to require an index to be available is
annoying, but we could always fall back to the old ways if one is not.)
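
A toy sketch of the retail, one-row-at-a-time check described above, written in Python rather than against genam.c. The sorted list stands in for a btree index on the referenced key, and the function and parameter names are hypothetical. When no index is supplied it falls back to a plain scan, loosely mirroring the "old ways" fallback.

```python
import bisect

def retail_ri_check(fk_value, pk_index=None, pk_rows=()):
    """Return True if fk_value has a match among the referenced keys.

    pk_index: sorted list of referenced keys (stand-in for an index), or None.
    pk_rows:  unsorted referenced keys (stand-in for scanning the table).
    """
    if fk_value is None:
        # A NULL referencing value is not checked (single-column case).
        return True
    if pk_index is not None:
        # Index probe: binary search instead of planning and running a query.
        pos = bisect.bisect_left(pk_index, fk_value)
        return pos < len(pk_index) and pk_index[pos] == fk_value
    # Fallback when no index is available: scan every referenced key.
    return any(key == fk_value for key in pk_rows)
```

Each call handles exactly one referencing row, so the cost per row is fixed and there is no join strategy to choose.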

However, what I thought this thread was about was switching to
statement-level RI checking. At that point, what we're talking
about is performing a join involving a not-known-in-advance number
of tuples on each side. If you think you can hard-wire the choice
of join technology and have it work well all the time, I'm going to
say with complete confidence that you are wrong. The planner spends
huge amounts of effort on that and still doesn't always get it right
... but it does better than a hard-wired choice would do.
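
To make the join framing concrete, here is a toy Python sketch (not PostgreSQL code; all names are hypothetical) of a statement-level check as an anti-join computed two ways. Both return the same violating values, but which one is cheaper depends on how many rows are on each side, which is not known in advance.

```python
import bisect

def violations_by_index_probe(fk_values, pk_index):
    """Nested-loop style: one probe of a sorted key list per referencing value."""
    missing = []
    for value in fk_values:
        if value is None:
            continue  # NULL referencing values are not checked
        pos = bisect.bisect_left(pk_index, value)
        if pos == len(pk_index) or pk_index[pos] != value:
            missing.append(value)
    return missing

def violations_by_hash(fk_values, pk_rows):
    """Hash style: build a set over all referenced keys once, then probe it."""
    pk_keys = set(pk_rows)
    return [v for v in fk_values if v is not None and v not in pk_keys]

referenced = [1, 3, 5, 7]          # sorted, so it can serve both functions
referencing = [3, 4, None, 7, 8]   # values touched by one statement
print(violations_by_index_probe(referencing, referenced))
print(violations_by_hash(referencing, referenced))
```

With a handful of referencing values and a large referenced table, the probe version avoids reading the whole table; with many referencing values, building the hash set once is likely to win. Hard-wiring either one gives up that choice.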

Maybe there's room to pursue both things --- you could imagine,
perhaps, looking at the planner's estimate of number of affected
rows at executor startup and deciding from that whether to fire
per-row or per-statement RI triggers. But we're really going to
want different implementations within those two types of triggers.
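
A minimal sketch of that startup-time decision, again in Python with hypothetical names. The threshold is an arbitrary placeholder, not a value proposed in this thread.

```python
def choose_ri_trigger_mode(estimated_rows, per_statement_threshold=1000):
    """Pick an RI checking mode from the planner's row estimate.

    per_statement_threshold is an assumed cutoff for illustration only.
    """
    if estimated_rows < 0:
        raise ValueError("estimated_rows must be non-negative")
    if estimated_rows >= per_statement_threshold:
        return "per-statement"  # join-based check, planned normally
    return "per-row"            # retail index probes, no planner involved
```

The two modes would then dispatch to separate implementations, such as a direct index probe for "per-row" and a planned join for "per-statement".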

regards, tom lane
