Re: UPSERT wiki page, and SQL MERGE syntax

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: UPSERT wiki page, and SQL MERGE syntax
Date: 2014-10-10 22:03:08
Message-ID: CAM3SWZStdChN6-ieJbc20OGD8TwmZ6-um6O8Gz2BOFzXn9YFVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 10, 2014 at 2:16 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>> Would you be okay with this never working with partial unique
>> indexes? That gives me something to work with.
>
> That seems like the only sensible course, to me.

Okay, great. If that's the consensus, then that's all I need to know.

>> I'm not that worried about expression indexes, actually. I'm mostly
>> worried about partial unique indexes, particularly when before insert
>> row-level triggers are in play

> There is a problem if any column of the index allows nulls, since
> that would allow multiple rows in the target table to match an
> argument. Proving that an expression does not could be tricky. I
> suggest that, at least for a first cut, we restrict this to
> matching an index on NOT NULL columns.

Hmm. That could definitely be a problem. But the problem is clearly
the user's fault. It could allow multiple rows to "match", as you say
- but you're talking about a different concept of matching. Unlike
with certain other systems, that isn't a would-be unique violation,
and so it isn't a "match" in the sense I intend, and so you insert.
You don't match more than one row, because (in the sense peculiar to
this feature) there are no matches, and insertion really is the
correct outcome.

As a user, you have no more right to be surprised by this then by the
fact that you wouldn't see a dup violation with a similar vanilla
INSERT - some people are surprised by that, with regularity.

Maybe I need to emphasize the distinction in the documentation between
the two slightly different ideas of "matching" that are in tension
here. I think it would be a mistake to forcibly normalize things to
remove that distinction (by adding an artificial restriction), at any
rate.

>> Did you consider my example? I think that people will like this idea,
>> too - that clearly isn't the only consideration, though. As you say,
>> it would be very easy to implement this. However, IMV, we shouldn't,
>> because it is hazardous.
>
> To quote the cited case:
>
>> 1. Developer writes the query, and it works fine.
>>
>> 2. Some time later, the DBA adds an inserted_at column (those are
>> common). The DBA is not aware of the existence of this particular
>> query. The new column has a default value of now(), say.
>>
>> Should we UPDATE the inserted_at column to be NULL? Or (more
>> plausibly) the default value filled in by the INSERT? Or leave it be?
>> I think there is a case to be made for all of these behaviors, and
>> that tension makes me prefer to not do this at all. It's like
>> encouraging "SELECT *" queries in production, only worse.
>
> That does point out the problem, in general, with carrying values
> from a BEFORE INSERT trigger into an UPDATE. Perhaps if the INSERT
> fails the UPDATE phase should start with the values specified in
> the first place, and not try to use anything returned by the BEFORE
> INSERT triggers?

I think that that behavior is far more problematic for numerous other
reasons, though. Potentially, you're not seeing what *really* caused
the conflict at all. I just don't think it's worth it to save a bit of
typing.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sehrope Sarkuni 2014-10-10 22:31:20 Materialized views don't show up in information_schema
Previous Message Peter Geoghegan 2014-10-10 21:24:48 Re: UPSERT wiki page, and SQL MERGE syntax