INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Date: 2015-03-05 01:18:17
Message-ID: CAM3SWZS9DTFt1=sQT=WFdp5UFkOac2Qc1i5WE-Od4BXJZ=JsCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached patch series forms what I'm calling V3.0 of the INSERT ... ON
CONFLICT IGNORE/UPDATE feature. (Sorry about all the threads. I feel
this development justifies a new thread, though.)

This features a new way of organizing the code. Heikki and I are now
in agreement that the best way of incrementally committing the work is
to do ON CONFLICT IGNORE first (perhaps someone else can review ON
CONFLICT UPDATE). This patch series is organized so that the first
commit adds tests, documentation and code that only relates to the
IGNORE variant. RLS support is still split out as a separate commit,
which is not intended to actually be committed separately (from the
main ON CONFLICT UPDATE commit). As before, I've just done that to
highlight that part for the benefit of RLS subject matter experts. The
RTE permissions split patch is also broken out, but I believe that
there is consensus that that could sensibly be committed on its own.

There are some minor changes to the code/documentation itself:

* Ran the code through pg_indent.

* Documented ON CONFLICT UPDATE as a MERGE alternative under
"unsupported SQL features" (sql_features.txt).

* Minor tweaks of comments here and there.

* Logical decoding fixes.

* Call ExecCheckPlanOutput() for an ON CONFLICT UPDATE auxiliary query, too.

I should point out that I'm aware of the following open issues around
the patch series (most of these apply to the base IGNORE commit, so
Heikki should of course look out for them):

* Andres and I are still hashing out details of whether or not certain
assumptions made around handling super deletion within logical
decoding are safe [1]. Consider the decoding stuff suspect for now.

* There is still an unresolved semantics issue with unique index
inference and non-default opclasses. I think it's sufficient that the
available/defined unique indexes dictate our idea of a unique
violation (that necessitates taking the alternative path). Even in a
world where there exists a non-default opclass with an "equals"
operator that does not match that of the default opclass (that is not
really the world we live in, because the only counter-example known is
made that way specifically to *discourage* its use by users), this
seems okay to me. It seems okay to me because surely the relevant
definition of equality is the one actually chosen for the available
unique index. If there exists an ambiguity for some strange reason
(i.e. there are two unique indexes, on the same attribute(s), but with
different "equals" operators), then its a costing issue, so the
behavior given is essentially non-predictable (it could end up being
either...but hey, those are the semantics). I have a very hard time
imagining how that could ever be the case, even when we have (say)
case insensitive opclasses for the text type. A case insensitive
opclass is stricter than a case sensitive opclass. Why would a user
ever want both on the same attribute(s) of the same table? Is the user
really more or less expecting to never get a unique violation on the
non-arbitrating unique index, despite all this?

If reviewers are absolutely insistent that this theoretical ambiguity
is a problem, we can add an optional CREATE INDEX style opclass
specification (I'm already using the IndexElems representation from
CREATE INDEX for the inference specification, actually, so that would
be easy enough). I really have a hard time believing that the ugliness
is a problem for those hypothetical users that eventually consider
"equals" operator ambiguity among opclasses among available unique
indexes to be a problem. I haven't just gone and implemented this
already because I didn't want to document that an opclass
specification will be accepted. Since there is literally no reason why
anyone would care today, I see no reason to add what IMV would really
just be cruft.

* I'm flying blind with the SEPostgres changes. Unfortunately, it's
not very possible to test SEPostgres on my machine. (Does not apply to
IGNORE variant.)

As always, the jjanes_upsert tool [2] has proven invaluable with
testing this patch (Thanks Jeff!). Reviewers should look to that tool
for ideas on how the patch might be broken. It caught a number of race
conditions with exclusion constraints in the past, for example. We're
in good shape with those now (and have been since V2.3 - see "livelock
insurance" comments within the IGNORE commit).

Thoughts?

[1] http://www.postgresql.org/message-id/20150303111342.GF2579@alap3.anarazel.de
[2] https://github.com/petergeoghegan/jjanes_upsert
--
Peter Geoghegan

Attachment Content-Type Size
0004-RLS-support-for-ON-CONFLICT-UPDATE.patch text/x-patch 33.2 KB
0003-Support-INSERT-.-ON-CONFLICT-UPDATE.patch text/x-patch 189.2 KB
0002-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch text/x-patch 30.9 KB
0001-Support-INSERT-.-ON-CONFLICT-IGNORE.patch text/x-patch 166.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kouhei Kaigai 2015-03-05 01:27:15 Re: Join push-down support for foreign tables
Previous Message Jan de Visser 2015-03-05 01:13:27 Re: Idea: closing the loop for "pg_ctl reload"