Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: hlinnaka <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Subject: Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Date: 2015-04-08 05:59:56
Message-ID: CAM3SWZT1J6nFUad9t0XzAgyXBEz3qtuRSg=aPCGVDJ6qoK40_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I attach a revision - V3.2

On Mon, Mar 30, 2015 at 9:20 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> * Not supporting inheritance properly makes me uncomfortable. I don't
>> think users will think that's a acceptable/reasonable restriction.

Inheritance is now supported to the greatest extent anyone could
reasonably expect. In other words, inference works just fine for child
tables, and now parent tables too (so you can UPSERT both). Of course,
because of the existing restriction on making unique constraints cross
tables participating in inheritance, UPSERT similarly cannot work
across inheritance tables. This should be totally obvious to anyone
that uses inheritance - if you won't get a conflict (duplicate
violation) based on an ordinary insert, then of course UPSERT will not
take the alternative UPDATE path just because someone imagines that a
(would-be) duplicate violation should happen.

>> * Let's not use t_ctid directly, but add a wrapper
>
> We talked about a union. This seems quite doable.

This now uses a union. And it now actually stores a token value!

>> * The code uses LockTupleExclusive to lock rows. That means the fkey key
>> locking doesn't work; That's annoying. This means that using upsert
>> will potentially cause deadlocks in other sessions :(. I think you'll
>> have to determine what lock to acquire by fetching the tuple, doing
>> the key comparison, and acquire the appropriate lock. That should be
>> fine.
>
> Looking into the implementation of this.

Not quite sold on this, on second thought (although let's focus on the
WAL logging stuff - the immediate blocker to committing the IGNORE
variant). Perhaps you can explain why you think it's important.

I like that I am able to fully lock the row when the predicate isn't
passed. I think that's a useful feature in some cases (it particularly
makes sense for higher isolation levels that expect to repeat the same
command and not get a serialization failure). It also keeps the
already complicated function ExecLockUpdateTuple() significantly more
simple.

>> * I think we should decouple the insertion and wal logging more. I think
>> the promise tuple insertion should be different from the final
>> insertion of the actual tuple. For one it seems cleaner to me, for
>> another it will avoid the uglyness around logical decoding. I think
>> also that the separation will make it more realistic to use something
>> like this for a COPY variant that doesn't raise unique violations and
>> such.
>
> Your COPY argument swung this for me. I'm looking into the implementation.

I have a prototype implementation of this with V3.2 - it clearly needs
more work, but I thought it was best to post sooner rather than later.
I am reusing in-place update infrastructure for this. This should give
Heikki something to play with, since he wasn't quite sold on this
idea. Certainly, what I have here is not good enough to commit - there
is unnecessary double WAL logging of tuple contents, just for example.
More generally, my recent changes to heapam.c certainly lack polish. I
have something for us to discuss, though, and under the circumstances
I think that's a good thing. Grep for "XXX" and "TODO" comments for
more.

Logical decoding is not handled at all, since I hit a snag with
building a tuple from the in-place update WAL records, and I didn't
want to block on that (especially given the general uncertainty about
if and how to affirm that a speculative insertion succeeded - IWO, if
we should go Andres' way there to avoid making transaction reassembly
for decoding more messy). I have at least reverted the logical
decoding transaction reassembly peek-ahead thing that Andres hated so
much, though. I hope we can reach consensus on what to do on this
point of WAL logging/logical decoding in particular real soon now.

>> * We discussed the infererence and that it should just reuse (code,
>> grammar, docs) the column specification from create index.

The inference specification now both accepts collation and opclass
specifications along the lines we discussed, and can infer multiple
unique indexes per Heikki/Robert's complaint (so there is no longer a
costing aspect to it at all).

There are lots of tests for the inference of collations and opclasses
- if you want to know how it works, look at those (e.g. to learn about
handling of edge cases with redundant or overlapping cases, perhaps
due only to differing collations). I've come up with something very
flexible/forgiving, I think. Also, new tests were added for the new
inheritance support.

The documentation has been updated to reflect all of this.

There is still no way to specify a named constraint (which is perhaps
useful for exclusion constraints - although perhaps it's good enough
to have the IGNORE variant only work with exclusion constraints in the
common case where there is no inference specification), or PRIMARY KEY
syntax. Still have not made text explain output display arbiter unique
indexes (although multiple arbiter unique indexes are now visible from
the non-text explain output, since as I mentioned multiple specific
unique indexes may now be inferred). This needs some more tweaking. It
was helpful with the new tests for inference (with complex variations
is collations and opclasses, and multiple unique indexes inferred in a
number of cases).

I've been meaning to revisit Dean Rasheed's recent remarks on RLS. But
that hasn't happened yet.

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
0001-Support-INSERT-.-ON-CONFLICT-IGNORE.patch.gz application/x-gzip 52.2 KB
0004-RLS-support-for-ON-CONFLICT-UPDATE.patch.gz application/x-gzip 8.1 KB
0003-Support-INSERT-.-ON-CONFLICT-UPDATE.patch.gz application/x-gzip 44.2 KB
0002-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch.gz application/x-gzip 7.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Max Filippov 2015-04-08 06:31:55 Re: configure can't detect proper pthread flags
Previous Message Michael Paquier 2015-04-08 05:30:35 Re: New error code to track unsupported contexts