Re: INSERT .. ON CONFLICT DO SELECT [FOR ..]

From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT .. ON CONFLICT DO SELECT [FOR ..]
Date: 2017-09-04 17:05:57
Message-ID: CAL9smLBtT62UPBbrkRBMDms5ONtsh2e22gF87zNE29P4FfVg8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 4, 2017 at 4:09 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Tue, Aug 15, 2017 at 12:17 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> > On Tue, Aug 15, 2017 at 7:43 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >>
> >> On Mon, Aug 14, 2017 at 6:23 PM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> >> > Attached is a patch for $SUBJECT. It might still be a bit rough
> around
> >> > the
> >> > edges and probably light on docs and testing, but I thought I'd post
> it
> >> > anyway so I won't forget.
> >>
> >> Is it possible for ON CONFLICT DO SELECT to raise a cardinality
> >> violation error? Why or why not?
> >
> >
> > No. I don't see when that would need to happen. But I'm guessing you
> have
> > a case in mind?
>
> Actually, no, I didn't. But I wondered if you did. I think that it
> makes some sense not to, now that I think about it. ON CONFLICT DO
> NOTHING doesn't have cardinality violations, because it cannot affect
> a row twice if there are duplicates proposed for insertion (at least,
> not through any ON CONFLICT related codepath). But, this opinion only
> applies to ON CONFLICT DO SELECT, not ON CONFLICT DO SELECT FOR
> UPDATE. And I have other reservations, which I'll go in to
> momentarily, about ON CONFLICT DO SELECT in general. So, the upshot is
> that I think we need cardinality violations in all cases for this
> feature. Why would a user ever not want to know that the row was
> locked twice?
>

I had to look at the patch to see what I'd done, and the tests suggest that
we already complain about this with if a FOR [lock level] clause is present:

+begin transaction isolation level read committed;
+insert into selfconflict values (10,1), (10,2) on conflict(f1) do
select for update returning *;
+ERROR: ON CONFLICT command cannot affect row a second time
+HINT: Ensure that no rows proposed for insertion within the same
command have duplicate constrained values.
+commit;

(in expected/insert_conflict.out.)

> On to the subject of my more general reservation: Why did you include
> ON CONFLICT DO SELECT at all? Why not just ON CONFLICT DO SELECT FOR
> UPDATE (and FOR SHARE, ...) ?
>

If you don't intend to actually do anything with the row in the same
database transaction, locking seems unnecessary. For example, you might
want to provide an idempotent method in your API which inserts the data and
returns the ID to the caller. The transaction is committed by the time the
client sees the data, so locking is just extra overhead.

> I think I know what you're going to say about it: ON CONFLICT DO
> NOTHING doesn't lock the conflicting row, so why should I insist on it
> here (why not have an ON CONFLICT DO SELECT variant, too)?
>

I wasn't going to say that, no.

> In other words, while ON CONFLICT DO NOTHING may have set a precedent
> here, it at least has semantics that limit the consequences of not
> locking the row; and it *feels* a little bit dirty to use it
> indifferently, even where that makes sense. ON CONFLICT DO SELECT is
> probably going to be used within wCTEs some of the time. I'm not sure
> that a plain ON CONFLICT DO SELECT variant won't allow unpredictable,
> complicated problems when composed within a more complicated query.
>

Yeah, in most cases you'd probably do a SELECT FOR KEY SHARE. And I
wouldn't mind that being the default, but it would mean inventing special
syntax with no previous precedent (as far as I know) for not locking the
row in cases where the user doesn't want that. And that doesn't seem too
attractive, either.

Maybe it would be better to make the default sensible for people who are
not super familiar with postgres. I don't know. And the more I think
about the use case above, the less I care about it. But I'm generally
against interfaces which put arbitrary restrictions on what power users can
do on the basis that less experienced people might misuse the interface.

.m

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-09-04 17:08:39 Re: Variable substitution in psql backtick expansion
Previous Message Pavel Stehule 2017-09-04 17:01:17 Re: Variable substitution in psql backtick expansion