Re: INSERT .. ON CONFLICT DO SELECT [FOR ..]

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT .. ON CONFLICT DO SELECT [FOR ..]
Date: 2017-09-04 17:46:53
Message-ID: CAH2-Wzm06fp76oJz8XYN+m9G6zZtpYWULu-qyrShjcjRtdtGDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 4, 2017 at 10:05 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> I had to look at the patch to see what I'd done, and the tests suggest that
> we already complain about this with if a FOR [lock level] clause is present:
>
> +begin transaction isolation level read committed;
> +insert into selfconflict values (10,1), (10,2) on conflict(f1) do select
> for update returning *;
> +ERROR: ON CONFLICT command cannot affect row a second time
> +HINT: Ensure that no rows proposed for insertion within the same
> command have duplicate constrained values.
> +commit;
>
> (in expected/insert_conflict.out.)

Right. I saw that you do it for ON CONFLICT DO SELECT FOR UPDATE, and so on.

>> On to the subject of my more general reservation: Why did you include
>> ON CONFLICT DO SELECT at all? Why not just ON CONFLICT DO SELECT FOR
>> UPDATE (and FOR SHARE, ...) ?
>
>
> If you don't intend to actually do anything with the row in the same
> database transaction, locking seems unnecessary. For example, you might
> want to provide an idempotent method in your API which inserts the data and
> returns the ID to the caller. The transaction is committed by the time the
> client sees the data, so locking is just extra overhead.

That makes sense, but I personally feel that the plain ON CONFLICT DO
SELECT variant isn't worth the trouble. I welcome other people's
opinions on that.

>> I think I know what you're going to say about it: ON CONFLICT DO
>> NOTHING doesn't lock the conflicting row, so why should I insist on it
>> here (why not have an ON CONFLICT DO SELECT variant, too)?
>
>
> I wasn't going to say that, no.

Well, it was a foundation for the ON CONFLICT DO SELECT variant that I
actually agree with, in any case.

>> In other words, while ON CONFLICT DO NOTHING may have set a precedent
>> here, it at least has semantics that limit the consequences of not
>> locking the row; and it *feels* a little bit dirty to use it
>> indifferently, even where that makes sense. ON CONFLICT DO SELECT is
>> probably going to be used within wCTEs some of the time. I'm not sure
>> that a plain ON CONFLICT DO SELECT variant won't allow unpredictable,
>> complicated problems when composed within a more complicated query.
>
>
> Yeah, in most cases you'd probably do a SELECT FOR KEY SHARE. And I
> wouldn't mind that being the default, but it would mean inventing special
> syntax with no previous precedent (as far as I know) for not locking the row
> in cases where the user doesn't want that. And that doesn't seem too
> attractive, either.

I'm not in favor of spellings that are inconsistent with SELECT FOR ... .

> Maybe it would be better to make the default sensible for people who are not
> super familiar with postgres. I don't know. And the more I think about the
> use case above, the less I care about it.

I was going to ask you to offer a real-world example of where the
plain ON CONFLICT DO SELECT variant is compelling. If you care about
it (if you're not going to concede the point), then I still suggest
doing so.

> But I'm generally against
> interfaces which put arbitrary restrictions on what power users can do on
> the basis that less experienced people might misuse the interface.

I agree with that as a broad principle, but disagree that it applies
here. Or if it does, then you have yet to convince me of it. In all
sincerity, if you think that I just don't understand your perspective,
then please try to make me understand it. Would a power user actually
miss ON CONFLICT DO SELECT? And if that's the case, would it really be
something they'd remember 5 minutes later?

I actually think that ON CONFLICT DO NOTHING does have semantics that
are, shall we say, questionable. That's the cost of having it not lock
conflicting rows during big ETL operations. That's a huge practical
benefit for ETL use-cases. Whereas here, with ON CONFLICT DO SELECT,
I see a somewhat greater risk, and a much, much smaller benefit. A
benefit that might actually be indistinguishable from zero.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-09-04 17:47:50 Re: Variable substitution in psql backtick expansion
Previous Message Tom Lane 2017-09-04 17:46:52 Re: pgbench - allow to store select results into variables