Re: [HACKERS] MERGE SQL Statement for PG11

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] MERGE SQL Statement for PG11
Date: 2018-02-09 01:23:09
Message-ID: CAH2-WznUpOvdDyKVMdoOa7f+T3HtdAh4gqKOFQak7m-s0Qjc7Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 7, 2018 at 7:51 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> I understand getting EPQ semantics right is very important. Can you please
> (once again) summarise your thoughts on what you think is the *most*
> appropriate behaviour? I can then think how much efforts might be involved
> in that. If the efforts are disproportionately high, we can discuss if
> settling for some not-so-nice semantics, like we apparently did for
> partition key updates.

I personally believe that the existing EPQ semantics are already
not-so-nice. They're what we know, though, and we haven't actually had
any real-world complaints, AFAIK.

My concern is mostly just that MERGE manages to behave in a way that
actually "provides a single SQL statement that can conditionally
INSERT, UPDATE or DELETE rows, a task that would otherwise require
multiple procedural language statements", as the docs put it. As long
as MERGE manages to do something as close to that high-level
description as possible in READ COMMITTED mode (with our current
semantics for multiple statements in RC taken as the baseline), then
I'll probably be happy.

Some novel behavior -- "EPQ with a twist" -- is clearly necessary.
I feel a bit uneasy about it because anything that anybody suggests is
likely to be at least a bit arbitrary (EPQ itself is kind of
arbitrary). We only get to make a decision on how "EPQ with a twist"
will work once, and that should be a decision that is made following
careful deliberation. Ambiguity is much more likely to kill a patch
than a specific technical defect, at least in my experience. Somebody
can usually just fix a technical defect.

> I am sorry, I know you and Simon have probably done that a few times already
> and I should rather study those proposals first. So it's okay if you don't
> want to repeat those; I will work on them next week once I am back from
> holidays.

Unfortunately, I didn't get very far with Simon on this. I didn't
really start talking about this until recently, though, so it's not
like you missed much. The first time I asked Simon about this was
January 23rd, and I first proposed something about 10 days ago.
Something very tentative.

(I did ask some questions about EPQ, and even WHEN ... AND quals much
earlier, but that was in the specific context of a debate about
MERGE's use of ON CONFLICT's speculative insertion mechanism. I
consider that to be a totally different discussion, one that ended
before Simon even posted his V1 patch, and that isn't worth spending
your time on now.)

> TBH I did not consider partitioning any less complex and it was indeed very
> complex, requiring at least 3 reworks by me. And from what I understood, it
> would have been a blocker too. So is subquery handling and RLS. That's why I
> focused on addressing those items while you and Simon were still debating
> EPQ semantics.

Sorry if I came across as dismissive of that effort. That was
certainly not my intention. I am pleasantly surprised that you've
managed to move a number of things forward rather quickly.

I'll rephrase: while it would probably have been a blocker in theory
(I didn't actually weigh in on that), I doubted that it would actually
end up blocking the patch in practice (and it now looks like I was
right to doubt that, since you got it done). It was a theoretical blocker, as
opposed to an open item that could drag on indefinitely despite
everyone's best efforts. Obviously details matter, and obviously there
are a lot of details to get right outside of RC semantics, but it
seems wise to focus on the big risk that is EPQ/RC conflict handling.

The only other thing that comes close to that risk is the risk that
we'll get stuck on RLS. Though even the RLS discussion may actually
end up being blocked on this crucial question of EPQ/RC conflict
handling. Did you know that the RLS docs [1] have a specific
discussion of the implications of EPQ for users of RLS, and that it
mentions doing things like using SELECT ... FOR SHARE to work around
the problem? It has a whole example of a scenario that users actually
kind of need to know about, at least in theory. RC conflict handling
semantics could bleed into a number of other things.

I'll need to think some more about RC conflict handling (deciding what
"EPQ with a twist" actually means), since I haven't focused on MERGE
recently. Bear with me.

[1] https://www.postgresql.org/docs/current/static/ddl-rowsecurity.html
--
Peter Geoghegan
