Re: INSERT ... ON CONFLICT UPDATE/IGNORE 4.0, parser/executor stuff

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hlinnaka <hlinnaka(at)iki(dot)fi>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: INSERT ... ON CONFLICT UPDATE/IGNORE 4.0, parser/executor stuff
Date: 2015-04-27 22:28:43
Message-ID: CAM3SWZRFeuwc8gv+jG=0RRS=OyFXBxgjA+BTBVda-o8qeEYUqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers

On Mon, Apr 27, 2015 at 2:52 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> So, I'm looking. And I've a few questions:
> * Why do we need to spread knowledge about speculative inserts that wide?
> It's now in 1) Query, 2) ParseState 3) ModifyTable 4) InsertStmt. That
> seems a bit wide - and as far as I see not really necessary. That e.g.
> transformUpdateStmt() has if (!pstate->p_is_speculative) blocks seems
> wrong.

Maybe it shouldn't be spelt "p_is_speculative", because speculative
insertion is a low-level mechanism. Maybe it should be something like
p_is_auxiliary...but you don't like that word either. :-)

If setTargetTable() took the RangeVar NULL-ness as a proxy for a child
auxiliary query, and if free_parsestate() took a flag to indicate that
it shouldn't perform a heap_close() when an auxiliary statement's
parent must do it instead, then it wouldn't be nescessary to add a
p_is_speculative field. But that seems worse.

> * afaics 'auxiliary query' isn't something we have under that
> name. This isn't really different to what wCTEs do, so I don't think
> we need this term here.

Well, I guess auxiliary is just a descriptive word, as opposed to one
that describes a technical mechanism peculiar to Postgres/this patch.
I don't feel that it's important either way.

> * You refer to 'join like syntax' in a couple places. I don't really see
> the current syntax being that join like? Is that just a different
> perception of terminology or is that just remnants of earlier
> approaches?

No. It is join-like. It looks the same as and behaves similarly to
UPDATE .... FROM ... . There is one difference, though -- excluded.*
is not visible from RETURNING (because of the ambiguity of doing it
the UPDATE way, where, as with a self-join, both tuples have identical
column names. And because the INSERT RETURNING behavior should be
dominant). That's all I mean by "join-like". People thought that
looked nicer than a straight expression that operates on a Var (which
FWIW is kind of what MySQL does), so that's the way it works.

> * I am rather unconvinced we need the whole 'ExcludedVar' mechanism. I
> don't really see why we need it at all? Can't we 'just' make those
> plain vars referring to the normal targetlist "entry"? I feel like I'm
> missing something here.

If you allow multiple RTEs by not having the rewriter swap out
EXCLUDED vars for ExcludedExpr(), what you end up with is an UPDATE
subquery with a join (and, if you change nothing else in the patch, a
segfault). Fixing that segfault is probably going to require more
complexity in the optimizer, and certainly will require more in the
executor. Imagine teaching ModifyTable nodes about rescan to make a
row lock conflict handled from its parent (if we had a "real" join).
Alternatively, maybe you could have the EPQ stuff install a tuple and
then execute a single EvalPlanQualNext() against a scantuple (which
you'd also have to install using EPQ). You can install multiple
tuples, which is used for SELECT FOR UPDATE's EPQ stuff. But that
seems very significantly more complicated than what I actually did.

Think of ExcludedExpr as a way of pretending that an expression on the
target's Vars is a Var referencing a distinct RTE, simply because
people think that looks nicer. The EXCLUDED.* tuple legitimately
originates form the same tuple context as the target tuple, which the
structure of the code reflects.

> * The whole dealing with the update clause doesn't seem super
> clean. I'm not sure yet how to do it nicer though :(

At least it isn't all that much code. I think that the main thing that
this design has to recommend it is code re-use. The patch actually
isn't all that big in terms of lines of code.
--
Peter Geoghegan

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Andre_Mikulec 2015-04-27 23:00:32 How do I install/run PostgreSQL 9.4 64 bit on Windows 7 Professional?
Previous Message Andres Freund 2015-04-27 21:52:58 Re: INSERT ... ON CONFLICT UPDATE/IGNORE 4.0, parser/executor stuff

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2015-04-27 22:41:32 Re: mogrify and indent features for jsonb
Previous Message Andres Freund 2015-04-27 21:52:58 Re: INSERT ... ON CONFLICT UPDATE/IGNORE 4.0, parser/executor stuff