Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Date: 2015-04-16 17:25:29
Message-ID: CAM3SWZQd1MAZRp9TrZosvGL_==P1jF-9Y5v7fx6rqhrO23M+_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 16, 2015 at 2:18 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:
>> Hmm, ok, I've read the "INSERT ... ON CONFLICT UPDATE and logical decoding"
>> thread now, and I have to say that IMHO it's a lot more sane to handle this
>> in ReorderBufferCommit() like Peter first did, than to make the main
>> insertion path more complex like this.
>
> I don't like Peter's way much. For one it's just broken. For another
> it's quite annoying to trigger disk reads to figure out what actual type
> of record something is.
>
> If we go that way that's the way I think it should be done: Whenever we
> encounter a speculative record we 'unlink' it from the changes that will
> be reused for spooling from disk and do nothing further. Then we just
> continue reading through the records.

You mean we continue iterating through *changes* from ReorderBufferCommit()?

> If the next thing we encounter is
> a super-deletion we throw away that record. If it's another type of
> change (or even bettter a 'speculative insertion succeeded' record)
> insert it.

By "insert it", I gather you mean report to the the logical decoding
plugin as an insert change.

> That'll still require some uglyness, but it should not be too
> bad.

So, to be clear, what you have in mind is sort of a hybrid between my
first and second approaches (mostly my first approach).

We'd have coordination between records originally decoded into
changes, maybe "intercepting" them during xact reassembly, like my
first approach. We'd mostly do the same thing as the first approach,
actually. The major difference would be that there'd be explicit
"speculative affirmation" WAL records. But we wouldn't rely on those
affirmation records within ReorderBufferCommit() - we'd rely on the
*absence* of a super deletion WAL record (to report an insert change
to the decoding plugin). To emphasize, like my first approach, it
would be based on an *absence* of a super deletion WAL record.

However, like my second approach, there would be a "speculative
affirmation" WAL record. The new speculative affirmation WAL record
would however be quite different to what my second approach to logical
decoding (the in-place update record thing that was in V3.3) actually
did. In particular, it would be far more simple, and the tuple would
be built from the original insertion record within logical decoding.

Right now, I'm tired, so bear with me. What I think I'm not quite
getting here is how the new type of "affirmation" record affects
visibility (or anything else, actually). Clearly dirty snapshots need
to see the record to set a speculative insertion token for their
caller to wait on (and HeapTupleSatisfiesVacuum() cannot see the
speculatively inserted tuple as reclaimable, of course). They need
this *before* the speculative insertion is affirmed, of course.

Maybe you mean: If the speculative insertion xact is in progress, then
the tuple is visible to several types of snapshots (dirty, vacuum,
self, any). If it is not, then tuples are not visible because they are
only speculative (and not *confirmed*). If they were confirmed, and
the xact was committed, then those tuples are logically and physically
indistinguishable from tuples that were inserted in the ordinary
manner.

Is that it? Basically, the affirmation records have nothing much to do
with logical decoding in particular. But you still need to super
delete, so that several types of snapshots ((dirty, vacuum, self, any)
*stop* seeing the tuple as visible independent of the inserting xact
being in progress.

> I earlier thought that'd not be ok because there could be a new catalog
> snapshot inbetween, but I was mistaken: The locking on the source
> transaction prevents problems.

I thought this was the case.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Qingqing Zhou 2015-04-16 18:49:25 Re: [PERFORM] pushing order by + limit to union subqueries
Previous Message Andrew Dunstan 2015-04-16 16:59:35 Re: Supporting src/test/modules in MSVC builds