Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: hlinnaka <hlinnaka(at)iki(dot)fi>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Date: 2015-03-25 16:42:29
Message-ID: CAM3SWZRaJG+7jDhPJFhgfcsMWh1KquZZCGNa-D9eTBBVb4+Fng@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 18, 2015 at 2:41 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Here's what I had in mind: the inserter tags the tuple with the speculative
> insertion token, by storing the token in the t_ctid field. If the inserter
> needs to super-delete the tuple, it sets xmax like in a regular deletion,
> but also sets another flag to indicate that it was a super-deletion.

I was able to quickly hack up a prototype of this in my hotel room at
pgConf.US. It works fine at first blush, passing the jjanes_upsert
stress tests and my own regression tests without a problem. Obviously
it needs more testing and clean-up before posting, but I was pleased
with how easy this was.

> When another backend inserts, and notices that it has a potential conflict
> with the first tuple, it tries to acquire a hw-lock on the token. In most
> cases, the inserter has long since completed the insertion, and the
> acquisition succeeds immediately, but you have to check because the token is
> not cleared on a completed insertion.

You don't even have to check/take a ShareLock on the token when the
other xact has committed or aborted. If the token is still there and
the tuple wasn't super-deleted, the xact's outcome alone tells you what
you need to know: the tuple is visible (other xact committed) or not
visible (other xact aborted). In other words, we continue to only
check for a speculative token when the inserting xact is in flight -
we just take the token from the heap now instead. Not much needs to
change, AFAICT.

> Regarding the physical layout: We can use a magic OffsetNumber value above
> MaxOffsetNumber to indicate that the t_ctid field stores a token rather than
> a regular ctid value. And another magic t_ctid value to indicate that a
> tuple has been super-deleted. The token and the super-deletion flag are
> quite ephemeral, they are not needed after the inserting transaction has
> completed, so it's nice to not consume the valuable infomask bits for these
> things. Those states are conveniently not possible on an updated tuple, when
> we would need the t_ctid field for its current purpose.

Haven't done anything about this yet. I'm just using an infomask2 bit
for now. Although that was only because I forgot that you suggested
this before having a go at implementing this new t_ctid scheme!

My next revision will have a more polished version of this scheme. I'm
not going to immediately act on Robert's feedback elsewhere (although
I'd like to), owing to time constraints - no reason to deny you the
opportunity to review the entirely unrelated low-level speculative
locking mechanism due to that.

--
Peter Geoghegan
