Re: WIP: Detecting SSI conflicts before reporting constraint violations

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Detecting SSI conflicts before reporting constraint violations
Date: 2016-03-11 12:25:57
Message-ID: CACjxUsOPpSKySrT8d9aqxEM811UQowYXc51rzE804ju-XN1cgA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 10, 2016 at 11:31 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:

> Here's a much simpler version with more comments

> It handles the same set of isolation test specs.

I'm impressed that you found a one-line patch that seems to get us
90% of the way to a new guarantee, but I think if we're going to do
something here it should take us from one clear guarantee to
another. We really don't have a new guarantee here that is easy to
express without weasel-words. :-( Let me see whether I can figure
out how to cover the one permutation that is left after this
one-liner.

In terms of theory, one way to look at this is that inserting an
index tuple into a unique index involves a read of that index to
find a "gap", and in SSI such a read is normally covered by a
predicate lock on the appropriate part of the index. (Currently
that is done at the page level, although hopefully we will
eventually enhance it to use "next-key gap locking".) Treating the
index tuple insertion as having an implied read of that gap is
entirely justified and proper -- internally the read actually does
happen.
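
To make the implied gap read concrete, here is a sketch of the
dangerous permutation in Python with psycopg2 (the "invoice" table
and connection strings are invented for illustration; the patch
itself exercises this through isolation specs):

import psycopg2

s1 = psycopg2.connect("dbname=test")
s2 = psycopg2.connect("dbname=test")

# One-time setup, outside the interesting transactions.
with s1.cursor() as cur:
    cur.execute("DROP TABLE IF EXISTS invoice")
    cur.execute("CREATE TABLE invoice (id int PRIMARY KEY)")
s1.commit()

for conn in (s1, s2):
    conn.set_session(isolation_level="SERIALIZABLE")

c1, c2 = s1.cursor(), s2.cursor()
c1.execute("SELECT 1 FROM invoice WHERE id = 1")  # s1 reads the gap
c2.execute("SELECT 1 FROM invoice WHERE id = 1")  # s2 reads the same gap
c1.execute("INSERT INTO invoice VALUES (1)")      # s1 writes into that gap
s1.commit()

try:
    c2.execute("INSERT INTO invoice VALUES (1)")  # s2 collides here
    s2.commit()
except psycopg2.Error as e:
    # 23505 (unique_violation) today; the goal is 40001
    # (serialization_failure), which retry logic can handle.
    print(e.pgcode)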

That leaves the question of whether it is theoretically sound to
report a transient error due to the actions of a concurrent
serializable transaction as a serialization failure when it
involves a unique index.

Anyone using serializable transactions to prevent problems from
race conditions would consider the fact that an error was caused by
the action of a concurrent transaction, and would not occur if the
transaction were retried from the start, far more important than
details such as it being a duplicate key error or which table or
key is involved. If we can get LOG level logging of those details,
fine, but let's not put such things "in the face" of the user when
we don't need to do so -- anyone using serializable transactions
should have a generalized transaction retry mechanism that handles
this quietly behind the scenes.
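
As an illustration only, a minimal sketch of such a retry loop in
Python with psycopg2 (the function name and attempt limit are
arbitrary; the connection is assumed to already be in SERIALIZABLE
mode):

import psycopg2

def run_serializable(conn, work, max_attempts=10):
    # Retry the whole transaction on SQLSTATE 40001; let everything
    # else surface to the application as a real error.
    for _ in range(max_attempts):
        try:
            with conn.cursor() as cur:
                work(cur)  # the application's transaction body
            conn.commit()
            return
        except psycopg2.Error as e:
            conn.rollback()
            if e.pgcode != "40001":  # serialization_failure
                raise
    raise RuntimeError("gave up after %d attempts" % max_attempts)

# e.g.: run_serializable(conn,
#           lambda cur: cur.execute("INSERT INTO invoice VALUES (1)"))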

There have been multiple requests for more information about the
details of serialization failures, when such detail is available,
so that people can tune their transactions to minimize such
retries. IMO we should see whether we can provide the table and key
in the error detail for this sort of serialization failure.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
