Re: Getting rid of "tuple concurrently updated" elog()s with concurrent DDLs (at least ALTER TABLE)

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Getting rid of "tuple concurrently updated" elog()s with concurrent DDLs (at least ALTER TABLE)
Date: 2017-12-27 07:53:42
Message-ID: 20171227075342.GA1281@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 26, 2017 at 10:47:59PM -0800, Robert Haas wrote:
> On Tue, Dec 26, 2017 at 4:14 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Tue, Dec 26, 2017 at 4:30 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>> You're proposing to lock the entire relation against many forms of concurrent DDL, just to get rid of that error? That seems unacceptable.
>>> Isn't the canonical way to solve this to take object locks?
>>
>> Sure. That's where things in lmgr.c come into play, like
>> LockSharedObject(), and you could hold with an exclusive lock on a
>> given object until the end of a transaction before opening the catalog
>> relation with heap_open(), however with those you need to know the
>> object OID before taking a lock on the parent relation, right? So you
>> face problems with lock upgrades, or race conditions show up more
>> easily. I have to admit that I have not dug much into the problem yet,
>> it is easy enough to have isolation tests by the way, and I just
>> noticed that ALTER DATABASE SET can equally trigger the error.
>
> I don't understand what you mean by "you need to know the object OID
> before taking a lock on the parent relation, right?". That seems
> wrong.

Well, the point I am trying to outline here is that it is essential to
take a lock on the object ID before opening pg_authid with heap_open()
with a low-level lock to avoid any concurrency issues when working on
the catalog relation opened.

> I think you might need something like what was done in
> b3ad5d02c9cd8a4c884cd78480f221afe8ce5590; if, after we look up the
> name and before we acquire a lock on the OID, we accept any
> invalidation messages, recheck that the object we've locked is still
> the one associated with that name.

Indeed, this bit is important, and RangeVarGet does so. Looking at
get_object_address(), it also handles locking of the object. Using that
as interface to address all the concurrency problems could be appealing,
and less invasive for a final result as many DDLs are visibly
impacted (still need to make a list here). What do you think?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2017-12-27 08:21:12 Re: AS OF queries
Previous Message Fabien COELHO 2017-12-27 07:49:07 Re: [HACKERS] pow support for pgbench