Quick Links

Re: Alter index rename concurrently to

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc:	Andrey Klychkov <aaklychkov(at)mail(dot)ru>, Victor Yegorov <vyegorov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Alter index rename concurrently to
Date:	2018-07-27 14:16:37
Message-ID:	CA+TgmobtmFT5g-0dA=vEFFtogjRAuDHcYPw+qEdou5dZPnF=pg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Jul 18, 2018 at 5:20 AM, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> In the 2012 thread, it was mentioned several times that some thought
> that renaming without an exclusive lock was unsafe, but without any
> further reasons. I think that was before MVCC catalog snapshots were
> implemented, so at that time, any catalog change without an exclusive
> lock would have been risky. After that was changed, the issue hasn't
> been mentioned again in the 2013 and beyond thread, and it seems that
> everyone assumed that it was OK.
>
> It we think that it is safe, then I think we should just lower the lock
> level of RENAME by default.
>
> In your patch, the effect of the CONCURRENTLY keyword is just to change
> the lock level, without any further changes. That doesn't make much
> sense. If we think the lower lock level is OK, then we should just use
> it always.

Right. I think that, in the world of MVCC catalog scans, there are
basically two main concerns around reducing the lock levels for DDL.
The first is the way that the shared invalidation queue works. If
people reading a certain piece of information take a lock X, and
people changing it take a lock Y which conflicts with X, then the
shared invalidation machinery guarantees that everyone always sees the
latest version. If not, some backends may continue to use the old
version for an unbounded period of time -- there doesn't seem to be a
guarantee that AcceptInvalidationMessages() even runs once per
command, if say the transaction already holds all the locks the query
needs. Moreover, there is no predictability to when the new
information may become visible -- it may happen just by chance that
the changes become visible almost at once. I think it would be
worthwhile for somebody to consider adjusting or maybe even
significantly rewriting the invalidation code to provide some
less-unpredictable behavior in this area. It would be much more
palatable to make non-critical changes under lesser lock levels if you
could describe in an understandable way when those changes would take
effect. If you could say something like "normally within a second" or
"no later than the start of the next query" it would be a lot better.

The second problem is the relation cache. Lots of things cache
pointers to the RelationData structure or its subsidiary structures,
and if any of that gets rebuild, the pointer addresses may change,
leaving those pointers pointing to garbage. This is handled in
different ways in different places. One approach is -- if we do a
rebuild of something, say the partition descriptor, and find that the
new one is identical to the old one, then free the new one and keep
the old one. This means that it's safe to rely on the pointer not
changing underneath you provided you hold a lock strong enough to
prevent any DDL that might change the partition descriptor. But note
that something like this is essential for any concurrent DDL to work
at all. Without this, concurrent DDL that changes some completely
unrelated property of the table (e.g. the reloptions) would cause a
relcache rebuild that would invalidate cached pointers to the
partition descriptor, leading to crashes.

With respect to this particular patch, I don't know whether there are
any hazards of the second type. What I'd check, if it were me, is
what structures in the index's relcache entry are going to get rebuilt
when the index name changes. If any of those look like things that
something that somebody could hold a pointer to during the course of
query execution, or more generally be relying on not to change during
the course of query execution, then you've got a problem. As to the
first category, I suspect it's possible to construct cases where the
fact that the index's name hasn't change fails to get noticed for an
alarmingly long period of time after the change has happened. I also
suspect that an appropriate fix might be to ensure that
AcceptInvalidationMessages() is run at least once at the beginning of
parse analysis. I don't think that will make this kind of thing
race-free, but if you can't demonstrate a case where the new name
fails to be noticed immediately without resorting to setting
breakpoints with a debugger, it's probably good enough. We might need
to worry about whether the extra call to AcceptInvalidationMessages()
costs enough to be noticeable, but I hope it doesn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Alter index rename concurrently to at 2018-07-18 09:20:57 from Peter Eisentraut

Responses

Re: Alter index rename concurrently to at 2018-07-31 21:10:28 from Peter Eisentraut

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bossart, Nathan	2018-07-27 14:40:42	Re: BUG #15182: Canceling authentication due to timeout aka Denial of Service Attack
Previous Message	Robert Haas	2018-07-27 13:44:54	Re: PartitionDispatch's partdesc field