Re: VACUUM FULL versus unsafe order-of-operations in DDL commands

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: VACUUM FULL versus unsafe order-of-operations in DDL commands
Date: 2011-08-14 20:31:53
Message-ID: CA+TgmoZY6=KPuE2SBCEZbYgVi1xPhBFm1F=FQ+b47Qsv7u-vKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Aug 14, 2011 at 2:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So, as the testing rolls on, I started to see some failures in various
> ALTER-FOREIGN-thingy commands.  The cause proved to be that numerous
> places in foreigncmds.c do this:
>
>        tuple = SearchSysCacheCopy(...);
>
>        ... alter the tuple as needed ...
>
>        rel = heap_open(target-catalog, RowExclusiveLock);
>
>        simple_heap_update(rel, &tuple->t_self, tuple);
>
>        heap_close(rel, RowExclusiveLock);
>
> rather than the more common pattern in which the catalog is opened
> first.

Interesting. I vaguely recall flipping some of those around (to put
the lock acquisition first) before committing the 9.1-era foreign
table patch; it didn't seem like an entirely healthy thing to do. But
I didn't really have any concrete notion of why it might be dangerous.

> foreigncmds.c is not hard to fix, but the scary aspect of this is the
> possibility that we've made the same mistake elsewhere, or might do so
> again in future.  Some desultory examination of simple_heap_update and
> simple_heap_delete calls didn't find any other instances, but I am not
> sure I didn't miss anything.  And this seems like an easy trap to fall
> into when refactoring (the current work to try to unify operations like
> ALTER OWNER could easily get into this kind of problem, for instance).
>
> I tried to think of some practical way to mechanically test for this
> type of error, but came up with nothing.  Any ideas?

Hmm. How about setting the TID to an illegal value of some kind when
a catcache tuple is extracted without a table lock? Then any
subsequent update or delete using that tuple would blow up. I think
that'd be way too expensive to do in normal running but perhaps we
could have a #define...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-08-14 23:22:37 Re: VACUUM FULL versus TOAST
Previous Message Alexander Korotkov 2011-08-14 19:30:43 Re: WIP: Fast GiST index build