Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Date: 2019-05-01 19:05:50
Message-ID: 20190501190550.jigb3tem3xxzspad@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-05-01 14:44:12 -0400, Tom Lane wrote:
> I'm busy looking into the REINDEX-on-pg_class mess, and one thing I found
> while testing HEAD is that with CLOBBER_CACHE_ALWAYS on, once you've
> gotten a failure, your session is hosed:
>
> regression=# select 1;
> ?column?
> ----------
> 1
> (1 row)
>
> regression=# reindex index pg_class_oid_index;
> psql: ERROR: could not read block 0 in file "base/16384/35344": read only 0 of 8192 bytes
> regression=# select 1;
> psql: ERROR: could not open file "base/16384/35344": No such file or directory

Yea, I noticed that too. Lead me astray for a while, because it
triggered apparent REINDEX failures for pg_index too, even though not
actually related to the reindex.

> The problem is that the nailed relcache entry for pg_class_oid_index got
> updated to point to the new relfilenode, and that change did not get
> undone by transaction abort. There seem to be several bugs contributing
> to that, but the one I'm asking about right now is that commit ae9aba69a
> caused RelationCacheInvalidate to skip over relations that have
> rd_newRelfilenodeSubid set, as indeed pg_class_oid_index will have at the
> instant of the failure.

That commit message is quite unhelpful about what exactly this was
intended to fix. I assume it was intended to make COPY FREEZE usage
predictable, but...

> This seems quite wrong, because it prevents us from rebuilding the
> entry with corrected values. In particular notice that the change
> causes us to skip the RelationInitPhysicalAddr call that would
> normally be applied to a nailed mapped index in that loop. That's
> completely fatal in this case --- it keeps us from restoring the
> correct relfilenode that the mapper would now tell us, if we only
> bothered to ask.

Indeed. I'm a bit surprised that doesn't lead to more problems.

Not sure I understand where the RelationCacheInvalidate() call is coming
from in this case though. Shouldn't the entry have been invalidated
individually through ATEOXact_Inval(false)?

> I think perhaps what we should do is revert ae9aba69a and instead do
>
> relcacheInvalsReceived++;
>
> - if (RelationHasReferenceCountZero(relation))
> + if (RelationHasReferenceCountZero(relation) &&
> + relation->rd_newRelfilenodeSubid == InvalidSubTransactionId)
> {
> /* Delete this entry immediately */
> Assert(!relation->rd_isnailed);
>
> so that entries with rd_newRelfilenodeSubid nonzero are preserved but are
> rebuilt.

Seems like a reasonable approach.

> There might be an argument for treating rd_createSubid the same way,
> not sure. That field wouldn't ever be set on a system catalog, so
> it's less of a hazard, but there's something to be said for treating
> the two fields alike.

Seems sensible to treat it the same way. Not sure if there's an actual
problem where the current treatment could cause a problem, but seems
worth improving.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-05-01 19:08:56 Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Previous Message Tom Lane 2019-05-01 18:44:12 Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)