Re: [HACKERS] WAL logging problem in 9.4.3?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: noah(at)leadboat(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, 9erthalion6(at)gmail(dot)com, andrew(dot)dunstan(at)2ndquadrant(dot)com, hlinnaka(at)iki(dot)fi, michael(at)paquier(dot)xyz
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?
Date: 2020-01-21 09:45:57
Message-ID: 20200121.184557.1462314355964996736.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the comment.

At Sat, 18 Jan 2020 19:51:39 -0800, Noah Misch <noah(at)leadboat(dot)com> wrote in
> On Tue, Jan 14, 2020 at 07:35:22PM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 26 Dec 2019 12:46:39 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > > At Wed, 25 Dec 2019 16:15:21 -0800, Noah Misch <noah(at)leadboat(dot)com> wrote in
> > > > === Defect 1: Forgets to skip WAL after SAVEPOINT; DROP TABLE; ROLLBACK TO
...
> > This is complex than expected. The DROP TABLE unconditionally removed
> > relcache entry. To fix that, I tried to use rd_isinvalid but it failed
> > because there's a state that a relcache invalid but the corresponding
> > catalog entry is alive.
> >
> > In the attached patch 0002, I added a boolean in relcache that
> > indicates that the relation is already removed in catalog but not
> > committed.
>
> This design could work, but some if its properties aren't ideal. For example,
> RelationIdGetRelation() can return a !rd_isvalid relation when the relation
> has been dropped. What others designs did you consider, if any?

I thought that the entries with rd_isdropped is true cannot be fetched
by other transactions because the relid is not seen there. Still the
same session could do that by repeatedly reindexing or invalidation on
the same relation and I think it is safe because the content of the
entry coudln't be changed and the cache content is reusable. That
being said, it is actually makes things unclear.

I came up with two alternatives. One is a variant of
RelationIdGetRelation for the purpose. The new function
RelationIdGetRelationCache is currently used (only) in ATExecAddIndex
so we could restrict it to return only dropped relations.

Another is another "stashed" relcache, but it seems to make things too
complex.

> On Thu, Jan 16, 2020 at 02:20:57PM +0900, Kyotaro Horiguchi wrote:
> > --- a/src/backend/utils/cache/relcache.c
> > +++ b/src/backend/utils/cache/relcache.c
> > @@ -3114,8 +3153,10 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
> > */
> > if (relation->rd_createSubid != InvalidSubTransactionId)
> > {
> > - if (isCommit)
> > - relation->rd_createSubid = InvalidSubTransactionId;
> > + relation->rd_createSubid = InvalidSubTransactionId;
> > +
> > + if (isCommit && !relation->rd_isdropped)
> > + {} /* Nothing to do */
>
> What is the purpose of this particular change? This executes at the end of a
> top-level transaction. We've already done any necessary syncing, and we're
> clearing any flags that caused WAL skipping. I think it's no longer
> productive to treat dropped relations differently.

It executes the pending *relcache* drop we should have done in
ATPostAlterTypeCleanup (or in RelationClearRelation) if the newness
flags were false. The entry misses the chance of being removed (then
bloats the relcache) if we don't do that there. I added a comment
there to exlaining that.

> > @@ -3232,6 +3272,19 @@ AtEOSubXact_cleanup(Relation relation, bool isCommit,
> > }
> > }
> >
> > + /*
> > + * If this relation registered pending sync then dropped, subxact rollback
> > + * cancels the uncommitted drop, and commit propagates it to the parent.
> > + */
> > + if (relation->rd_isdropped)
> > + {
> > + Assert (!relation->rd_isvalid &&
> > + (relation->rd_createSubid != InvalidSubTransactionId ||
> > + relation->rd_firstRelfilenodeSubid != InvalidSubTransactionId));
> > + if (!isCommit)
> > + relation->rd_isdropped = false;
>
> This does the wrong thing when there exists some subtransaction rollback that
> does not rollback the DROP:

Sorry for my stupid. I actually thought something like that on the
way. After all, I concluded that the dropped flag ought to behave same
way with rd_createSubid.

> \pset null 'NULL'
> begin;
> create extension pg_visibility;
> create table droppedtest (c int);
> select 'droppedtest'::regclass::oid as oid \gset
> savepoint q; drop table droppedtest; release q; -- rd_dropped==true
> select * from pg_visibility_map(:oid); -- processes !rd_isvalid rel (not ideal)
> savepoint q; select 1; rollback to q; -- rd_dropped==false (wrong)
> savepoint q; select 1; rollback to q;
> select pg_relation_size(:oid), pg_relation_filepath(:oid),
> has_table_privilege(:oid, 'SELECT'); -- all nulls, okay
> select * from pg_visibility_map(:oid); -- assertion failure
> rollback;

And I teached RelationIdGetRelation not to return dropped
relations. So the (not ideal) cases just fail as before.

Three other fixes not mentined above are made. One is the useless
rd_firstRelfilenodeSubid in the condition to dicide whether to
preserve or not a relcache entry, and the forgotten copying of other
newness flags. Another is the forgotten SWAPFIELD on
rd_dropedSubid. The last is the forgotten change in
out/equal/copyfuncs.

Please find the attached.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v33-0001-Rework-WAL-skipping-optimization.patch text/x-patch 98.7 KB
v33-0002-Fix-the-defect-1.patch text/x-patch 10.3 KB
v33-0003-Fix-the-defect-2.patch text/x-patch 4.7 KB
v33-0004-Fix-the-defect-3.patch text/x-patch 4.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matteo Beccati 2020-01-21 09:55:46 Re: [HACKERS] kqueue
Previous Message Luis Carril 2020-01-21 09:36:33 Re: Option to dump foreign data in pg_dump