Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2022-03-22 20:44:22
Message-ID: CA+TgmoYKovODW2Y7rQmmRFaKu445p9uAahjpgfbY8eyeL07BXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 21, 2022 at 2:23 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Mar 21, 2022 at 11:21 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > I tried to debug the case but I realized that somehow
> > CHECK_FOR_INTERRUPTS() is not calling the
> > AcceptInvalidationMessages() and I could not find the same while
> > looking into the code as well. While debugging I noticed that
> > AcceptInvalidationMessages() is called multiple times but that is only
> > through LockRelationId() but while locking the relation we had already
> > closed the previous smgr because at a time we keep only one smgr open.
> > And that's the reason it is not hitting the issue which we think it
> > could. Is there any condition under which it will call
> > AcceptInvalidationMessages() through CHECK_FOR_INTERRUPTS() ? because
> > I could not see while debugging as well as in code.
>
> Yeah, I think the reason you can't find it is that it's not there. I
> was confused in what I wrote earlier. I think we only process sinval
> catchups when we're idle, not at every CHECK_FOR_INTERRUPTS(). And I
> think the reason for that is precisely that it would be hard to write
> correct code otherwise, since invalidations might then get processed
> in a lot more places. So ... I guess all we really need to do here is
> avoid assuming that the results of smgropen() are valid across any
> code that might acquire relation locks. Which I think the code already
> does.

So I talked to Andres and Thomas about this and they told me that I
was right to worry about this problem. Over on the thread about "wrong
fds used for refilenodes after pg_upgrade relfilenode changes
Reply-To:" there is a plan to make use ProcSignalBarrier to make smgr
objects disappear, and ProcSignalBarrier can be processed at any
CHECK_FOR_INTERRUPTS(), so then we'd have a problem here. Commit
f10f0ae420ee62400876ab34dca2c09c20dcd030 established a policy that you
should always re-fetch the smgr object instead of reusing one you've
already got, and even before that it was known to be unsafe to keep
them around for any period of time, because anything that opened a
relation, including a syscache lookup, could potentially accept
invalidations. So most of our code is already hardened against the
possibility of smgr objects disappearing. I have a feeling there may
be some that isn't, but it would be good if this patch didn't
introduce more such code at the same time that patch is trying to
introduce more ways to get rid of smgr objects. It was suggested to me
that what this patch ought to be doing is calling
CreateFakeRelcacheEntry() and then using RelationGetSmgr(fakerel)
every time we need the SmgrRelation, without ever keeping it around
for any amount of code. That way, if the smgr relation gets closed out
from under us at a CHECK_FOR_INTERRUPTS(), we'll just recreate it at
the next RelationGetSmgr() call.

Andres also noted that he thinks the patch performs redundant cleanup,
because of the fact that it uses RelationCreateStorage. That will
arrange to remove files on abort, but createdb() also has its own
mechanism for that. It doesn't seem like a thing to do twice in two
different ways.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-03-22 20:55:49 Re: MDAM techniques and Index Skip Scan patch
Previous Message Jacob Champion 2022-03-22 20:42:37 Re: [PATCH] Accept IP addresses in server certificate SANs