Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2021-09-03 08:55:10
Message-ID: CAFiTN-sP_6hWv5AxcwnWCgg=4hyEeeZcCgFucZsYWpr3XQbP1g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 18, 2021 at 12:18 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2021-06-17 14:22:52 -0400, Robert Haas wrote:
> > On Thu, Jun 17, 2021 at 2:17 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > Adding a hacky special case implementation for cross-database relation
> > > accesses that violates all kinds of assumptions (like holding a lock on
> > > a relation when accessing it / pinning pages, processing relcache
> > > invals, ...) doesn't seem like a good plan.
> >
> > I agree that we don't want hacky code that violates assumptions, but
> > bypassing shared_buffers is a bit hacky, too. Can't we lock the
> > relations as we're copying them? We know pg_class's OID a fortiori,
> > and we can find out all the other OIDs as we go.

> We possibly can - but I'm not sure that won't end up violating some
> other assumptions.
>

Yeah, we can surely lock the relations as Robert describes, but IMHO,
while creating the database we are already holding the exclusive lock on
the database and no one else is allowed to be connected to it, so do we
actually need to bother about relation locks for correctness?

> > I'm just thinking that the hackiness of going around shared_buffers
> > feels irreducible, but maybe the hackiness in the patch is something
> > that can be solved with more engineering.
>
> Which bypassing of shared buffers are you talking about here? We'd still
> have to solve a subset of the issues around locking (at least on the
> source side), but I don't think we need to read pg_class contents to be
> able to go through shared_buffers? As I suggested, we can use the init
> fork presence to infer relpersistence?
>

I believe the goal is to avoid scanning pg_class to build the relation
list, so that we can avoid this special-purpose code? IMHO, scanning the
disk, i.e. going through all the tablespaces, finding the source database
directory in each, and identifying each relfilenode, is also
special-purpose code, unless I am missing what you mean by special-purpose
code.
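
To make the filesystem-scan alternative concrete, here is a rough sketch in
Python. It is purely illustrative and is not PostgreSQL code: the function
name and the input list are hypothetical, and it assumes the usual on-disk
naming of <relfilenode>[_<fork>][.<segment>] within a database directory.
It groups file names by relfilenode and infers relpersistence from the
presence of an init fork, as suggested above.

```python
import re
from collections import defaultdict

# <relfilenode>[_<fork>][.<segment>], e.g. "16384", "16384_init", "16384_vm.1"
RELFILE_RE = re.compile(r"^(\d+)(?:_(fsm|vm|init))?(?:\.(\d+))?$")


def classify_relfilenodes(filenames):
    """Group file names by relfilenode; an init fork implies 'unlogged'."""
    forks = defaultdict(set)
    for name in filenames:
        m = RELFILE_RE.match(name)
        if m is None:
            # Skips PG_VERSION, pg_filenode.map, temp files (t<backend>_<node>), etc.
            continue
        forks[int(m.group(1))].add(m.group(2) or "main")
    return {
        node: ("unlogged" if "init" in seen else "permanent")
        for node, seen in forks.items()
    }


# Hypothetical directory listing for one database directory.
listing = ["PG_VERSION", "1259", "1259_fsm", "1259_vm",
           "16384", "16384_init", "16385.1", "t3_16390"]
print(classify_relfilenodes(listing))
```

Even this toy version needs its own rules about which file names to
recognize and which to skip, which is the sense in which the directory scan
is special-purpose too.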

> Or do you mean that looking at the filesystem at all is bypassing shared
> buffers?
>

I think we already have such code in multiple places where we bypass
shared buffers when copying a relation, e.g. index_copy_data() and
heapam_relation_copy_data(). So, as the system stands today, we cannot
claim that this would be the first design that bypasses shared buffers.
And if we think that bypassing shared buffers is hackish and we don't want
to use it in new patches, then let's avoid it completely, including while
identifying the relfilenodes to be copied.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
