Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2021-09-02 15:22:23
Message-ID: CA+TgmoZQOCWztZV5KrCT=_78vFsX6bO65RXTU5aE+u7zoCJ+qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 2, 2021 at 2:06 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> 0003- The main patch for WAL logging the created database operation.

Andres pointed out that this approach ends up accessing relations
without taking a lock on them. It doesn't look like you did anything
about that.

+ /* Built-in oids are mapped directly */
+ if (classForm->oid < FirstGenbkiObjectId)
+ relfilenode = classForm->oid;
+ else if (OidIsValid(classForm->relfilenode))
+ relfilenode = classForm->relfilenode;
+ else
+ continue;

Am I missing something, or is this totally busted?

[rhaas pgsql]$ createdb
[rhaas pgsql]$ psql
psql (15devel)
Type "help" for help.

rhaas=# select oid::regclass from pg_class where relfilenode not in
(0, oid) and oid < 10000;
oid
-----
(0 rows)

rhaas=# vacuum full pg_attrdef;
VACUUM
rhaas=# select oid::regclass from pg_class where relfilenode not in
(0, oid) and oid < 10000;
oid
--------------------------------
pg_attrdef_adrelid_adnum_index
pg_attrdef_oid_index
pg_toast.pg_toast_2604
pg_toast.pg_toast_2604_index
pg_attrdef
(5 rows)

/*
+ * Now drop all buffers holding data of the target database; they should
+ * no longer be dirty so DropDatabaseBuffers is safe.

The way things worked before, this was true, but now AFAICS it's
false. I'm not sure whether that means that DropDatabaseBuffers() here
is actually unsafe or whether it just means that you haven't updated
the comment to explain the reason.

+ * Since we copy the file directly without looking at the shared buffers,
+ * we'd better first flush out any pages of the source relation that are
+ * in shared buffers. We assume no new changes will be made while we are
+ * holding exclusive lock on the rel.

Ditto.

+ /* As always, WAL must hit the disk before the data update does. */

Actually, the way it's coded now, part of the on-disk changes are done
before WAL is issued, and part are done after. I doubt that's the
right idea. There's nothing special about writing the actual payload
bytes vs. the other on-disk changes (creating directories and files).
In any case the ordering deserves a better-considered comment than
this one.

+ XLogRegisterData((char *) PG_MAJORVERSION, nbytes);

Surely this is utterly pointless.

+ CopyDatabase(src_dboid, dboid, src_deftablespace, dst_deftablespace);
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
PointerGetDatum(&fparms));

I'd leave braces around the code for which we're ensuring error
cleanup even if it's just one line.

+ if (info == XLOG_DBASEDIR_CREATE)
{
xl_dbase_create_rec *xlrec = (xl_dbase_create_rec *) XLogRecGetData(record);

Seems odd to rename the record but not change the name of the struct.
I think I would be inclined to keep the existing record name, but if
we're going to change it we should be more thorough.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-09-02 15:28:27 Re: replay of CREATE TABLESPACE eats data at wal_level=minimal
Previous Message Jaime Casanova 2021-09-02 15:20:50 Re: shared-memory based stats collector