Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Andres Freund <andres(at)anarazel(dot)de>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2022-03-31 16:21:59
Message-ID: 20220331162159.2jh6553qi7ix6uac@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-03-31 13:22:24 +0530, Dilip Kumar wrote:
> 0001 is changing the strategy to file copy during initdb and 0002
> patch adds the test cases for both these cases.

Thanks!

> From 4a997e2a95074a520777cd2b369f9c728b360969 Mon Sep 17 00:00:00 2001
> From: Dilip Kumar <dilipkumar(at)localhost(dot)localdomain>
> Date: Thu, 31 Mar 2022 10:43:16 +0530
> Subject: [PATCH 1/2] Use file_copy strategy during initdb
>
> Because skipping the checkpoint during initdb will not result
> in significant savings, so there is no point in using wal_log
> as that will simply increase the cluster size by generating
> extra wal.
> ---
> src/bin/initdb/initdb.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index 5e36943..1256082 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -1856,6 +1856,11 @@ make_template0(FILE *cmdfd)
> * it would fail. To avoid that, assign a fixed OID to template0 rather
> * than letting the server choose one.
> *
> + * Using file_copy strategy is preferable over wal_log here because
> + * skipping the checkpoint during initdb will not result in significant
> + * savings, so there is no point in using wal_log as that will simply
> + * increase the cluster size by generating extra wal.

It's not just the increase in size, it's also the increase in time due to WAL logging.

> * (Note that, while the user could have dropped and recreated these
> * objects in the old cluster, the problem scenario only exists if the OID
> * that is in use in the old cluster is also used in the new cluster - and
> @@ -1863,7 +1868,7 @@ make_template0(FILE *cmdfd)
> */
> static const char *const template0_setup[] = {
> "CREATE DATABASE template0 IS_TEMPLATE = true ALLOW_CONNECTIONS = false OID = "
> - CppAsString2(Template0ObjectId) ";\n\n",
> + CppAsString2(Template0ObjectId) " STRATEGY = file_copy;\n\n",

I'd perhaps break this into a separate line, but...

> From d0759bcfc4fed674e938e4a03159f5953ca9718d Mon Sep 17 00:00:00 2001
> From: Dilip Kumar <dilipkumar(at)localhost(dot)localdomain>
> Date: Thu, 31 Mar 2022 12:07:19 +0530
> Subject: [PATCH 2/2] Create database test coverage
>
> Test create database strategy wal replay and alter database
> set tablespace.
> ---
> src/test/modules/test_misc/t/002_tablespace.pl | 12 ++++++++++++
> src/test/recovery/t/001_stream_rep.pl | 24 ++++++++++++++++++++++++
> 2 files changed, 36 insertions(+)
>
> diff --git a/src/test/modules/test_misc/t/002_tablespace.pl b/src/test/modules/test_misc/t/002_tablespace.pl
> index 04e5439..f3bbddc 100644
> --- a/src/test/modules/test_misc/t/002_tablespace.pl
> +++ b/src/test/modules/test_misc/t/002_tablespace.pl
> @@ -83,7 +83,19 @@ $result = $node->psql('postgres',
> "ALTER TABLE t SET tablespace regress_ts1");
> ok($result == 0, 'move table in-place->abs');
>
> +# Test ALTER DATABASE SET TABLESPACE
> +$result = $node->psql('postgres',
> + "CREATE DATABASE testdb TABLESPACE regress_ts1");
> +ok($result == 0, 'create database in tablespace 1');
> +$result = $node->psql('testdb',
> + "CREATE TABLE t ()");
> +ok($result == 0, 'create table in testdb database');
> +$result = $node->psql('postgres',
> + "ALTER DATABASE testdb SET TABLESPACE regress_ts2");
> +ok($result == 0, 'move database to tablespace 2');

This just tests the command doesn't fail, but not whether it actually did
something useful. Seem we should at least insert a row or two into the the
table, and verify they can be accessed?

> +# Create database with different strategies and check its presence in standby
> +$node_primary->safe_psql('postgres',
> + "CREATE DATABASE testdb1 STRATEGY = FILE_COPY; ");
> +$node_primary->safe_psql('testdb1',
> + "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
> +$node_primary->safe_psql('postgres',
> + "CREATE DATABASE testdb2 STRATEGY = WAL_LOG; ");
> +$node_primary->safe_psql('testdb2',
> + "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
> +
> +# Wait for standbys to catch up
> +$primary_lsn = $node_primary->lsn('write');
> +$node_primary->wait_for_catchup($node_standby_1, 'replay', $primary_lsn);
> +
> +$result =
> + $node_standby_1->safe_psql('testdb1', "SELECT count(*) FROM tab_int");
> +print "standby 1: $result\n";
> +is($result, qq(10), 'check streamed content on standby 1');
> +
> +$result =
> + $node_standby_1->safe_psql('testdb2', "SELECT count(*) FROM tab_int");
> +print "standby 1: $result\n";
> +is($result, qq(10), 'check streamed content on standby 1');
> +
> # Check that only READ-only queries can run on standbys
> is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
> 3, 'read-only queries on standby 1');

I'd probably add a function for creating database / table and then testing it,
with a strategy parameter. That way we can afterwards add more tests verifying
that everything worked.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-03-31 16:25:18 Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Previous Message Andres Freund 2022-03-31 16:10:02 Re: pgsql: Add 'basebackup_to_shell' contrib module.