Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Neha Sharma <neha(dot)sharma(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2021-12-22 10:56:18
Message-ID: CAE9k0PkPyDYm9joSN6myRFN__taXvFUqf1e8oR=OP0GVYDYkwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 22, 2021 at 2:44 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

> On Tue, Dec 21, 2021 at 11:10 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
> wrote:
> >
> > I am getting the below error when running the same test-case that Neha
> shared in her previous email.
> >
> > ERROR: 55000: some relations of database "test1" are already in
> tablespace "tab1"
> > HINT: You must move them back to the database's default tablespace
> before using this command.
> > LOCATION: movedb, dbcommands.c:1555
> >
> > test-case:
> > ========
> > create tablespace tab1 location '/home/ashu/test1';
> > create tablespace tab location '/home/ashu/test';
> >
> > create database test tablespace tab;
> > \c test
> >
> > create table t(a int primary key, b text);
> >
> > create or replace function large_val() returns text language sql as
> 'select array_agg(md5(g::text))::text from generate_series(1, 256) g';
> >
> > insert into t values (generate_series(1,100000), large_val());
> >
> > alter table t set tablespace tab1 ;
> >
> > \c postgres
> > create database test1 template test;
> >
> > \c test1
> > alter table t set tablespace tab;
> >
> > \c postgres
> > alter database test1 set tablespace tab1; -- this fails with the given
> error.
> >
> > Observations:
> > ===========
> > Please note that before running above alter database statement, the
> table 't' is moved to tablespace 'tab' from 'tab1' so not sure why
> ReadDir() is returning true when searching for table 't' in tablespace
> 'tab1'. It should have returned NULL here:
> >
> > while ((xlde = ReadDir(dstdir, dst_dbpath)) != NULL)
> > {
> > if (strcmp(xlde->d_name, ".") == 0 ||
> > strcmp(xlde->d_name, "..") == 0)
> > continue;
> >
> > ereport(ERROR,
> > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > errmsg("some relations of database \"%s\" are
> already in tablespace \"%s\"",
> > dbname, tblspcname),
> > errhint("You must move them back to the database's
> default tablespace before using this command.")));
> > }
> >
> > Also, if I run the checkpoint explicitly before executing the above
> alter database statement, this error doesn't appear which means it only
> happens with the new changes because earlier we were doing the force
> checkpoint at the end of createdb statement.
> >
>
> Basically, ALTER TABLE SET TABLESPACE, will register the
> SYNC_UNLINK_REQUEST for the table files w.r.t the old tablespace, but
> those will get unlinked during the next checkpoint. Although the
> files must be truncated during commit itself but unlink might not have
> been processed until the next checkpoint. This is the explanation for
> the behavior you found during your investigation, but I haven't looked
> into the issue so I will do it latest by tomorrow and send my
> analysis.
>
> Thanks for working on this.
>

Yeah the problem here is that the old rel file that needs to be unlinked
still exists in the old tablespace. Earlier, without your changes we were
doing force checkpoint before starting with the actual work for the alter
database which unlinked/deleted the rel file from the old tablespace, but
that is not the case here. Now we have removed the force checkpoint from
movedb() which means until the auto checkpoint happens the old rel file
will remain in the old tablespace thereby creating this problem.

--
With Regards,
Ashutosh Sharma.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ajin Cherian 2021-12-22 11:29:43 Re: row filtering for logical replication
Previous Message osumi.takamichi@fujitsu.com 2021-12-22 10:24:05 RE: Optionally automatically disable logical replication subscriptions on error