Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2021-11-24 04:19:32
Message-ID: CAFiTN-v=U58by_BeiZruNhykxk1q9XUxF+qLzD2LZAsEn2EBkg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 23, 2021 at 10:29 PM John Naylor
<john(dot)naylor(at)enterprisedb(dot)com> wrote:
>
> Hi,
>
> I've looked over this patch set and email thread a couple times, and I don't see anything amiss, but I'm also not terribly familiar with the subsystems this part of the code relies on. I haven't yet tried to stress test with a large database, but it seems like a good idea to do so.

Thanks, John, for looking into the patches. Yeah, that makes sense;
next week I will try to test with a large database, and maybe with
multiple tablespaces as well, to see how this behaves.

> I have a couple comments and questions:
>
> 0006:
>
> + * XXX We can optimize RelationMapOidToFileenodeForDatabase API
> + * so that instead of reading the relmap file every time, it can
> + * save it in a temporary variable and use it for subsequent
> + * calls. Then later reset it once we're done or at the
> + * transaction end.
>
> Do we really need to consider optimizing here? Only a handful of relations will be found in the relmap, right?

You are right, it is actually not required; I will remove this comment.

>
> + * Once we start copying files from the source database, we need to be able
> + * to clean 'em up if we fail. Use an ENSURE block to make sure this
> + * happens. (This is not a 100% solution, because of the possibility of
> + * failure during transaction commit after we leave this routine, but it
> + * should handle most scenarios.)
>
> This comment in master started with
>
> - * Once we start copying subdirectories, we need to be able to clean 'em
>
> Is the distinction important enough to change this comment? Also, is "most scenarios" still true with the patch? I haven't read into how ENSURE works.

Actually, it is like a PG_TRY()/PG_CATCH() block, with the extra
assurance that the cleanup also runs on shmem_exit. In the cleanup
function, we go through all the tablespaces and remove the new
database directory we were trying to create. And you are right, we
don't actually need to change the comments.
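
For reference, the usage pattern looks roughly like the sketch below
(a simplified, non-compilable illustration of the ENSURE macros from
src/include/storage/ipc.h; the callback and parameter names are only
indicative of what the createdb path does):

```c
/*
 * Sketch only: the cleanup callback fires both on ERROR (like
 * PG_CATCH) and via before_shmem_exit, so partially copied
 * directories are removed even on abnormal backend exit.
 */
PG_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
                        PointerGetDatum(&fparms));
{
    /* copy source database files into the new directories ... */
}
PG_END_ENSURE_ERROR_CLEANUP(createdb_failure_callback,
                            PointerGetDatum(&fparms));
```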

> Same with this comment change, seems fine the way it was:

Correct.

> - * Use an ENSURE block to make sure we remove the debris if the copy fails
> - * (eg, due to out-of-disk-space). This is not a 100% solution, because
> - * of the possibility of failure during transaction commit, but it should
> - * handle most scenarios.
> + * Use an ENSURE block to make sure we remove the debris if the copy fails.
> + * This is not a 100% solution, because of the possibility of failure
> + * during transaction commit, but it should handle most scenarios.
>
> And do we need additional tests? Maybe we don't, but it seems good to make sure.
>
> I haven't looked at 0007, and I have no opinion on which approach is better.

Okay, I like the 0006 approach, mainly for two reasons: 1) it does not
directly scan the raw file to identify which files to copy, which
seems cleaner to me; 2) with 0007, if we scan the directory directly
we don't know the relation OID, so there is no way to acquire the
relation lock before acquiring the buffer lock.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
