Re: In-placre persistance change of a relation

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: jakub(dot)wartak(at)tomtom(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: In-placre persistance change of a relation
Date: 2022-01-14 02:43:10
Message-ID: 20220114.114310.2172685740667639154.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I found a bug.

mdmarkexists() didn't close the tentatively opend fd. This is a silent
leak on Linux and similars and it causes delete failure on Windows.
It was the reason of the CI failure.

027_persistence_change.pl uses interactive_psql() that doesn't work on
the Windos VM on the CI.

In this version the following changes have been made in 0001.

- Properly close file descriptor in mdmarkexists.

- Skip some tests when IO::Pty is not available.
It might be better to separate that part.

Looking again the ALTER TABLE ALL IN TABLESPACE SET LOGGED patch, I
noticed that it doesn't implement OWNED BY part and doesn't have test
and documenttaion (it was PoC). Added all of them to 0002.

At Tue, 11 Jan 2022 09:33:55 +0000, Jakub Wartak <jakub(dot)wartak(at)tomtom(dot)com> wrote in
> The following review has been posted through the commitfest application:
> make installcheck-world: tested, passed
> Implements feature: tested, passed
> Spec compliant: tested, passed
> Documentation: not tested
>
> I've retested v15 of the patch with everything that came to my mind. The patch passes all my tests (well, there's this just windows / cfbot issue). Patch looks good to me. I haven't looked in-depth at the code, so patch might still need review.

Thanks for checking.

> FYI, about potential usage of this patch: the most advanced test that I did was continually bouncing wal_level - it works. So chain of :
> 1. wal_level=replica->minimal
> 2. alter table set unlogged and load a lot of data, set logged
> 3. wal_level=minimal->replica
> 4. barman incremental backup # rsync(1) just backups changed files in steps 2 and 3 (not whole DB)
> 5. some other (logged) work
> The idea is that when performing mass-alterations to the DB (think nightly ETL/ELT on TB-sized DBs), one could skip backing up most of DB and then just quickly backup only the changed files - during the maintenance window - e.g. thanks to local-rsync barman mode. This is the output of barman show-backups after loading data to unlogged table each such cycle:
> mydb 20220110T100236 - Mon Jan 10 10:05:14 2022 - Size: 144.1 GiB - WAL Size: 16.0 KiB
> mydb 20220110T094905 - Mon Jan 10 09:50:12 2022 - Size: 128.5 GiB - WAL Size: 80.2 KiB
> mydb 20220110T094016 - Mon Jan 10 09:40:17 2022 - Size: 109.1 GiB - WAL Size: 496.3 KiB
> And dedupe ratio of the last one: Backup size: 144.1 GiB. Actual size on disk: 36.1 GiB (-74.96% deduplication ratio).

Ah, The patch skips duping relation files. This is advantageous that
that not only eliminates the I/O activities the duping causes but also
reduce the size of incremental backup. I didn't noticed only the
latter advantage.

> The only thing I've found out that bouncing wal_level also forces max_wal_senders=X -> 0 -> X which in turn requires dropping replication slot for pg_receievewal (e.g. barman receive-wal --create-slot/--drop-slot/--reset). I have tested the restore using barman recover afterwards to backup 20220110T094905 and indeed it worked OK using this patch too.

Year, it is irrelevant to this patch but I'm annoyed by the
restriction. I think it would be okay that max_wal_senders is
forcibly set to 0 while wal_level=minimal..

> The new status of this patch is: Needs review

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v16-0001-In-place-table-persistence-change.patch text/x-patch 82.6 KB
v16-0002-New-command-ALTER-TABLE-ALL-IN-TABLESPACE-SET-LO.patch text/x-patch 20.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-01-14 02:44:02 Re: Schema variables - new implementation for Postgres 15
Previous Message Masahiko Sawada 2022-01-14 02:25:02 Re: Skipping logical replication transactions on subscriber side