Re: [PATCH]make pg_rewind to not copy useless WAL files

From: chenhj <chjischj(at)163(dot)com>
To: "Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH]make pg_rewind to not copy useless WAL files
Date: 2017-09-29 07:07:46
Message-ID: 2966b75f.c896.15ecc766ac3.Coremail.chjischj@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-09-29 05:31:51, "Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru> wrote:

On Thu, Sep 28, 2017 at 10:52 PM, chenhj <chjischj(at)163(dot)com> wrote:

On 2017-09-29 00:43:18,"Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru> wrote:

On Thu, Sep 28, 2017 at 6:44 PM, chenhj <chjischj(at)163(dot)com> wrote:

On 2017-09-28 01:29:29,"Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru> wrote:

It appears that your patch conflicts with fc49e24f. Please, rebase it.

Yes, i had rebased it, Please check the new patch.

Good, now it applies cleanly.

else if (strncmp(path, XLOGDIR"/", strlen(XLOGDIR"/")) == 0 &&
IsXLogFileName(path + strlen(XLOGDIR"/")) &&
(strcmp(path + strlen(XLOGDIR"/") + 8, divergence_wal_filename + 8) < 0 ||
strcmp(path + strlen(XLOGDIR"/") + 8, last_source_wal_filename + 8) > 0))

According to our conding style, you should leave a space betwen XLOGDIF and "/".
Also, you do a trick by comparison xlog segment numbers using strcmp(). It's nice, but I would prefer seeing XLogFromFileName() here. It would improve code readability and be less error prone during further modifications.

Thanks for advice!
I had modified it.

OK. Patch becomes better.
I also have more general question. Why do we need upper bound for segment number (last_source_segno)? I understand the purpose of lower bound (divergence_segno) which save us from copying extra WAL files, but what is upper bound for? As I understood, we anyway need to replay most recent WAL records to reach consistent state after pg_rewind. I propose to remove last_source_segno unless I'm missing something.

Thanks for relay!
When checkpoint occurs, some old WAL files will be renamed as future WAL files for later use.
The upper bound for segment number (last_source_segno) is used to avoid copying these extra WAL files.

When the parameter max_wal_size or max_min_size is large,these may be many renamed old WAL files for reused.

For example, I have just looked at one of our production systems (max_wal_size = 64GB, min_wal_size = 2GB),
the total size of WALs is about 30GB, and contains about 4GB renamed old WAL files.

[postgres(at)hostxxx pg_xlog]$ ll
...
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000078
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF00000079
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007A
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007B
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007C
-rw------- 1 postgres postgres 16777216 Sep 29 14:05 0000000100000BCF0000007D
-rw------- 1 postgres postgres 16777216 Sep 29 11:22 0000000100000BCF0000007E //after this, there are about 4GB WALs for reuse
-rw------- 1 postgres postgres 16777216 Sep 29 11:08 0000000100000BCF0000007F
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000080
-rw------- 1 postgres postgres 16777216 Sep 29 12:05 0000000100000BCF00000081
-rw------- 1 postgres postgres 16777216 Sep 29 11:28 0000000100000BCF00000082
-rw------- 1 postgres postgres 16777216 Sep 29 11:06 0000000100000BCF00000083
...

-----
Best Regards,
Chen Huajun

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2017-09-29 07:09:21 Re: UPDATE of partition key
Previous Message Fabien COELHO 2017-09-29 06:48:30 Re: pgbench stuck with 100% cpu usage