pg_rewind copies

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Dimitri Fontaine <dim(at)tapoueh(dot)org>
Subject: pg_rewind copies
Date: 2020-11-13 09:46:01
Message-ID: f67feb24-5833-88cb-1020-19a4a2b83ac7@iki.fi
Lists: pgsql-hackers

If a file is modified and becomes larger in the source system while
pg_rewind is running, pg_rewind can leave behind a partial copy of the file.
That's by design, and it's OK for relation files because they're
replayed from WAL. But it can cause trouble for configuration files.
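
To make the failure mode concrete, here is a minimal Python sketch of a
copier that trusts a file size recorded earlier. It is an illustration only,
not pg_rewind's actual code: the helper name copy_with_recorded_size and the
file contents are assumptions made up for the example.

```python
import os
import tempfile


def copy_with_recorded_size(src_path, dst_path, recorded_size):
    """Copy exactly recorded_size bytes, trusting a size measured earlier.

    Hypothetical helper for illustration; not a pg_rewind function.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(src.read(recorded_size))


workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "postgresql.auto.conf")
dst_path = os.path.join(workdir, "postgresql.auto.conf.copy")

# State of the file when the copier builds its file list.
with open(src_path, "wb") as f:
    f.write(b"# Do not edit this file manually!\n")
recorded_size = os.path.getsize(src_path)

# Something like ALTER SYSTEM rewrites the file, making it larger,
# before the copy actually happens. (Made-up setting value.)
new_contents = b"primary_conninfo = 'host=node-b port=5432'\n"
with open(src_path, "wb") as f:
    f.write(new_contents)

copy_with_recorded_size(src_path, dst_path, recorded_size)

with open(dst_path, "rb") as f:
    copied = f.read()

# The copy stops at the old size, in the middle of the quoted value.
print(copied)
```

The copy ends partway through the quoted string, which is the kind of
content that produces a syntax error near a "'" token at startup.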

I ran into this while playing with pg_auto_failover. After failover,
pg_auto_failover would often launch pg_rewind, and run ALTER SYSTEM on
the primary while pg_rewind was running. The resulting rewound system
would fail to start up:

Nov 13 09:24:42 pg-node-a pg_autoctl[2217]: 09:24:42 2220 ERROR
2020-11-13 09:24:32.547 GMT [2246] LOG: syntax error in file
"/data/pgdata/postgresql.auto.conf" line 4, near token "'"
Nov 13 09:24:42 pg-node-a pg_autoctl[2217]: 09:24:42 2220 ERROR
2020-11-13 09:24:32.547 GMT [2246] FATAL: configuration file
"postgresql.auto.conf" contains errors

Attached is a patch to mitigate that. It changes pg_rewind so that when
it copies a whole file, it ignores the original file size. It's not a
complete cure: it still believes the original size for files larger than
1 MB. That limit was just expedient given the way the chunking logic in
libpq_source.c works, but should be enough for configuration files.
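
As a rough sketch of the idea (not the patch itself), the logic could look
like the Python below. The names fetch_whole_file and MAX_CHUNK_SIZE, and the
exact comparison against the limit, are assumptions for illustration; the
real chunking lives in C in libpq_source.c and may differ in detail.

```python
import io

# Assumed limit for illustration, matching the 1 MB mentioned above.
MAX_CHUNK_SIZE = 1024 * 1024


def fetch_whole_file(src, recorded_size, max_chunk=MAX_CHUNK_SIZE):
    """Return the bytes to write for a whole-file copy.

    src is a readable binary stream positioned at the start of the file.
    recorded_size is the size seen when the file list was built.
    Hypothetical helper; not a pg_rewind function.
    """
    if recorded_size <= max_chunk:
        # Small file: ignore the recorded size and read to end of file,
        # so growth since the file list was built is picked up.
        return src.read()
    # Large file: still trust the recorded size (the remaining gap).
    return src.read(recorded_size)


# A small file that grew from 10 bytes to 30 bytes after it was listed.
grown_small = io.BytesIO(b"x" * 30)
small_result = fetch_whole_file(grown_small, recorded_size=10)

# A large file that grew by 5 bytes; only the recorded size is fetched.
large_recorded = 2 * MAX_CHUNK_SIZE
grown_large = io.BytesIO(b"y" * (large_recorded + 5))
large_result = fetch_whole_file(grown_large, recorded_size=large_recorded)

print(len(small_result), len(large_result))
```

The small file comes back at its current length, while the large one is
still cut at the recorded size, which is the part this approach does not cure.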

There's another race condition that this doesn't try to fix: if a file
is modified while it's being copied, you can end up with a torn file, with
one half from the old version and the other half from the new. That's
a much narrower window, though, and pg_basebackup has the same problem.

- Heikki

Attachment Content-Type Size
0001-pg_rewind-Fetch-small-files-according-to-new-size.patch text/x-patch 10.0 KB
