Re: Refactor pg_rewind code and make it work against a standby

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: hlinnaka(at)iki(dot)fi
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Refactor pg_rewind code and make it work against a standby
Date: 2020-09-18 07:41:50
Message-ID: 20200918.164150.1688011206252014871.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

It needed rebasing. (Attached)

At Tue, 25 Aug 2020 16:32:02 +0300, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote in
> On 20/08/2020 11:32, Kyotaro Horiguchi wrote:
> > 0002: Rewording that old->target and new->source makes the meaning far
>
> Good idea! I changed the patch that way.

Looks Good.

> > 0003: Thomas is propsing sort template. It could be used if committed.

+ * If the block is beyond the EOF in the source system, or the file doesn't
+ * doesn'exist in the source at all, we're going to truncate/remove it away

"the file doesn't doesn'exist"

I don't think filemap_finalize needs to iterate over filemap twice.

hash_string_pointer is a copy of that of pg_verifybackup.c. Is it
worth having in hashfn.h or .c?

> --- a/src/bin/pg_rewind/pg_rewind.c
> +++ b/src/bin/pg_rewind/pg_rewind.c
> ...
> + filemap_t *filemap;
> ..
> + filemap_init();
> ...
> + filemap = filemap_finalize();

I'm a bit confused by this, and realized that what filemap_init
initializes is *not* the filemap, but the filehash. So for example,
the name of the functions might should be something like this?

filehash_init()
filemap = filehash_finalyze()/create_filemap()

> > 0004:
> > The names of many of the functions gets an additional word "local"
> > but I don't get the meaning clearly. but its about linguistic sense
> > and I'm not fit to that..
> > -rewind_copy_file_range(const char *path, off_t begin, off_t end, bool
> > -trunc)
> > +local_fetch_file_range(rewind_source *source, const char *path,
> > uint64 off,
> > The function actually copying the soruce range to the target file. So
> > "fetch" might give somewhat different meaning, but its about
> > linguistic (omitted..).
>
> Hmm. It is "fetching" the range from the source server, and writing it
> to the target. The term makes more sense with a libpq source. Perhaps
> this function should be called "local_copy_range" or something, but
> it'd also be nice to have "fetch" in the name because the function
> pointer it's assigned to is called "queue_fetch_range".

Thanks. Yeah libpq_fetch_file makes sense. I agree to the name.
The refactoring looks good to me.

> > I'm going to continue reviewing this later.
>
> Thanks! Attached is a new set of patches. The only meaningful change
> is in the 2nd patch, which I modified per your suggestion. Also, I
> moved the logic to decide each file's fate into a new subroutine
> called decide_file_action().

That's looks good.

0005:

+ /*
+ * We don't intend do any updates. Put the connection in read-only mode
+ * to keep us honest.
+ */
run_simple_command(conn, "SET default_transaction_read_only = off");

The comment is wrong since the time it was added by 0004 but that's
not a problem since it was to be fixed by 0005. However, we need the
variable turned on in order to be really honest:p

> /*
> * Also check that full_page_writes is enabled. We can get torn pages if
> * a page is modified while we read it with pg_read_binary_file(), and we
> * rely on full page images to fix them.
> */
> str = run_simple_query(conn, "SHOW full_page_writes");
> if (strcmp(str, "on") != 0)
> pg_fatal("full_page_writes must be enabled in the source server");
> pg_free(str);

This is a part not changed by this patch set. But If we allow to
connect to a standby, this check can be tricked by setting off on the
primary and "on" on the standby (FWIW, though). Some protection
measure might be necessary. (Maybe standby should be restricted to
have the same value with the primary.)

+ thislen = Min(len, CHUNK_SIZE - prev->length);
+ src->request_queue[src->num_requests - 1].length += thislen;

prev == &src->request_queue[src->num_requests - 1] here.

+ if (chunksize > rq->length)
+ {
+ pg_fatal("received more than requested for file \"%s\"",
+ rq->path);
+ /* receiving less is OK, though */

Don't we need to truncate the target file, though?

+ * Source is a local data directory. It should've shut down cleanly,
+ * and we must to the latest shutdown checkpoint.

"and we must to the" => "and we must replay to the" ?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v2_5-0001-pg_rewind-Move-syncTargetDirectory-to-file_ops.c.patch text/x-patch 4.4 KB
v2_5-0002-Refactor-pg_rewind-for-more-clear-decision-making.patch text/x-patch 29.5 KB
v2_5-0003-pg_rewind-Replace-the-hybrid-list-array-data-stru.patch text/x-patch 21.2 KB
v2_5-0004-pg_rewind-Refactor-the-abstraction-to-fetch-from-.patch text/x-patch 41.9 KB
v2_5-0005-Allow-pg_rewind-to-use-a-standby-server-as-the-so.patch text/x-patch 17.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2020-09-18 07:55:30 Re: pgbench calculates summary numbers a wrong way.
Previous Message Alexander Kukushkin 2020-09-18 07:17:23 Re: Concurrency issue in pg_rewind