Re: Refactor pg_rewind code and make it work against a standby

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: hlinnaka(at)iki(dot)fi, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactor pg_rewind code and make it work against a standby
Date: 2020-09-20 20:44:28
Message-ID: CAE-ML+-uLy_DiS4VSkbsjMEY9ZBob5LqBXPc31_dQ-EsSesgEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hey Heikki,

Thanks for refactoring and making the code much easier to read!

Before getting into the code review for the patch, I wanted to know why
we don't use a Bitmapset for target_modified_pages?

Code review:

1. We need to update the comments for process_source_file and
process_target_file. We don't decide the action on the file until later.

2. Rename target_modified_pages to target_pages_to_overwrite?
target_modified_pages can lead to confusion as to whether it includes pages
that were modified on the target but not even present in the source and
the other exclusionary cases. target_pages_to_overwrite is much clearer.

3.

> /*
> * If this is a relation file, copy the modified blocks.
> *
> * This is in addition to any other changes.
> */
> iter = datapagemap_iterate(&entry->target_modified_pages);
> while (datapagemap_next(iter, &blkno))
> {
> offset = blkno * BLCKSZ;
>
> source->queue_fetch_range(source, entry->path, offset, BLCKSZ);
> }
> pg_free(iter);

Can we put this hunk into a static function overwrite_pages()?

4.

> * block that have changed in the target system. It makes note of all the
> * changed blocks in the pagemap of the file.

Can we replace the above with:

> * block that has changed in the target system. It decides if the given
blkno in the target relfile needs to be overwritten from the source.

5.

> /*
> * Doesn't exist in either server. Why does it have an entry in the
> * first place??
> */
> return FILE_ACTION_NONE;

Can we delete the above hunk and add the following assert to the very
top of decide_file_action():

Assert(entry->target_exists || entry->source_exists);

6.

> pg_fatal("unexpected page modification for directory or symbolic link \"%s\"",
> entry->path);

Can we replace above with:

pg_fatal("unexpected page modification for non-regular file \"%s\"",
entry->path);

This way we can address the undefined file type.

7. Please address the FIXME for the symlink case:
/* FIXME: Check if it points to the same target? */

8.

* it anyway. But there's no harm in copying it now.)

and

* copy them here. But we don't know which scenario we're
* dealing with, and there's no harm in copying the missing
* blocks now, so do it now.

Could you add a line or two explaining why there is "no harm" in these
two hunks above?

9. Can we add pg_control, /pgsql_tmp/... and .../pgsql_tmp.* and PG_VERSION
files to check_file_excluded()?

10.

- * block that have changed in the target system. It makes note of all the
+ * block that have changed in the target system. It makes note of all the

Whitespace typo

11.

> * If the block is beyond the EOF in the source system, or the file doesn't
> * doesn'exist

Typo: Two doesnt's

12.

> /*
> * This represents the final decisions on what to do with each file.
> * 'actions' array contains an entry for each file, sorted in the order
> * that their actions should executed.
> */
> typedef struct filemap_t
> {
> /* Summary information, filled by calculate_totals() */
> uint64 total_size; /* total size of the source cluster */
> uint64 fetch_size; /* number of bytes that needs to be copied */
>
> int nactions; /* size of 'actions' array */
> file_entry_t *actions[FLEXIBLE_ARRAY_MEMBER];
> } filemap_t;

Replace nactions/actions with nentries/entries..clearer in intent as
it is easier to reconcile the modified pages stuff to an entry rather
than an action. It could be:

/*
* This contains the final decisions on what to do with each file.
* 'entries' array contains an entry for each file, sorted in the order
* that their actions should executed.
*/
typedef struct filemap_t
{
/* Summary information, filled by calculate_totals() */
uint64 total_size; /* total size of the source cluster */
uint64 fetch_size; /* number of bytes that needs to be copied */
int nentries; /* size of 'entries' array */
file_entry_t *entries[FLEXIBLE_ARRAY_MEMBER];
} filemap_t;

13.

> filehash = filehash_create(1000, NULL);

Use a constant for the 1000 in (FILEMAP_INITIAL_SIZE):

14. queue_overwrite_range(), finish_overwrite() instead of
queue_fetch_range(), finish_fetch()? Similarly update\
*_fetch_file_range() and *_finish_fetch()

15. Let's have local_source.c and libpq_source.c instead of *_fetch.c

16.

> conn = PQconnectdb(connstr_source);
>
> if (PQstatus(conn) == CONNECTION_BAD)
> pg_fatal("could not connect to server: %s",
> PQerrorMessage(conn));
>
> if (showprogress)
> pg_log_info("connected to server");

The above hunk should be part of init_libpq_source(). Consequently,
init_libpq_source() should take a connection string instead of a conn.

17.

> if (conn)
> {
> PQfinish(conn);
> conn = NULL;
> }

The hunk above should be part of libpq_destroy()

18.

> /*
> * Files are fetched max CHUNK_SIZE bytes at a time, and with a
> * maximum of MAX_CHUNKS_PER_QUERY chunks in a single query.
> */
> #define CHUNK_SIZE (1024 * 1024)

Can we rename CHUNK_SIZE to MAX_CHUNK_SIZE and update the comment?

19.

> typedef struct
> {
> const char *path; /* path relative to data directory root */
> uint64 offset;
> uint32 length;
> } fetch_range_request;

offset should be of type off_t

20.

> * Request to fetch (part of) a file in the source system, and write it
> * the corresponding file in the target system.

Can we change the above hunk to?

> * Request to fetch (part of) a file in the source system, specified
> * by an offset and length, and write it to the same offset in the
> * corresponding target file.

21.

> * Fetche all the queued chunks and writes them to the target data directory.

Typo in word "fetch".

Regards,
Soumyadeep

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-09-20 21:10:05 Re: Yet another fast GiST build
Previous Message Tom Lane 2020-09-20 20:41:12 Re: Yet another fast GiST build