Re: pg_rewind in contrib

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Satoshi Nagayasu <snaga(at)uptime(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Michael Paquier <mpaquier(at)vmware(dot)com>
Subject: Re: pg_rewind in contrib
Date: 2015-01-14 09:43:00
Message-ID: 54B63A24.7010708@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/07/2015 06:19 PM, Alvaro Herrera wrote:

Fixed most of the issues you listed. More on a few remaining ones below.

> Please don't name files with generic names such as util.c/h. It's
> annoying later; for instance when I want to open analyze.c I have to
> choose the one in parser or commands. (pg_upgrade in particular is
> already a mess; let's not copy that.)

There was only one function in util.c, with only one caller, so I just
moved it to the calling file and removed util.c and util.h altogether.

There are few other files with fairly generic names: fetch.c, timeline.c
and (a new file I just added) logging.c. I agree it's annoying when
there are multiple files with same name, although I don't think it's
reasonable to totally avoid it either. We could call e.g. logging.c
pg_rewind_logging.c instead, but I'm not convinced that that is better.

> Is the final destiny of this still contrib? There are several votes for
> putting it in src/bin/ right from the start, which sounds reasonable to
> me as well.

Seems that the majority opinion is to put this straight into src/bin.
Works for me, moved.

> If we do that, we will need more attention to translatability markers of
> user-visible error messages; if you want to continue using fprintf()
> then you will need _() around the literals, but it might be wise to use
> another function instead so that you can avoid them (say, have something
> like pg_upgrade's pg_fatal() which you can then list in nls.mk).
> ...
> Uh, I notice that you have _() in some places such as slurpFile().

I copy-pasted pg_fatal and pg_log from pg_upgrade, replaced all the
fprintf(stderr) calls with them, and created an nls.mk file. It's now
ready for translation.

I also removed the --verbose option and replaced it with --progress,
which shows a progress indicator similar to pg_basebackup's, and
--debug, which spews out the list of actions on each file and other
debugging output.

> +void
> +close_target_file(void)
> +{
> + if (close(dstfd) != 0)
> + {
> + fprintf(stderr, "error closing destination file \"%s\": %s\n",
> + dstpath, strerror(errno));
> + exit(1);
> + }
> +
> + dstfd = -1;
> + /* fsync? */
> +}
>
> Did you find an answer for the above question?

No. The question is, should pg_rewind fsync() every file that it
modifies? It would be a reasonable thing to do, to make sure that if you
crash immediately after running pg_rewind, the cluster is in a
consistent state. However, pg_basebackup for example doesn't do that.
initdb does, but that was added fairly recently.

I'm not thrilled about sprinkling fsync() calls everywhere that we touch
files. But I guess that would be the right thing to do. I'm planning to
do that as an add-on patch later, fixing also pg_basebackup and any
other utilities that need it.

> +/*
> + * Does it look like a relation data file?
> + */
> +static bool
> +isRelDataFile(const char *path)
>
> This doesn't consider forks ... why not? If correct, probably it deserves a comment.

Added a comment. (Non-main forks are always copied in toto, so they are
not considered as relation data files for pg_rewind's purposes.)

> +struct filemap_t
> +{
> + /*
> + * New entries are accumulated to a linked list, in process_remote_file
> + * and process_local_file.
> + */
> + file_entry_t *first;
> + file_entry_t *last;
> + int nlist;
>
> Uh, this is like the seventh open-coded list implementation in frontend
> code. Can't we have this using ilist for a change?

ilist is backend code. I'm not eager to move it to src/common. A linked
list is a trivial data structure, I don't think it's too bad to
re-invent that wheel.

In this case, it might make sense to get rid of the linked list
altogether. pg_rewind currently accumulates new entries in a list, and
turns that into a sorted array after all remote files have been
accumulated. But it could be rewritten to use an array to begin with.

> This FRONTEND game is pretty nasty -- why do you need it? Can we fix
> the headers to avoid needing it?
>
> +/*-------------------------------------------------------------------------
> + *
> + * parsexlog.c
> + * Functions for reading Write-Ahead-Log
> + *
> + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
> + * Portions Copyright (c) 1996-2008, Nippon Telegraph and Telephone Corporation
> + * Portions Copyright (c) 1994, Regents of the University of California
> + *
> + *-------------------------------------------------------------------------
> + */
> +
> +#define FRONTEND 1
> +#include "c.h"
> +#undef FRONTEND
> +#include "postgres.h"
> +
> +#include "pg_rewind.h"
> +#include "filemap.h"
> +
> +#include <unistd.h>
> +
> +#include "access/rmgr.h"
> +#include "access/xlog_internal.h"
> +#include "access/xlogreader.h"
> +#include "catalog/pg_control.h"
> +#include "catalog/storage_xlog.h"
> +#include "commands/dbcommands.h"
>
> (I think the answer is in dbcommands.h; we could split it to a new
> dbcommands_xlog.h)

Ah, nice, that was the only thing left that needed the FRONTEND hack. Done.

Here is a new version. There are now a fair amount of code changes
compared to the version on github, like the logging and progress
information, and a lot of comment changes. I also improved the
documentation.

This patch is also available at my git repository at
git://git.postgresql.org/git/users/heikki/postgres.git, branch
"pg_rewind". The commit history of that isn't very clean, so don't pay
too much attention to that.

- Heikki

Attachment Content-Type Size
pg_rewind-bin-5.patch.gz application/gzip 31.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-01-14 09:53:40 Re: pg_rewind in contrib
Previous Message Dean Rasheed 2015-01-14 08:43:25 Re: INSERT ... ON CONFLICT UPDATE and RLS