Re: pg_rewind failure by file deletion in source server

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_rewind failure by file deletion in source server
Date: 2015-06-23 14:21:12
Message-ID: 55896B58.1000906@iki.fi
Lists: pgsql-hackers

On 06/23/2015 05:03 PM, Fujii Masao wrote:
> On Tue, Jun 23, 2015 at 9:19 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> On 06/23/2015 07:51 AM, Michael Paquier wrote:
>>>
>>> So... Attached are a set of patches dedicated at fixing this issue:
>>
>>
>> Thanks for working on this!
>>
>>> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if
>>> path does not exist
>>> - 0002, same with pg_stat_file, returning NULL if file does not exist
>>> - 0003, same with pg_read_*file. I added them to all the existing
>>> functions for consistency.
>>> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
>>> (thanks Robert for the naming!)
>>> - 0005, as things get complex, a set of regression tests aimed at
>>> covering those things. pg_tablespace_location is platform-dependent,
>>> so there are no tests for it.
>>> - 0006, the fix for pg_rewind, using what has been implemented before.
>>
>>
>> With these patches, pg_read_file() will return NULL on any failure to
>> open the file, which makes pg_rewind assume that the file doesn't
>> exist in the source server, so it will remove the file from the
>> destination. That's dangerous; those functions should check
>> specifically for ENOENT.
>
> I'm wondering whether using pg_read_file() to copy the file from the
> source server is reasonable. ISTM that it has two problems, as follows.
>
> 1. It cannot read a very large file, like a 1GB file. So if such a
> large file was created in the source server after failover, pg_rewind
> would not be able to copy it. No?

pg_read_binary_file() handles large files just fine. It cannot return
more than 1GB in one call, but you can call it several times and
retrieve the file in chunks. That's what pg_rewind does, except for
reading the control file, which is known to be small.
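
Roughly like this, over a normal libpq connection (a simplified sketch,
not the actual pg_rewind code; the function name, 1 MB chunk size, and
lack of proper escaping are just for illustration):

#include <stdio.h>
#include <libpq-fe.h>

static void
fetch_file_in_chunks(PGconn *conn, const char *path)
{
	const int	chunksize = 1024 * 1024;	/* 1 MB per round trip */
	long long	offset = 0;

	for (;;)
	{
		char		query[1024];
		PGresult   *res;
		int			len;

		/* NOTE: a real implementation must escape "path" properly */
		snprintf(query, sizeof(query),
				 "SELECT pg_read_binary_file('%s', %lld, %d)",
				 path, offset, chunksize);

		/* ask for a binary result so the bytea comes back as raw bytes */
		res = PQexecParams(conn, query, 0, NULL, NULL, NULL, NULL, 1);
		if (PQresultStatus(res) != PGRES_TUPLES_OK)
		{
			fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
			PQclear(res);
			return;
		}

		len = PQgetlength(res, 0, 0);
		/* ... append PQgetvalue(res, 0, 0), len bytes, to the local copy ... */
		PQclear(res);

		if (len < chunksize)
			break;				/* short read: reached end of file */
		offset += chunksize;
	}
}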

> 2. Many users may not allow a remote client to connect to the
> PostgreSQL server as a superuser, for security reasons. IOW, there
> would be no entry in pg_hba.conf for such a connection. In this case,
> pg_rewind always fails because pg_read_file() needs superuser
> privilege. No?
>
> I'm tempted to implement a replication-command version of
> pg_read_file(). That is, it would read and send the data like the
> BASE_BACKUP replication command does...

Yeah, that would definitely be nice. Peter suggested it back in January
(http://www.postgresql.org/message-id/54AC4801.7050300@gmx.net). I think
it's way too late to do that for 9.5, however. I'm particularly worried
that if we design the required API in a rush, we're not going to get it
right, and will have to change it again soon. That might be difficult in
a minor release. Using pg_read_file() and friends is quite flexible,
even though we just found out that they're not quite flexible enough
right now (the ENOENT problem).
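
To be concrete about what "check specifically for ENOENT" means: the
open() path in those functions should do something along these lines (a
sketch of a fragment only, not the actual patch; I'm using
"if_not_exists" for whatever the flag ends up being called):

	fd = open(filename, O_RDONLY | PG_BINARY, 0);
	if (fd < 0)
	{
		/* Only a genuinely missing file may be reported as NULL */
		if (errno == ENOENT && if_not_exists)
			PG_RETURN_NULL();

		/*
		 * Anything else (EACCES, EIO, ...) must still raise an error,
		 * otherwise pg_rewind would wrongly conclude that the file is
		 * gone on the source and delete it from the destination.
		 */
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not open file \"%s\": %m", filename)));
	}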

- Heikki
