Re: Better support of exported snapshots with pg_dump

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Better support of exported snapshots with pg_dump
Date: 2014-10-14 22:09:10
Message-ID: CAB7nPqRJfThS902YsjJnL07u9a0GhbYJ5X+KkXuUsTrU_No7xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 14, 2014 at 11:55 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> On 22/09/14 02:24, Michael Paquier wrote:
>>
>> On Thu, Sep 4, 2014 at 11:33 PM, Michael Paquier
>>
>> Taking a dump consistent with a replication slot is useful for online
>> upgrade cases first, because you can simply run pg_dump, have a slot
>> created, and get as well a state of the database consistent with the
>> slot creation before replaying changes in a way or another. Using
>> that, a decoder that generates raw queries, and a receiver able to
>> apply changes on a remote Postgres server, it is possible to get a
>> kind of live migration solution from a Postgres instance to another
>> for a single database, as long as the origin server uses 9.4. Making
>> the receiver report write and flush positions makes also possible the
>> origin server to use synchronous replication protocol to be sure that
>> changes got applied on remote before performing a switch from the
>> origin to the remote (that may actually explain why multi-syncrep
>> would be useful here for multiple databases). Also, I imagine that
>> users could even use this tool in pg_dump for example to do some post
>> processing on the data dumped in accordance to the decoder plugin
>> before applying changes to a remote source.
>>
>> Now, this is done with the addition of two options in pg_dump to
>> control the logical slot creation:
>> - --slot to define the name of the slot being created
>> - --plugin-name, to define the name of the decoder plugin
>> And then you can of course do things like that:
>> # Raw data dump on a slot
>> $ pg_dump --slot bar --plugin-name test_decoding
>> # Existing parallel dump not changed:
>> $ pg_dump -j 4 -f data -F d
>> # Parallel dump on a slot
>> $ pg_dump -j 4 --slot bar --plugin-name test_decoding -f data -F d
>>
>
> Wouldn't it be better to have the slot handling done outside of pg_dump by
> whatever replication solution you use and just have pg_dump accept the
> snapshot as input parameter? I am not sure how much I like pg_dump creating
> the slot. I am aware that you need to have the replication connection open
> but that's IMHO just matter of scripting it together.
The whole point of the operation is to reduce the amount of time the
external snapshot is taken to reduce the risk of race conditions that
may be induced by schema changes. See for example discussions related
to why we do not authorize specifying a snapshot name as an option of
pg_dump.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-10-14 22:09:28 Re: replicating DROP commands across servers
Previous Message Robert Haas 2014-10-14 21:53:10 Re: WIP: dynahash replacement for buffer table