Re: WIP patch for parallel pg_dump

From: Koichi Suzuki <koichi(dot)szk(at)gmail(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Joachim Wieland <joe(at)mcknight(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for parallel pg_dump
Date: 2010-12-07 08:23:19
Message-ID: AANLkTi=8Luv--1E3kHL0tp1NHgGQAuHEHWf7vSHTgC=7@mail.gmail.com
Lists: pgsql-hackers

This is what Postgres-XC does between a coordinator and a
datanode. The coordinator may correspond to poolers/loadbalancers.
Does anyone think it makes sense to extract the XC implementation of
snapshot shipping into PostgreSQL itself?

Cheers;
----------
Koichi Suzuki

2010/12/7 Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>:
> On 12/07/2010 01:22 AM, Tom Lane wrote:
>> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>>>> However, if you were doing something like parallel pg_dump you could
>>>> just run the parent and child instances all against the slave, so the
>>>> pg_dump scenario doesn't seem to offer much of a supporting use-case for
>>>> worrying about this.  When would you really need to be able to do it?
>>
>>> If you had several standbys, you could distribute the work of the
>>> pg_dump among them.  This would be a huge speedup for a large database,
>>> potentially, thanks to parallelization of I/O and network.  Imagine
>>> doing a pg_dump of a 300GB database in 10min.
>>
>> That does sound kind of attractive.  But to do that I think we'd have to
>> go with the pass-the-snapshot-through-the-client approach.  Shipping
>> internal snapshot files through the WAL stream doesn't seem attractive
>> to me.
>
> this kind of functionality would also be very useful/interesting for
> connection poolers/loadbalancers that are trying to distribute load
> across multiple hosts and could use that to at least give some sort of
> consistency guarantee.
>
>
>
> Stefan
>
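
For illustration, the "pass-the-snapshot-through-the-client" idea discussed above can be sketched with the snapshot export feature that PostgreSQL gained later (pg_export_snapshot() and SET TRANSACTION SNAPSHOT, available from 9.2). This is only a sketch, not what the patch under discussion did: the helper names below are hypothetical, the code only builds the SQL a client would send rather than connecting to a server, and an exported snapshot can only be imported on the same server, so it does not by itself cover the multi-standby case raised in the thread.

```python
def export_statements():
    """SQL the parent session would run to publish its snapshot.

    The parent must keep this transaction open for as long as
    workers still need to import the snapshot.
    """
    return [
        "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;",
        "SELECT pg_export_snapshot();",  # returns a text identifier
    ]


def import_statements(snapshot_id):
    """SQL a worker session would run to adopt the parent's snapshot.

    SET TRANSACTION SNAPSHOT takes a string literal, so the identifier
    is embedded directly; single quotes are doubled as a precaution.
    """
    quoted = snapshot_id.replace("'", "''")
    return [
        "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;",
        f"SET TRANSACTION SNAPSHOT '{quoted}';",
    ]


if __name__ == "__main__":
    # The identifier here is a made-up example value; in practice the
    # client passes along whatever pg_export_snapshot() returned.
    for stmt in export_statements() + import_statements("00000003-0000001B-1"):
        print(stmt)
```

The client (here, the parent pg_dump process) is the only channel carrying the identifier between sessions, which is the property Tom Lane's suggestion relies on; nothing is shipped through the WAL stream.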
