Re: Logical replication and multimaster

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical replication and multimaster
Date: 2015-12-11 10:12:55
Message-ID: CAMsr+YFf4hA=hLTe2hvPZ95-Fxf3bT80-_8CWwrDpDzva3LTOw@mail.gmail.com
Lists: pgsql-hackers

On 10 December 2015 at 03:19, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Sun, Dec 6, 2015 at 10:24 PM, Craig Ringer <craig(at)2ndquadrant(dot)com>
> wrote:
>

> > * A way to securely make a libpq connection from a bgworker without
> > messing with passwords etc. Generate one-time cookies, something like
> > that.
>
> Why would you have the bgworker connect to the database via TCP
> instead of just doing whatever it wants to do directly?

pg_dump and pg_restore, mainly, for copying the initial database state.

PostgreSQL doesn't have SQL-level function equivalents of those tools, nor
pg_get_tabledef() etc, and there's been strong opposition to adding
anything of the sort when it's been raised before. We could read a dump in
via pg_restore's text conversion and run the appropriate queries over the
SPI, doing the query splitting, COPY parsing and loading, etc ourselves in
a bgworker. It'd be ugly and duplicate a lot, but it'd work. However, it
wouldn't be possible to do restores in parallel that way, and that's
necessary to get good restore performance on big DBs. For that we'd also
basically have to rewrite pg_restore's parallel functionality using a
bgworker pool.
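
To give a feel for the splitting work that implies, here is a deliberately
naive sketch in Python (not bgworker C code, and not how pg_restore does it)
of separating a plain-text dump script into ordinary statements and COPY
data blocks. The function name split_dump is hypothetical. The sketch
assumes every statement ends with ';' at the end of a line and that COPY
data ends at a line containing only a backslash and a period; it ignores
dollar-quoted function bodies and the other cases a real implementation
would have to handle.

```python
def split_dump(script):
    """Split a plain-text dump into ("sql", text) and ("copy", stmt, rows) items.

    Naive on purpose: a statement is assumed to end at a line ending in ';',
    and COPY data is assumed to end at a line holding only backslash-period.
    """
    items = []
    pending = []                      # lines of the statement being built
    lines = iter(script.splitlines())
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and SQL comments that sit between statements.
        if not pending and (not stripped or stripped.startswith("--")):
            continue
        pending.append(line)
        if not stripped.endswith(";"):
            continue                  # statement continues on the next line
        statement = "\n".join(pending)
        pending = []
        first_word = statement.lstrip().split(None, 1)[0].upper()
        if first_word == "COPY" and "FROM STDIN" in statement.upper():
            # Consume the tab-separated data rows up to the terminator line.
            rows = []
            for data_line in lines:
                if data_line == "\\.":
                    break
                rows.append(data_line.split("\t"))
            items.append(("copy", statement, rows))
        else:
            items.append(("sql", statement))
    return items
```

Each "sql" item would then be run as a query, and each "copy" item loaded
row by row, which is the part that duplicates so much existing code.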

The alternative is a massive rewrite of pg_dump and pg_restore to allow
them to be used as libraries, and let them use either libpq or the SPI for
queries, presumably via some kind of abstraction layer, plus further
abstraction for pipelining parallel work. Not very practical, and IIRC
whenever library-ifing pg_dump and pg_restore has been discussed before
it's been pretty firmly rejected.

Also, parallelism at apply time. There are two ways to do apply work in
parallel - a pool of bgworkers that each use the SPI, or using regular
backends managing async libpq connections. At this point I think
Konstantin's approach, with a bgworker pool that processes a work queue, is
probably better for this, and I want to explore making that a re-usable
extension for 9.5 and possibly a core part of 9.6 or 9.7.
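
As a rough illustration of the "pool that processes a work queue" shape,
here is a minimal Python sketch. It is only an analogy: threads stand in for
bgworkers, a Python queue stands in for whatever shared queue the real thing
would use, and run_pool and apply_item are hypothetical names. It is not
Konstantin's implementation and uses no PostgreSQL API.

```python
import queue
import threading


def run_pool(work_items, apply_item, n_workers=4):
    """Apply every item using a fixed pool of workers fed from one queue.

    Items must not be None, because None is used as the shutdown sentinel.
    Returns a list of (item, outcome) pairs in completion order.
    """
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = work.get()
            if item is None:          # sentinel: no more work for this worker
                return
            try:
                outcome = apply_item(item)
            except Exception as exc:  # record the failure, keep the worker alive
                outcome = exc
            with results_lock:
                results.append((item, outcome))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in work_items:
        work.put(item)
    for _ in threads:
        work.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Note that results come back in completion order, not submission order; a
real apply pool would presumably need some way to respect ordering between
dependent changes, which this sketch does not attempt.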

So it's mainly for pg_restore.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
