Re: [PATCH 16/16] current version of the design document

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH 16/16] current version of the design document
Date: 2012-06-13 16:03:14
Message-ID: 201206131803.15123.andres@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On Wednesday, June 13, 2012 05:39:36 PM Merlin Moncure wrote:
> On Wed, Jun 13, 2012 at 9:40 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> Let's take the case where I have N small-ish, schema-identical database
> >> shards that I want to aggregate into a single warehouse -- something
> >> that HS/SR currently can't do.
> >> There's a lot of ways to do that obviously but assuming the warehouse
> >> would have to have a unique schema, could it be done in your
> >> architecture?
> >
> > Not sure what you mean by the warehouse having a unique schema? It has
> > the same schema as the OLTP counterparts? That would obviously be the
> > easy case if you take care and guarantee uniqueness of keys upfront.
> > That basically would be trivial ;)
>
> by unique I meant 'not the same as the shards' -- presumably this
> would mean one of
> a) each shard's data would be in a private schema folder
> or
> b) you'd have one set of tables but decorated with an extra shard
> identifying column that would have to be present in all keys to get around
> uniqueness issues
I think it would have to mean a) and that you have N of those logical import
processes hanging around. We really need an identical TupleDesc to do the
decoding.
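
To make option a) a bit more concrete, here is a rough sketch, not code from the
patch. It assumes one import process per shard, each writing into its own
private schema, and refusing to decode when the column layout differs. The
names Column, descriptors_match and target_table, and the "shard_<n>" schema
naming, are made up for illustration; a real TupleDesc carries more than names
and types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Toy stand-in for one attribute of a tuple descriptor (hypothetical)."""
    name: str
    type_name: str


def descriptors_match(source, target):
    """Return True when both column lists agree in order, name and type."""
    return tuple(source) == tuple(target)


def target_table(shard_id, table):
    """Map a shard-local table to its private warehouse schema (option a)."""
    return f"shard_{shard_id}.{table}"


# Example layout shared by a shard and its private warehouse copy.
orders = [Column("id", "int4"), Column("total", "numeric")]
```

With that, the import process for shard 3 would write rows of "orders" to
target_table(3, "orders"), and only after descriptors_match() succeeds for the
two sides.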

> > It gets a bit more complex if you need to transform the data for the
> > warehouse. I don't plan to put in work to make that possible without some
> > C coding (filling out the callbacks and doing the work in there). It
> > shouldn't need much though.
> >
> > Does that answer your question?
> yes. Do you envision it would be possible to wrap the ApplyCache
> callbacks in a library that could be exposed as an extension? For
> example, a library that would stick the replication data into a queue
> that a userland (non C) process could walk, transform, etc? I know
> that's vague -- my general thrust here is that I find the
> transformation features particularly interesting and I'm wondering how
> much C coding would be needed to access them in the long term.
I can definitely imagine the callbacks calling some wrapper around a
higher-level language. Not sure how that fits into an extension (if you mean it as in
CREATE EXTENSION) though. I don't think you will be able to start the
replication process from inside a normal backend. I imagine something like
specifying a shared object + parameters in the config or such.
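
Purely as an illustration of the queue idea, a minimal sketch of what the
higher-level side could look like. It assumes the C callbacks forward
begin/change/commit events to a wrapper; the ChangeQueue class, its method
names and the dict-shaped change are all hypothetical and not part of the
patch.

```python
from collections import deque


class ChangeQueue:
    """Hypothetical in-memory queue that the C callbacks would feed."""

    def __init__(self):
        self._pending = deque()

    # Assumed entry points, one per callback the C side would invoke.
    def on_begin(self, xid):
        self._pending.append(("begin", xid, None))

    def on_change(self, xid, change):
        self._pending.append(("change", xid, change))

    def on_commit(self, xid):
        self._pending.append(("commit", xid, None))

    def drain(self, transform):
        """Walk the queue in order and return the transformed row changes."""
        out = []
        while self._pending:
            kind, _xid, payload = self._pending.popleft()
            if kind == "change":
                out.append(transform(payload))
        return out
```

A userland consumer could then do something like
queue.drain(lambda c: {**c, "shard": 2}) to decorate each change with a shard
identifier before loading it into the warehouse.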

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
