Re: Selective Replication help

From: Payal Singh <payal(at)omniti(dot)com>
To: Hailey Eckstrand <haileyeckstrand(at)gmail(dot)com>
Cc: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Selective Replication help
Date: 2015-05-29 15:01:12
Message-ID: CANUg7LD2_4W-ehM=pbMDEq86scziq7gzOM1gW7avr5nXpqUPfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

>
> Option 2) I've also been looking into a master-slave replication software
> called bucardo but it works table to table and so if I did that.. I think I
> would have to create local versions of the slave table (which would be
> storing duplicated data), update them by triggers (when new data was
> inserted into the 5 tables via web scraping scripts) and then set up
> replication on those tables to the slave databases. However, with
> replication comes a steep learning curve. I'm not sure if what I'm trying
> to do is complicated enough to warrant replication.
>

If you can create a local copy of the final table, you don't need to use
any other tool but normal postgres replication (WAL or streaming). You can
probably use your current SQL function to generate the table, just the
location of the table will be different, and you won't be using fdw (since
it will be on the same database). But yes, you won't be able to just
replicate a single table. The cluster as a whole will be replicated (this
won't be any more complicated than what you're currently doing, perhaps
even simpler).

Payal Singh,
Database Administrator,
OmniTI Computer Consulting Inc.
Phone: 240.646.0770 x 253

On Thu, May 28, 2015 at 6:13 PM, Hailey Eckstrand <haileyeckstrand(at)gmail(dot)com
> wrote:

> I have one local master database and 4 remote slave databases. Every hour
> I send, around 1000 rows from the master to each of the slaves (4 columns).
> For context, my master database is where I do web scraping. After the data
> is scraped, I send it out to the slave databases.
>
> Of the 1000 rows, there are some duplicates in the primary key of the
> slave databases so part of the push involves deleting the primary key
> overlaps and then inserting all of the data.
>
> On the master database, the data I send to the slaves is compiled from a
> view which is a subset of 5 tables. The reason I am explaining this is
> because it is not a simple master table, slave table replication. The data
> I send out is from a view of multiple tables in the master and inserted
> into one table on the slave.
>
> Currently, the way I push the data from local to remote is,
> I've created an SQL function which operates on a slave database table (via
> a foreign data wrapper) that checks if the row exists and if it does, it
> updates the values and if it doesn't exist, it inserts the new row.
>
> I find that this is very slow and I'm guessing inefficient but I'm not
> very good at testing that. The amount of rows that I'm sending out is about
> to increase to 10,000 and I'd like to look into other solutions.
>
> Option 1) In my research, I've found the function '
> dblink_build_sql_insert' that states it 'can be useful in doing selective
> replication of a local table to a remote database' which sounds like what
> I'm doing but I've not found any examples online.
>
> Option 2) I've also been looking into a master-slave replication software
> called bucardo but it works table to table and so if I did that.. I think I
> would have to create local versions of the slave table (which would be
> storing duplicated data), update them by triggers (when new data was
> inserted into the 5 tables via web scraping scripts) and then set up
> replication on those tables to the slave databases. However, with
> replication comes a steep learning curve. I'm not sure if what I'm trying
> to do is complicated enough to warrant replication.
>
> Can anyone suggest other options which would be fast and perhaps more
> efficient or let me know if 1 or 2 are a good idea?
> I know enough python and php to get by and can figure out triggers. I am
> not amazing at plpgsql but could figure it out.
>
> Thank you,
> Hailey
>

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Albe Laurenz 2015-06-01 11:00:09 Re: [NOVICE] psql readline Tab insert tab
Previous Message Hans Ginzel 2015-05-29 11:57:33 Re: psql readline Tab insert tab