Re: Data split -- Creating a copy of database without outage

From: Craig James <cjames(at)emolecules(dot)com>
To: Igor Shmain <igor(dot)shmain(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Data split -- Creating a copy of database without outage
Date: 2012-05-30 15:38:13
Message-ID: CAFwQ8rfbT_nEemJUirAonYdrc74nJhJbX0dND0KWC+U-ggS1sA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Wed, May 30, 2012 at 8:14 AM, Igor Shmain <igor(dot)shmain(at)gmail(dot)com> wrote:

> Thank you, Liu, for your suggestion. ****
>
> ** **
>
> I might be missing something (I am new to postgres), but it seems that
> your suggestion will not help much in my case. Since the number of db
> requests will grow with time (more users will come into the system), the
> current server will become incapable of serving all the requests quickly
> enough.****
>
> ** **
>
> The idea is to increase overall calculation capacity of the system by
> adding more servers to it. Throwing more hardware to a single main server
> is not an option in my case. Creating multiple replicas (slaves) is also
> not an good option – it would be way more efficient to have a group of db
> servers, each serving only some subset of users and hosting data for those
> users only. Buying new servers in advance is not an option too.****
>
> ** **
>
> What I am looking for is switching some of the users to another db server
> when the capacity of the existing server(s) is not enough. The point is to
> do it without interrupting the users’ work (so they do not see that
> horrible “Sorry! This site is under maintenance…”).
>

Londiste is very good at replicating a database. It's normally used for
real-time backup or to load-balance read-only applications, but it can also
be used for the situation you describe. At some point when you decide it's
necessary to split your database, you would create a new database on a
second server, create an empty duplicate schema (using pg_dump's
schema-only feature) then install Londiste. Londiste would replicate your
database in real time up to the point where you were ready to make the
actual switch.

To switch, you'd simply stop Londiste and simultaneously reconfigure your
load-balancing system so that a subset of your users were directed to the
new database. Then you could uninstall Londiste, and clean out each
database by removing the user data that is for users on the other server.

You might also consider partitioning. If you know in advance that you're
going to be discarding large subsets of the data, it will be much more
efficient if you partition it at the outset. When you split your system
in two, the cleanup process will be nothing more than dropping partitions.
You won't be plagued by bloated indexes and files.

Craig

> ****
>
> ** **
>
> If I missed something it would be very kind of you to point this out.****
>
> ** **
>
> Thank you once again,****
>
> -igorS****
>
> ** **
>
> ** **
>
> *From:* Haifeng Liu [mailto:liuhaifeng(at)live(dot)com]
> *Sent:* May-29-12 9:13 PM
> *To:* Igor Shmain
> *Cc:* pgsql-admin(at)postgresql(dot)org
> *Subject:* Re: [ADMIN] Data split -- Creating a copy of database without
> outage****
>
> ** **
>
> Why not use a partitioned table? You can write a trigger to control which
> partition the coming data should be inserted.****
>
> ** **
>
> Regards,****
>
> ----****
>
> Liu Haifeng****
>
> Home: http://liuhaifeng.com****
>
> ** **
>
>
>
> ****
>
> ** **
>
> On May 30, 2012, at 4:13 AM, Igor Shmain wrote:****
>
>
>
> ****
>
> Can you please help with advice?****
>
> I need to design a solution for a database which will grow and will
> require a horizontal split at some moment.****
>
> Here is how I am planning to do it: Every record in every table has a
> shard number. After the database becomes too large (gets too many
> requests), the tables need to be horizontally split. It means that for
> every table all records with some specific shard numbers need to be moved
> to a new database.****
>
> My plan is to (1) create a copy of the database on a new server
> (backup/restore?), (2) synchronize the databases (how?), and then (3)
> forward all requests for the moved shards to the new database. (I will get
> rid of the unneeded shards later). An important requirement: there should
> be no outage for the users.****
>
> What are the ways to achieve it?****
>
> ****
>
> Thank you in advance,
> -igorS****
>
> ****
>
> ** **
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Amador Alvarez 2012-05-30 17:12:17 Re: Data split -- Creating a copy of database without outage
Previous Message Igor Shmain 2012-05-30 15:14:02 Re: Data split -- Creating a copy of database without outage