Re: Horizontal scalability/sharding

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-03 13:40:40
Message-ID: 55E84DD8.6000703@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 09/03/2015 05:02 AM, Amit Kapila wrote:
> On Thu, Sep 3, 2015 at 8:28 AM, Bruce Momjian <bruce(at)momjian(dot)us
> <mailto:bruce(at)momjian(dot)us>> wrote:
> >
> > On Wed, Sep 2, 2015 at 07:50:25PM -0700, Joshua Drake wrote:
> > > >Can you explain why logical replication is better than binary
> > > >replication for this use-case?
> > > >
> > >
> > > Selectivity?
> >
> > I was assuming you would just create identical slaves to handle
> > failure, rather than moving selected data around.
> >
>
> Yes, I also think so, otherwise when a shard goes down and its
> replica has to take the place of the shard, it will take more time
> to make the replica available, as it won't have all the data the
> original shard had.

Not really. The idea is that you don't need to recreate the replica
immediately. The system recognizes that the primary shard location is
unavailable and redirects the tasks to the "replicas," so the time to
recreate the failed node is not that critical.

It needs to be done in a smart way to prevent some typical issues, like
suddenly doubling the load on a replica due to the failure of the
primary location. By using a different group of nodes for each "data
segment" you can mitigate this, because the set of nodes picking up the
additional load will be larger.

The other issue, of course, is that the groups of nodes must not be
chosen entirely at random, otherwise the cluster would suffer data loss
whenever an arbitrary group of K nodes fails (where K is the number of
replicas of each piece of data).

It's also non-trivial to do this when you have to consider racks, data
centers, etc.
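A small sketch of the constrained version (again purely illustrative;
the 3-rack layout and the fixed-group scheme are assumptions, not
anything proposed in this thread). Instead of letting every segment
pick nodes freely, segments are assigned to a small fixed set of
replica groups, each with one node per rack, so only a handful of
simultaneous K-node outages can destroy all copies of anything:

```python
import itertools

# Hypothetical layout: 3 racks with 3 nodes each.
racks = {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7, 8]}
REPLICAS = 3

# Build a small fixed set of replica groups, one node per rack.
# Data is lost only if an entire group fails at once, so fewer
# distinct groups means fewer fatal failure combinations.
groups = [[racks[r][i] for r in sorted(racks)] for i in range(3)]

# Segments map onto the fixed groups round-robin.
def group_for(segment):
    return groups[segment % len(groups)]

# Count which of the C(9,3) = 84 possible 3-node outages would
# wipe out all replicas of some segment.
group_sets = [set(g) for g in groups]
fatal = sum(1 for combo in itertools.combinations(range(9), REPLICAS)
            if set(combo) in group_sets)
print(fatal, "fatal combinations out of 84")  # prints: 3 fatal combinations out of 84
```

With fully random placement, almost any 3-node outage would have a
chance of hitting all replicas of some segment; here only 3 of the 84
combinations are fatal, and no single rack failure loses data.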

With regular slaves you can't do any of this - no matter what you do,
the additional load can only be balanced across the failed node's
slaves.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
