Re: Horizontal scalability/sharding

From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "'Etsuro Fujita *EXTERN*'" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Mason S <masonlists(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-02 08:58:02
Message-ID: A737B7A37273E048B164557ADEF4A58B50F9F5A4@ntex2010i.host.magwien.gv.at
Lists: pgsql-hackers

Etsuro Fujita wrote:
> On 2015/09/02 16:40, Amit Langote wrote:
>> On 2015-09-02 PM 04:07, Albe Laurenz wrote:
>>> Amit Langote wrote:
>>>> On 2015-09-02 PM 03:25, Amit Kapila wrote:
>>>>> Will it handle deadlocks across different table partitions? Consider
>>>>> a case as below:
>>>>>
>>>>> T1
>>>>> 1. Updates row R1 of T1 on shard S1
>>>>> 2. Updates row R2 of T2 on shard S2
>>>>>
>>>>> T2
>>>>> 1. Updates row R2 of T2 on shard S2
>>>>> 2. Updates row R1 of T1 on shard S1
>>>
>>>> As long as shards are processed in the same order in different
>>>> transactions, ISTM, this issue should not arise? I can imagine it becoming
>>>> a concern if parallel shard processing enters the scene. Am I missing
>>>> something?
>>>
>>> That would only hold for a single query, right?
>>>
>>> If 1. and 2. in the above example come from different queries within one
>>> transaction, you cannot guarantee that shards are processed in the same order.
>>>
>>> So T1 and T2 could deadlock.
>
>> Sorry, I failed to see why that would be the case. Could you elaborate?
>
> I think Laurenz would assume that the updates 1. and 2. in the above
> transactions are performed *in a non-inherited manner*. If that's
> right, T1 and T2 could deadlock, but I think we assume here to run
> transactions over shards *in an inherited manner*.

Yes, but does every update affect all shards?

If I say "UPDATE t1 SET col = 1 WHERE id = 42" and the row with id 42
happens to be on shard S1, the update would only affect that shard, right?

Now if "UPDATE t2 SET col = 1 WHERE id = 42" would only take place on
shard S2, and two transactions issue both updates in opposite order,
one transaction would be waiting for a lock on shard S1, while the other
would be waiting for a lock on shard S2, right?
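
To make the scenario concrete, here is a toy sketch in Python. It is not how PostgreSQL or any FDW handles locks; the Shard class, the transaction names "A" and "B", and the assumption that each shard only knows about its own lock waits are all made up for illustration. Under those assumptions, neither shard sees a cycle in its local wait-for graph, while the merged graph contains one.

```python
from collections import defaultdict


class Shard:
    """Hypothetical shard that tracks row locks and local lock waits only."""

    def __init__(self, name):
        self.name = name
        self.owner = {}                # row -> transaction holding the row lock
        self.waits = defaultdict(set)  # waiting transaction -> lock holders

    def update(self, txn, row):
        """Try to lock `row` for `txn`; True if granted, False if txn must wait."""
        holder = self.owner.setdefault(row, txn)
        if holder == txn:
            return True
        self.waits[txn].add(holder)
        return False


def has_cycle(edges):
    """Depth-first search for a cycle in a wait-for graph {waiter: {holders}}."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in edges.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(edges))


def merge(*shards):
    """Union of the per-shard wait-for graphs (the view no single shard has)."""
    merged = defaultdict(set)
    for shard in shards:
        for waiter, holders in shard.waits.items():
            merged[waiter] |= holders
    return merged


def run_scenario():
    s1, s2 = Shard("S1"), Shard("S2")
    granted = [
        s1.update("A", ("t1", 42)),  # A: UPDATE t1 ... WHERE id = 42, row on S1
        s2.update("B", ("t2", 42)),  # B: UPDATE t2 ... WHERE id = 42, row on S2
        s2.update("A", ("t2", 42)),  # A: second update, must wait for B on S2
        s1.update("B", ("t1", 42)),  # B: second update, must wait for A on S1
    ]
    return granted, has_cycle(s1.waits), has_cycle(s2.waits), has_cycle(merge(s1, s2))
```

Calling run_scenario() replays the two transactions step by step and reports which lock requests were granted, plus whether S1 alone, S2 alone, and the merged graph contain a cycle.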

But maybe I'm missing something fundamental.

Yours,
Laurenz Albe
