From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Interval for launching the table sync worker |
Date: | 2017-04-14 10:18:27 |
Message-ID: | CAD21AoDcgDC2+K=V9R7UNAYgbVGKY17MwkTUiJ7CNNnMQ-1ECg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Apr 14, 2017 at 7:09 AM, Petr Jelinek
<petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
> On 13/04/17 12:23, Masahiko Sawada wrote:
>> On Thu, Apr 13, 2017 at 11:53 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> On Wed, Apr 12, 2017 at 11:46 PM, Peter Eisentraut
>>> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>>>> On 4/12/17 00:48, Masahiko Sawada wrote:
>>>>> On Wed, Apr 12, 2017 at 1:28 PM, Peter Eisentraut
>>>>>> Perhaps instead of a global last_start_time, we store a per relation
>>>>>> last_start_time in SubscriptionRelState?
>>>>>
>>>>> I was thinking the same. But a problem is that the list of
>>>>> SubscriptionRelState is refreshed whenever the syncing table state
>>>>> becomes invalid (table_state_valid = false). I guess we need to
>>>>> improve this logic, including GetSubscriptionNotReadyRelations().
>>>>
>>>> The table states are invalidated on a syscache callback from
>>>> pg_subscription_rel, which happens roughly speaking when a table
>>>> finishes the initial sync. So if we're worried about failing tablesync
>>>> workers relaunching too quickly, this would only be a problem if a
>>>> tablesync of another table finishes right in that restart window. That
>>>> doesn't seem a terrible issue to me.
>>>>
>>>
>>> I think the table states are invalidated whenever the table sync
>>> worker starts, because the table sync worker updates its status of
>>> pg_subscription_rel and commits it before starting actual copy. So we
>>> cannot rely on that. I thought we can store last_start_time into
>>> pg_subscription_rel but it might be overkill. I'm now thinking to
>>> change GetSubscriptionNotReadyRelations so that last_start_time in
>>> SubscriptionRelState is carried over to the new list.
>>>
>>
>> Attached is the latest patch. It wasn't actually necessary to change
>> GetSubscriptionNotReadyRelations. I just changed the logic that
>> refreshes the sync table state list.
>> Please give me feedback.
>>
>
> Hmm this might work. Although I was actually wondering if we could store
> the last start timestamp in the worker shared memory and do some magic
> with that (ie, not clearing subid and relid and try to then do rate
> limiting based on subid+relid+timestamp stored in shmem). That would
> then work the same way for the main apply workers as well. It would have
> the disadvantage that if some tables were consistently failing, no other
> tables could get synchronized, as the number of workers is limited.
Hmm, I don't think it's a good design for a table sync worker or an
apply worker to hold a worker slot exclusively until it succeeds.
The workers would get in each other's way.
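
To make the per-relation alternative discussed above concrete, here is a
minimal, standalone sketch of the idea: keep last_start_time per relation,
carry it over when the state list is rebuilt, and refuse to relaunch a
sync worker until a retry interval has passed. This is not the attached
patch. RelSyncState, carry_over_start_times(), may_launch_sync_worker(),
and the millisecond timestamps are hypothetical stand-ins for illustration
only, and the retry interval is just a parameter here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t RelId;        /* stand-in for PostgreSQL's Oid */
typedef int64_t  TimestampMs;  /* stand-in for TimestampTz, in milliseconds */

/* Hypothetical per-relation state; NOT the real SubscriptionRelState. */
typedef struct RelSyncState
{
    RelId       relid;
    TimestampMs last_start_time;   /* 0 means "never launched" */
} RelSyncState;

/*
 * After the state list has been rebuilt, copy last_start_time from the
 * old list for every relation that is still present. Relations that are
 * new to the list start with 0, so they may be launched immediately.
 */
static void
carry_over_start_times(const RelSyncState *old_list, size_t n_old,
                       RelSyncState *new_list, size_t n_new)
{
    for (size_t i = 0; i < n_new; i++)
    {
        new_list[i].last_start_time = 0;
        for (size_t j = 0; j < n_old; j++)
        {
            if (old_list[j].relid == new_list[i].relid)
            {
                new_list[i].last_start_time = old_list[j].last_start_time;
                break;
            }
        }
    }
}

/*
 * Return true and record the launch time if the relation has never been
 * launched, or if at least retry_interval_ms has elapsed since its last
 * launch. Otherwise return false and leave the state untouched.
 */
static bool
may_launch_sync_worker(RelSyncState *rel, TimestampMs now,
                       TimestampMs retry_interval_ms)
{
    assert(retry_interval_ms >= 0);

    if (rel->last_start_time != 0 &&
        now - rel->last_start_time < retry_interval_ms)
        return false;

    rel->last_start_time = now;
    return true;
}
```

With this shape, a failing table only delays its own relaunch. It does not
hold a worker slot while waiting, so other tables can still be synchronized.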
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center