Re: logical replication busy-waiting on a lock

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org,Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>,Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: logical replication busy-waiting on a lock
Date: 2017-05-29 19:28:24
Message-ID: 90A0E15D-4D52-4197-BFF7-A1814699A2E4@anarazel.de
Lists: pgsql-hackers

On May 29, 2017 12:25:35 PM PDT, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>On 29/05/17 21:21, Petr Jelinek wrote:
>> On 29/05/17 20:59, Andres Freund wrote:
>>>
>>>
>>> On May 29, 2017 11:58:05 AM PDT, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>>>> On 27/05/17 17:17, Andres Freund wrote:
>>>>>
>>>>>
>>>>> On May 27, 2017 9:48:22 AM EDT, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>>>>>> Actually, I guess it's the pid 47457 (COPY process) that is actually
>>>>>> running the xid 73322726. In that case that's the same thing Masahiko
>>>>>> Sawada reported [1], which is basically the result of the snapshot
>>>>>> builder waiting for a transaction to finish. That's normal if there
>>>>>> is a long transaction running when the snapshot is being created
>>>>>> (and the COPY is a long transaction).
>>>>>
>>>>> Hm. I suspect the issue is that the exported snapshot needs an xid
>>>>> for some crosscheck, and that's what we're waiting for. Could you
>>>>> check what happens if you don't assign one and just comment the
>>>>> error checks out? Not at my computer, just theorizing.
>>>>>
>>>>
>>>> I don't think that's it; in my opinion it's the parallelization of
>>>> the table data copy, where we create a snapshot for one process but
>>>> then the next one has to wait for the first one to finish. Before we
>>>> fixed the snapshotting, the second one would just use the on-disk
>>>> snapshot, so it would work fine (except the snapshot was corrupted,
>>>> of course). I wonder if we could somehow give it a hint to ignore the
>>>> read-only txes, but then we have no way to force the txes to stay
>>>> read-only, so it does not seem safe.
>>>
>>> Read-only txs have no xid ...
>>>
>>
>> That's what I mean by hinting: normally they don't, but building the
>> initial snapshot in the snapshot builder calls GetTopTransactionId()
>> (see SnapBuildInitialSnapshot()), which will assign it an xid.
>>
>
>Looking at the code more, the xid is only used as a parameter for
>SnapBuildBuildSnapshot(), which never does anything with that parameter.
>I wonder if it's really needed then.

Not at a computer, but from memory that'll trigger the snapshot export routine to include it. Import in turn requires the xid to check if the source is still alive. But there are better ways, e.g. using the virtual xactid.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
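
The export/import relationship described in the reply above can be illustrated with a toy model. This is only a sketch under stated assumptions and is not PostgreSQL source code: every name in it (ToyTxn, ToyExportedSnapshot, toy_export_with_xid, toy_export_with_vxid, toy_source_alive, toy_count_xid_holders) is hypothetical. It assumes that the importer only needs some way to confirm the exporting transaction is still running, and contrasts identifying the exporter by an assigned xid with identifying it by a virtual transaction id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_XID 0u  /* toy stand-in for "no xid assigned" */

/* Hypothetical toy transaction: always has a virtual id, gets an xid only on demand. */
typedef struct {
    int      backend_id;  /* virtual id, part 1 */
    uint32_t local_xid;   /* virtual id, part 2 */
    uint32_t xid;         /* INVALID_XID until one is assigned */
    bool     running;
} ToyTxn;

/* Hypothetical record handed from the exporter to the importer. */
typedef struct {
    uint32_t xid;         /* INVALID_XID when exported by virtual id only */
    int      backend_id;
    uint32_t local_xid;
} ToyExportedSnapshot;

static uint32_t toy_next_xid = 100;

/* Variant A: force xid assignment so the importer can find the source by xid. */
ToyExportedSnapshot toy_export_with_xid(ToyTxn *t)
{
    assert(t != NULL);
    if (t->xid == INVALID_XID)
        t->xid = toy_next_xid++;  /* side effect: a read-only txn now holds an xid */
    ToyExportedSnapshot s = { t->xid, t->backend_id, t->local_xid };
    return s;
}

/* Variant B: identify the source by its virtual id; no xid is assigned. */
ToyExportedSnapshot toy_export_with_vxid(const ToyTxn *t)
{
    assert(t != NULL);
    ToyExportedSnapshot s = { INVALID_XID, t->backend_id, t->local_xid };
    return s;
}

/* Importer-side check: is the exporting transaction still running? */
bool toy_source_alive(const ToyExportedSnapshot *s, const ToyTxn *txns, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const ToyTxn *t = &txns[i];
        if (!t->running)
            continue;
        if (s->xid != INVALID_XID) {
            if (t->xid == s->xid)
                return true;
        } else if (t->backend_id == s->backend_id &&
                   t->local_xid == s->local_xid) {
            return true;
        }
    }
    return false;
}

/* Running transactions holding an xid, i.e. the ones a later snapshot
 * build would have to wait for in this toy model. */
size_t toy_count_xid_holders(const ToyTxn *txns, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (txns[i].running && txns[i].xid != INVALID_XID)
            count++;
    return count;
}
```

In this model, exporting through toy_export_with_vxid leaves toy_count_xid_holders unchanged, while toy_export_with_xid adds one more transaction that a concurrent snapshot build would wait on. Both variants still let the importer verify that the source is alive.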
