Re: Improving connection scalability: GetSnapshotData()

From: Ian Barwick <ian(dot)barwick(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Bruce Momjian <bruce(at)momjian(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Improving connection scalability: GetSnapshotData()
Date: 2020-09-09 08:02:58
Message-ID: d5f14b0e-58d2-56ab-0938-ab57ecc89d77@2ndquadrant.com
Lists: pgsql-hackers

On 2020/09/08 13:23, Ian Barwick wrote:
> On 2020/09/08 13:11, Andres Freund wrote:
>> Hi,
>>
>> On 2020-09-08 13:03:01 +0900, Ian Barwick wrote:
> (...)
>>> I wonder if it's possible to increment "xactCompletionCount"
>>> during replay along these lines:
>>>
>>>      *** a/src/backend/access/transam/xact.c
>>>      --- b/src/backend/access/transam/xact.c
>>>      *************** xact_redo_commit(xl_xact_parsed_commit *
>>>      *** 5915,5920 ****
>>>      --- 5915,5924 ----
>>>               */
>>>              if (XactCompletionApplyFeedback(parsed->xinfo))
>>>                      XLogRequestWalReceiverReply();
>>>      +
>>>      +       LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
>>>      +       ShmemVariableCache->xactCompletionCount++;
>>>      +       LWLockRelease(ProcArrayLock);
>>>        }
>>>
>>> which seems to work (though quite possibly I've overlooked something I don't
>>> know that I don't know about and it will all break horribly somewhere,
>>> etc. etc.).
>>
>> We'd also need the same in a few more places. Probably worth looking at
>> the list where we increment it on the primary (particularly we need to
>> also increment it for aborts, and 2pc commit/aborts).
>
> Yup.
>
>> At first I was very confused as to why none of the existing tests have
>> found this significant issue. But after thinking about it for a minute
>> that's because they all use psql, and largely separate psql invocations
>> for each query :(. Which means that there's no cached snapshot around...
>>
>> Do you want to try to write a patch?
>
> Sure, I'll give it a go as I have some time right now.

Attached, though bear in mind I'm not very familiar with parts of this,
particularly 2PC stuff, so consider it educated guesswork.

Regards

Ian Barwick

--
Ian Barwick https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
snapshot-cache-standby-fix.v1.patch text/x-patch 1.8 KB
