Re: Improving connection scalability: GetSnapshotData()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Ian Barwick <ian(dot)barwick(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Bruce Momjian <bruce(at)momjian(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Improving connection scalability: GetSnapshotData()
Date: 2020-09-08 17:53:52
Message-ID: 20200908175352.2wby2rb5aonhbcwa@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-09-08 16:44:17 +1200, Thomas Munro wrote:
> On Tue, Sep 8, 2020 at 4:11 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > At first I was very confused as to why none of the existing tests have
> > found this significant issue. But after thinking about it for a minute
> > that's because they all use psql, and largely separate psql invocations
> > for each query :(. Which means that there's no cached snapshot around...
>
> I prototyped a TAP test patch that could maybe do the sort of thing
> you need, in patch 0006 over at [1]. Later versions of that patch set
> dropped it, because I figured out how to use the isolation tester
> instead, but I guess you can't do that for a standby test (at least
> not until someone teaches the isolation tester to support multi-node
> schedules, something that would be extremely useful...).

Unfortunately, a proper multi-node isolationtester test is basically
equivalent to building a global lock graph. I think, at least? That
includes needing to be able to correlate connections with their locks
across the nodes.

But for something like the bug at hand it'd probably be sufficient to
just "hack" something together with dblink. In session 1) insert a row
on the primary using dblink, return the LSN, and wait for that LSN to
have been replicated; finally, in session 2) check for row visibility.
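
Roughly, the sequence could look like the sketch below. This is only an
illustration, not a tested patch: it assumes both sessions are
long-lived, autocommit DB-API connections to the standby, that the
driver uses %s placeholders (as psycopg2 does), and that the dblink
extension is installed. The function name, the table "vis_test" and its
"id" column are made up for the example. It takes the LSN in a separate
statement after the insert has committed, so that the commit record is
covered by the wait.

```python
import time


def insert_and_check_visibility(session1, session2, primary_connstr,
                                table="vis_test", row_id=1,
                                timeout=30.0, poll_interval=0.1):
    """Return (rows_seen_before, rows_seen_after) as observed by session2.

    session1 and session2 are assumed to be autocommit DB-API connections
    to the standby; primary_connstr is a libpq connection string for the
    primary. The table name and row id are hypothetical.
    """
    cur1 = session1.cursor()
    cur2 = session2.cursor()
    count_sql = f"SELECT count(*) FROM {table} WHERE id = %s"

    # Session 2 runs a query first, so it is an established session that
    # has already taken a snapshot (unlike a fresh psql invocation).
    cur2.execute(count_sql, (row_id,))
    before = cur2.fetchone()[0]

    # Session 1: insert the row on the primary through dblink.
    remote_insert = f"INSERT INTO {table} (id) VALUES ({int(row_id)})"
    cur1.execute("SELECT dblink_exec(%s, %s)",
                 (primary_connstr, remote_insert))

    # Session 1: fetch the primary's WAL insert position after the commit.
    cur1.execute(
        "SELECT lsn FROM dblink(%s, 'SELECT pg_current_wal_insert_lsn()') "
        "AS t(lsn pg_lsn)",
        (primary_connstr,))
    lsn = cur1.fetchone()[0]

    # Session 1: poll until the standby has replayed at least that LSN.
    deadline = time.monotonic() + timeout
    while True:
        cur1.execute("SELECT pg_last_wal_replay_lsn() >= %s::pg_lsn", (lsn,))
        if cur1.fetchone()[0]:
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"standby did not replay up to {lsn}")
        time.sleep(poll_interval)

    # Session 2: the row should be visible now.
    cur2.execute(count_sql, (row_id,))
    after = cur2.fetchone()[0]
    return before, after
```

With the bug present, I'd expect session 2 to still not see the row in
the second check, i.e. the function would return (0, 0) instead of
(0, 1).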

Greetings,

Andres Freund
