Re: Add an optional timeout clause to isolationtester step.

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add an optional timeout clause to isolationtester step.
Date: 2020-03-10 13:53:36
Message-ID: 20200310135336.zi3mgmq6fub3jfek@nol
Lists: pgsql-hackers

On Tue, Mar 10, 2020 at 12:09:12AM -0400, Tom Lane wrote:
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote:
> >> It strikes me to wonder whether we could improve matters by teaching
> >> isolationtester to watch for particular values in a connected backend's
> >> pg_stat_activity.wait_event_type/wait_event columns. Those columns
> >> didn't exist when isolationtester was designed, IIRC, so it's not
> >> surprising that they're not used in the current design. But we could
> >> use them perhaps to detect that a backend has arrived at some state
> >> that's not a heavyweight-lock-wait state.
>
> > Interesting idea. So that would be basically an equivalent of
> > PostgresNode::poll_query_until but for the isolation tester?
>
> No, more like the existing isolationtester wait query, which watches
> for something being blocked on a heavyweight lock. Right now, that
> one depends on a bespoke function pg_isolation_test_session_is_blocked(),
> but it used to be a query on pg_stat_activity/pg_locks.

Ah interesting indeed!

> > In short
> > we gain a meta-command that runs a SELECT query that waits until the
> > query defined in the command returns true. The polling interval may
> > be tricky to set though.
>
> I think it'd be just the same as the polling interval for the existing
> wait query. We'd have to have some way to mark a script step to say
> what to check to decide that it's blocked ...

So basically we could just change pg_isolation_test_session_is_blocked() to
also return the wait_event_type and wait_event, and add something like

step "<name>" { SQL } [ cancel on "<wait_event_type>" "<wait_event>" ]

to the step definition, which should be enough. I'm attaching a POC patch for
that. On my laptop, the full test now completes in about 400ms.
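
To illustrate the intended check, here is a minimal sketch in C of how a step's
"cancel on" clause could be compared against what the wait query reports. This
is not taken from the attached patch: the StepCancelSpec struct and the
step_should_cancel() function are hypothetical names used only for this example,
and it assumes the wait query hands back wait_event_type and wait_event as
plain strings (or NULL when the backend is not waiting).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical per-step data for the proposed clause:
 *   step "<name>" { SQL } [ cancel on "<wait_event_type>" "<wait_event>" ]
 * Both fields are NULL when the step has no "cancel on" clause.
 */
typedef struct
{
	const char *cancel_wait_event_type;
	const char *cancel_wait_event;
} StepCancelSpec;

/*
 * Return true only when the backend is reported as waiting on exactly the
 * wait event named in the step's "cancel on" clause.
 */
static bool
step_should_cancel(const StepCancelSpec *spec,
				   const char *wait_event_type,
				   const char *wait_event)
{
	/* Steps without the clause are never cancelled by this check. */
	if (spec->cancel_wait_event_type == NULL ||
		spec->cancel_wait_event == NULL)
		return false;

	/* A backend that is not waiting reports no wait event at all. */
	if (wait_event_type == NULL || wait_event == NULL)
		return false;

	return strcmp(spec->cancel_wait_event_type, wait_event_type) == 0 &&
		strcmp(spec->cancel_wait_event, wait_event) == 0;
}
```

With such a check, a step declared with cancel on "Lock" "virtualxid" would be
cancelled once the backend shows that wait event, and left alone while it is
waiting on anything else.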

FTR the REINDEX TABLE CONCURRENTLY case is eventually blocked on a virtualxid
lock; I'm not sure whether that could lead to a too-early cancellation.

Attachments:
v2-0001-Add-an-optional-cancel-on-clause-to-isolationtest.patch (text/x-diff, 13.1 KB)
v2-0002-Add-regression-tests-for-failed-REINDEX-TABLE-CON.patch (text/x-diff, 4.7 KB)
