Re: Add an optional timeout clause to isolationtester step.

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add an optional timeout clause to isolationtester step.
Date: 2020-03-07 06:16:15
Message-ID: 20200307061615.c47sju2bytuiaazv@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote:
> On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote:
> > Here's a patch to add an optional "timeout val" clause to isolationtester's
> > step definition. When used, isolationtester will actively wait on the query
> > rather than continuing with the permutation next step, and will issue a cancel
> > once the defined timeout is reached. I also added as a POC the previous
> > regression tests for invalid TOAST indexes, updated to use this new
> > infrastructure (which won't pass as long as the original bug for invalid TOAST
> > indexes isn't fixed).
>
> One problem with this approach is that it does address the stability
> of the test on very slow machines, and there are some of them in the
> buildfarm. Taking your patch, I can make the test fail by applying
> the following sleep because the query would be cancelled before some
> of the indexes are marked as invalid:
> --- a/src/backend/commands/indexcmds.c
> +++ b/src/backend/commands/indexcmds.c
> @@ -3046,6 +3046,8 @@ ReindexRelationConcurrently(Oid relationOid, int
> options)
> CommitTransactionCommand();
> StartTransactionCommand();
>
> + pg_usleep(100000L * 10); /* 10s */
> +
> /*
> * Phase 2 of REINDEX CONCURRENTLY
>
> Another problem is that on faster machines the test is slow because of
> the timeout used. What are your thoughts about having instead a
> cancel meta-command instead?

Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose
to fix this problem by having a timeout long enough to statisfy the slower
buildfarm members, even when running on fast machines, so I assumed that the
same approach could be used here.

I agree that the 1s timeout I used is maybe too low, but that's easy enough to
change. Another point is that it's possible to have a close behavior without
this patch by using a statement_timeout (the active wait does change things
though), but the spec files would be more verbose.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2020-03-07 07:25:22 Remove utils/acl.h from catalog/objectaddress.h
Previous Message Dilip Kumar 2020-03-07 05:45:27 Re: Fastpath while arranging the changes in LSN order in logical decoding