Re: Simplify backend terminate and wait logic in postgres_fdw test

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Subject: Re: Simplify backend terminate and wait logic in postgres_fdw test
Date: 2021-05-03 22:42:51
Message-ID: 3854538.1620081771@sss.pgh.pa.us
Lists: pgsql-hackers

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> On Tue, Apr 13, 2021 at 04:39:58PM +0900, Michael Paquier wrote:
>> Looks fine to me. Let's wait a bit first to see if Fujii-san has any
>> objections to this cleanup as that's his code originally, from
>> 32a9c0bd.

> And hearing nothing, done. The tests of postgres_fdw are getting much
> faster for me now, from basically 6s to 4s (actually that's roughly
> 1.8s of gain as pg_wait_until_termination waits at least 100ms,
> twice), so that's a nice gain.

The buildfarm is showing that one of these test queries is not stable
under CLOBBER_CACHE_ALWAYS:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-05-01%2007%3A44%3A47

of which the relevant part is:

diff -U3 /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/expected/postgres_fdw.out /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/results/postgres_fdw.out
--- /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/expected/postgres_fdw.out 2021-05-01 03:44:45.022300613 -0400
+++ /home/buildfarm/buildroot/HEAD/pgsql.build/contrib/postgres_fdw/results/postgres_fdw.out 2021-05-03 09:11:24.051379288 -0400
@@ -9215,8 +9215,7 @@
     WHERE application_name = 'fdw_retry_check';
  pg_terminate_backend 
 ----------------------
- t
-(1 row)
+(0 rows)

 -- This query should detect the broken connection when starting new remote
 -- transaction, reestablish new connection, and then succeed.

I can reproduce that locally by setting

alter system set debug_invalidate_system_caches_always = 1;

and running "make installcheck" in contrib/postgres_fdw.
(It takes a good long time to run the whole test script
though, so you might want to see if running just these few
queries is enough.)
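
A minimal sketch of that reproduction setup, run as a superuser. The
GUC name is the one given above; the reload call and the RESET step
afterwards are assumptions about how one would apply and undo the
setting, not something stated in this message.

```sql
-- Turn on cache clobbering server-wide (GUC name as given in the message).
ALTER SYSTEM SET debug_invalidate_system_caches_always = 1;
-- Assumption: a configuration reload is enough for the setting to take effect.
SELECT pg_reload_conf();

-- ... run "make installcheck" in contrib/postgres_fdw here ...

-- Assumption: undo the setting afterwards so other sessions are not slowed down.
ALTER SYSTEM RESET debug_invalidate_system_caches_always;
SELECT pg_reload_conf();
```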

There's no evidence of distress in the postmaster log,
so I suspect this might just be a timing instability,
e.g. remote process already gone before local process
looks. If so, it's probably hopeless to make this
test stable as-is. Perhaps we should just take it out.
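
One way to check that guess, as a hedged sketch only: count the
matching backends immediately before the terminate step. The filter
is the one visible in the diff above; the count query itself is a
hypothetical diagnostic, not part of the test script.

```sql
-- Hypothetical diagnostic, run just before the terminate step.
-- A count of 0 would mean no backend matches the filter, in which case
-- a query calling pg_terminate_backend() over the same rows would
-- return "(0 rows)" instead of a single "t".
SELECT count(*) AS matching_backends
  FROM pg_stat_activity
 WHERE application_name = 'fdw_retry_check';
```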

regards, tom lane
