Re: POC: Better infrastructure for automated testing of concurrency issues

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Better infrastructure for automated testing of concurrency issues
Date: 2020-12-04 21:20:27
Message-ID: CAPpHfduENSKoCOobPmk_JXxAKGj3ETnL2fH9WWSUOMMYrMRUWw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 4, 2020 at 9:57 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Wed, Nov 25, 2020 at 6:11 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > While the postgres community does a great job on investigating and fixing the problems, our ability to reproduce concurrency issues in the source code test suites is limited.
>
> +1. This seems really cool.
>
> > For sure, evaluation of stop events causes a CPU overhead. This is why it's controlled by enable_stopevents GUC, which is off by default. I expect the overhead with enable_stopevents = off shouldn't be observable. Even if it would be observable, we could enable stop events only by specific configure parameter. There is also trace_stopevents GUC, which traces all the stop events to the log with debug2 level.
>
> But why even risk adding noticeable overhead when "enable_stopevents =
> off"? Even if it's a very small risk? We can still get most of the
> benefit by enabling it only on certain builds and buildfarm animals.
> It will be a bit annoying to not have stop events enabled in all
> builds, but it avoids the problem of even having to think about the
> overhead, now or in the future. I think that that trade-off is a good
> one. Even if the performance trade-off is judged perfectly for the
> first few tests you add, what are the chances that it will stay that
> way as the infrastructure is used in more and more places? What if you
> need to add a test to the back branches? Since we don't anticipate any
> direct benefit for users (right?), I think that this question is
> simple.
>
> I am not arguing for not enabling stop events on standard builds
> because the infrastructure isn't useful -- it's *very* useful. Useful
> enough that it would be nice to be able to use it extensively without
> really thinking about the performance hit each time. I know that I'll
> be *far* more likely to use it if I don't have to waste time and
> energy on that aspect every single time.

Thank you for your feedback. We probably can't think everything
through in advance. We can start with a configure option that is
enabled for developers and on some buildfarm animals. That poses no
risk of overhead in standard builds. After some time, we may
reconsider enabling stop events in standard builds too, if we see
that they cause no regression.
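
To make the trade-off concrete, here is a minimal sketch of how a
build-time switch could remove stop events entirely from standard
builds. It is an illustration under stated assumptions, not code from
the patch: USE_STOPEVENTS, STOPEVENT(), handle_stopevent() and
simulated_insert() are hypothetical names, and the enable_stopevents
GUC is modeled as a plain global variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the enable_stopevents GUC; a plain global in this sketch. */
bool enable_stopevents = false;

/* Number of stop events that actually fired (for demonstration only). */
int stopevents_fired = 0;

#ifdef USE_STOPEVENTS   /* hypothetical symbol, imagined as set by configure */

/* Hypothetical handler: a real one would evaluate a condition and wait. */
static void
handle_stopevent(const char *name)
{
    assert(name != NULL);
    stopevents_fired++;
    fprintf(stderr, "stop event reached: %s\n", name);
}

/* Enabled build: each call site costs one branch on the runtime flag. */
#define STOPEVENT(name) \
    do { \
        if (enable_stopevents) \
            handle_stopevent(name); \
    } while (0)

#else

/* Standard build: the call site compiles to nothing at all. */
#define STOPEVENT(name) ((void) 0)

#endif

/* Hypothetical code path instrumented with a stop event. */
int
simulated_insert(void)
{
    STOPEVENT("before_insert");
    /* ... the real work would happen here ... */
    return stopevents_fired;
}
```

Compiled without -DUSE_STOPEVENTS, the macro expands to an empty
expression, so simulated_insert() never fires a stop event, whatever
the value of enable_stopevents. Compiled with -DUSE_STOPEVENTS, the
only cost with the flag off is a single branch per call site.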

------
Regards,
Alexander Korotkov
