Re: test_autovacuum/001_parallel_autovacuum is broken

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Daniil Davydov <3danissimo(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: test_autovacuum/001_parallel_autovacuum is broken
Date: 2026-04-07 18:02:49
Message-ID: CAA5RZ0tExiffcu7qvrUbpq_qqz=zCD2aJ5_Qigo6eP2kgTx3eQ@mail.gmail.com

> On Tue, Apr 7, 2026 at 11:33 PM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> >
> > > The proposed patch fixes the problem, but I am thinking about possible new
> > > tests for parallel a/v. What if some of them will require both injection points
> > > and wait_for_autovacuum_complete call?
> >
> > Yeah, we could use dynamically named injection points, which I don't
> > see a precedent for. For example, we could construct a name like
> > "autovacuum-test-<relname>" and have the autovacuum code path
> > generate the injection point name dynamically based on the relation
> > being vacuumed. That way, the test only blocks on the specific table
> > it cares about.
> >
>
> I am afraid that this would be too rough a workaround for this problem.

Perhaps, but I don't see it being unreasonable for injection points.
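
To make the dynamic-naming idea concrete, here is a minimal sketch of how
such a name could be built. The helper name and the buffer-size constant
are hypothetical and exist only for this illustration; they are not part
of the injection point API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size for this sketch; not a PostgreSQL constant. */
#define SKETCH_INJ_NAME_MAXLEN 64

/*
 * Build "autovacuum-test-<relname>" into buf.  Returns 0 on success, or -1
 * if the resulting name would not fit in the buffer.
 */
static int
build_autovacuum_point_name(char *buf, size_t buflen, const char *relname)
{
    int         n = snprintf(buf, buflen, "autovacuum-test-%s", relname);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The autovacuum code path would pass the resulting name to the injection
point machinery, and the test would attach only to the name that matches
its own table.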

I guess we can also think about expanding InjectionPointCondition to
handle other types of conditions, maybe an OID, so that the point can be
filtered at run time.

/*
 * Conditions related to injection points.  This tracks in shared memory the
 * runtime conditions under which an injection point is allowed to run,
 * stored as private_data when an injection point is attached, and passed as
 * argument to the callback.
 *
 * If more types of runtime conditions need to be tracked, this structure
 * should be expanded.
 */
typedef enum InjectionPointConditionType
{
    INJ_CONDITION_ALWAYS = 0,   /* always run */
    INJ_CONDITION_PID,          /* PID restriction */
} InjectionPointConditionType;

typedef struct InjectionPointCondition
{
    /* Type of the condition */
    InjectionPointConditionType type;

    /* ID of the process where the injection point is allowed to run */
    int         pid;
} InjectionPointCondition;
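
As a rough illustration of what an OID-based condition might look like,
the standalone sketch below mirrors the structure above under different
names. Everything prefixed with Sketch/SKETCH, the relid field, and the
matching function are assumptions made for this example; none of them
exist in the tree, and Oid is redefined locally only so the sketch
compiles on its own.

```c
#include <stdbool.h>

/* Stand-in for PostgreSQL's Oid type so the sketch compiles on its own. */
typedef unsigned int Oid;

typedef enum SketchConditionType
{
    SKETCH_CONDITION_ALWAYS = 0,    /* always run */
    SKETCH_CONDITION_PID,           /* PID restriction */
    SKETCH_CONDITION_RELID,         /* hypothetical: relation OID restriction */
} SketchConditionType;

typedef struct SketchCondition
{
    SketchConditionType type;
    int         pid;                /* used by SKETCH_CONDITION_PID */
    Oid         relid;              /* used by SKETCH_CONDITION_RELID */
} SketchCondition;

/* Decide whether the point should run for this process and relation. */
static bool
sketch_condition_matches(const SketchCondition *cond, int my_pid, Oid cur_relid)
{
    switch (cond->type)
    {
        case SKETCH_CONDITION_ALWAYS:
            return true;
        case SKETCH_CONDITION_PID:
            return cond->pid == my_pid;
        case SKETCH_CONDITION_RELID:
            return cond->relid == cur_relid;
    }
    return false;               /* unknown condition type: do not run */
}
```

With something along these lines, a test could attach a point that fires
only while the table it cares about is being vacuumed, without needing a
per-table point name.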

> We also have an "autovacuum_parallel_workers" reloption that can additionally
> limit the number of parallel workers for the table. Default value of the
> reloption is "-1" which means "use the GUC parameter's value". I.e. when we are
> setting the GUC parameter to N, then every table automatically allows N
> parallel a/v workers. If autovacuum_max_parallel_workers = 0 then no one can
> launch parallel workers for autovacuum, even if reloption is > 0. Thus,
> autovacuum_max_parallel_workers is the main limiter during the number of
> parallel workers calculation.

autovacuum_max_parallel_workers being the limiter is a desirable
attribute. Otherwise, users could set the GUC to 0 and still set whatever
they want at the per-table level, guarded only by max_parallel_workers.
That sounds to me pretty easy to misconfigure and hard to manage.
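
For reference, the current rule as described in the quoted text can be
sketched as below. The function name is hypothetical, and taking the
minimum of the two settings is my reading of "additionally limit"; the
real calculation may apply further limits that are not shown here.

```c
/*
 * Sketch of the current rule.
 *   guc:       autovacuum_max_parallel_workers
 *   reloption: autovacuum_parallel_workers (-1 means "use the GUC value")
 */
static int
sketch_parallel_av_workers_limit(int guc, int reloption)
{
    if (guc <= 0)
        return 0;               /* GUC at 0: no parallel a/v for any table */
    if (reloption < 0)
        return guc;             /* reloption unset: fall back to the GUC */
    return (reloption < guc) ? reloption : guc; /* GUC caps the reloption */
}
```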

> But I suggest an alternative idea - allow reloption to override GUC parameter.
> So even if autovacuum_max_parallel_workers is 0 we still can enable parallel
> a/v for a particular table via reloption.
>
> This approach allows us to rework the test as follows :
> 1) Keep the default value of GUC parameter which means that no table allows
> parallel a/v.
> 2) Set reloption of a particular table to N (allow parallel a/v for this and
> only this table).
>
> This approach may also be very useful in large productions. You can read
> discussion about it from here [1] up to the end of the thread. Since the
> question is still open, all feedback is welcome!
>
> [1] https://www.postgresql.org/message-id/CAJDiXgj3A%3DwNC-S0z3TixmnVUkifs%3D07yLLHJ7_%2BdDsakft1tA%40mail.gmail.com
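
The alternative proposed in the quoted text, where the reloption overrides
the GUC, might look like the sketch below. The function name is
hypothetical, and keeping -1 as "use the GUC value" is an assumption
carried over from the existing reloption default; the linked thread is
the place to check the actual proposal.

```c
/*
 * Sketch of the proposed override rule.
 *   guc:       autovacuum_max_parallel_workers
 *   reloption: autovacuum_parallel_workers (-1 means "use the GUC value")
 */
static int
sketch_parallel_av_workers_override(int guc, int reloption)
{
    if (reloption >= 0)
        return reloption;       /* an explicit per-table setting wins */
    return guc;                 /* otherwise fall back to the GUC */
}
```

Under this rule the reworked test scenario (GUC left at 0, reloption set
to N on one table) yields N workers for that table and 0 for every other
table, which is also why the per-table value would then be guarded only
by max_parallel_workers.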

Thanks!

--
Sami
