Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken
Date: 2023-03-10 00:46:39
Message-ID: 705469.1678409199@sss.pgh.pa.us
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Erm, but maybe I'm just looking at this too myopically. Is there
> really any point in letting people set it to 0.5, if it behaves as if
> you'd set it to 1 and doubled the cost limit? Isn't it just more
> confusing? I haven't read the discussion from when fractional delays
> came in, where I imagine that must have come up...

At [1] I argued

>> The reason is this: what we want to do is throttle VACUUM's I/O demand,
>> and by "throttle" I mean "gradually reduce". There is nothing gradual
>> about issuing a few million I/Os and then sleeping for many milliseconds;
>> that'll just produce spikes and valleys in the I/O demand. Ideally,
>> what we'd have it do is sleep for a very short interval after each I/O.
>> But that's not too practical, both for code-structure reasons and because
>> most platforms don't give us a way to so finely control the length of a
>> sleep. Hence the design of sleeping for a while after every so many I/Os.
>>
>> However, the current settings are predicated on the assumption that
>> you can't get the kernel to give you a sleep of less than circa 10ms.
>> That assumption is way outdated, I believe; poking around on systems
>> I have here, the minimum delay time using pg_usleep(1) seems to be
>> generally less than 100us, and frequently less than 10us, on anything
>> released in the last decade.
>>
>> I propose therefore that instead of increasing vacuum_cost_limit,
>> what we ought to be doing is reducing vacuum_cost_delay by a similar
>> factor. And, to provide some daylight for people to reduce it even
>> more, we ought to arrange for it to be specifiable in microseconds
>> not milliseconds. There's no GUC_UNIT_US right now, but it's time.

That last point was later overruled in favor of keeping it measured in
msec to avoid breaking existing configuration files. Nonetheless,
vacuum_cost_delay *is* an actual time to wait (conceptually at least),
not just part of a unitless ratio; and there seem to be good arguments
in favor of letting people make it small.
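
To make the ratio-versus-time distinction concrete, here is a rough sketch of the "sleep after every so many I/Os" scheme. It is a simplified model and not PostgreSQL's actual code: the function simulate_throttle, the struct ThrottleResult, and the flat per-I/O cost are all hypothetical names and assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical summary of one simulated throttled run. */
typedef struct ThrottleResult
{
    double total_sleep_ms;  /* total time spent sleeping */
    long   naps;            /* number of sleeps taken */
    long   max_burst_cost;  /* largest cost accrued between two sleeps */
} ThrottleResult;

/*
 * Simplified model: every I/O adds cost_per_io to a balance; once the
 * balance reaches cost_limit, sleep for delay_ms and reset the balance.
 */
ThrottleResult
simulate_throttle(long total_cost, long cost_per_io,
                  long cost_limit, double delay_ms)
{
    ThrottleResult r = {0.0, 0, 0};
    long balance = 0;

    assert(cost_per_io > 0 && cost_limit > 0 && delay_ms > 0);

    for (long done = 0; done < total_cost; done += cost_per_io)
    {
        balance += cost_per_io;
        if (balance >= cost_limit)
        {
            if (balance > r.max_burst_cost)
                r.max_burst_cost = balance;
            r.total_sleep_ms += delay_ms;
            r.naps++;
            balance = 0;
        }
    }
    return r;
}
```

Under this model, simulate_throttle(200000, 1, 200, 0.5) and simulate_throttle(200000, 1, 400, 1.0) sleep for the same total time, but the first takes twice as many naps and lets only half as much work through between them. That is the sense in which a smaller delay smooths the I/O demand rather than merely rescaling a ratio.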

I take your point that really short sleeps are inefficient so far as the
scheduling overhead goes. But on modern machines you probably have to get
down to a not-very-large number of microseconds before that's a big deal.
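
Anyone wanting to check this on their own hardware could use something like the sketch below. It assumes a POSIX system and uses nanosleep() with clock_gettime() as a stand-in for pg_usleep(); the helper names elapsed_us and min_observed_sleep_us are made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <time.h>

/* Microseconds elapsed between two timespec readings. */
double
elapsed_us(const struct timespec *start, const struct timespec *end)
{
    return (end->tv_sec - start->tv_sec) * 1e6 +
           (end->tv_nsec - start->tv_nsec) / 1e3;
}

/*
 * Request a sleep of requested_us microseconds, trials times, and
 * return the shortest wall-clock duration actually observed (in us).
 * Early wakeups caused by signals are not handled in this sketch.
 */
double
min_observed_sleep_us(long requested_us, int trials)
{
    struct timespec req;
    double best = -1.0;

    assert(requested_us >= 0 && requested_us < 1000000 && trials > 0);

    req.tv_sec = 0;
    req.tv_nsec = requested_us * 1000L;

    for (int i = 0; i < trials; i++)
    {
        struct timespec t0, t1;
        double us;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        us = elapsed_us(&t0, &t1);
        if (best < 0 || us < best)
            best = us;
    }
    return best;
}
```

Calling min_observed_sleep_us(1, 1000) gives a rough floor on how short a sleep the kernel will actually deliver; the gap between that figure and the requested 1us is the per-sleep scheduling overhead being discussed here. Results will vary with kernel, timer configuration, and system load.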

regards, tom lane

[1] https://www.postgresql.org/message-id/28720.1552101086%40sss.pgh.pa.us
