Option to not use ringbuffer in VACUUM, using it in failsafe mode

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Date: 2023-01-11 18:27:20
Message-ID: 20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

The use of the ringbuffer in VACUUM often causes very substantial slowdowns.

The primary reason is that most/all of the buffers in the ringbuffer have
typically been dirtied by the time they are processed, with an associated WAL
record. When we then reuse a buffer via the (quite small) ringbuffer, we first
need to flush the WAL up to that buffer's LSN before the buffer can be
reclaimed. A synchronous WAL flush every few buffers ends up as a very
significant bottleneck, unless you have local low-latency durable storage
(i.e. local storage with some sort of power protection).

These frequent WAL flushes can easily end up dominating the total runtime of
the VACUUM (see the wal_sync_time numbers below).

A secondary issue is that, when we end up doing multiple passes, we have to
re-read data into shared_buffers that we only just wrote out and evicted.
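
That overhead is easy to see in pg_stat_wal. A minimal sketch of how numbers
like the wal_sync / wal_sync_time figures below can be gathered (the table
name is just a placeholder; the wal_*_time columns need track_wal_io_timing
enabled, and pg_stat_wal is cluster-wide, so this only isolates the VACUUM on
an otherwise idle instance):

-- reset cluster-wide WAL stats so the counters only reflect the VACUUM
-- (requires appropriate privileges)
SELECT pg_stat_reset_shared('wal');

VACUUM (VERBOSE, FREEZE) some_table;

-- wal_sync / wal_sync_time count how often, and for how long, WAL had to
-- be synced to disk
SELECT wal_records, wal_fpi, wal_bytes, wal_buffers_full,
       wal_write, wal_sync, wal_write_time, wal_sync_time
FROM pg_stat_wal;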

An example:

This was run on a local SSD with decently fast fdatasync ([1]). The table is
3322MB, with ~10% updated and ~30% deleted tuples, and a single index.
maintenance_work_mem (m_w_m) is large enough to do this in one pass. I used
pg_prewarm() on another relation to ensure the vacuumed table isn't already in
shared_buffers (s_b), as otherwise the ringbuffer isn't doing anything.
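
Roughly, the setup can be reproduced along these lines (a sketch only, not the
exact script used; row counts and filler width are approximations, the
pg_prewarm extension is assumed to be installed, and the prewarm target is
just some other sufficiently large relation with a placeholder name):

CREATE TABLE copytest_0 (id int, filler text);
CREATE INDEX copytest_0_id_idx ON copytest_0 (id);

-- ~10M rows, ending up at roughly the table size mentioned above
INSERT INTO copytest_0
SELECT g, repeat('x', 250) FROM generate_series(1, 10000000) g;

-- ~10% updated, ~30% deleted
UPDATE copytest_0 SET filler = filler WHERE id % 10 = 0;
DELETE FROM copytest_0 WHERE id % 10 IN (1, 2, 3);

-- evict copytest_0 from s_b by prewarming some other large relation,
-- then vacuum aggressively
SELECT pg_prewarm('some_other_large_table');
VACUUM (VERBOSE, FREEZE) copytest_0;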

s_b     ringbuffer enabled   time       wal_syncs   wal_sync_time
128MB   1                    77797ms
128MB   0                    13676ms    241         2538ms
8GB     1                    72834ms    23976       51989ms
8GB     0                    9544ms     150         1634ms

See [2] for logs / stats of the 128MB and 8GB runs. All the data here is in
the OS page cache, so we don't even pay the real price for reading the data
multiple times.

On cloud hardware with higher fsync latency I've seen > 15x time differences
between using the ringbuffers and avoiding them by using pg_prewarm.
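
That workaround is simply to load the table into s_b before vacuuming, so
VACUUM finds the pages already resident and the ring buffer never comes into
play. Assuming the pg_prewarm extension is available and s_b is large enough:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
-- pull the heap into shared_buffers so VACUUM finds the pages there
SELECT pg_prewarm('copytest_0');
VACUUM (VERBOSE, FREEZE) copytest_0;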

Of course there's a good reason we have the ringbuffer - we don't want
maintenance operations to completely disturb the buffer pool and harm the
production workload. But if, e.g., the database isn't available due to
anti-wraparound measures, there's no other workload to protect, and the
ringbuffer substantially reduces availability. Initial data loads could
similarly benefit.

Therefore I'd like to add an option to the VACUUM command that disables the
use of the ringbuffer. I'm not sure about the name yet.
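
Purely as an illustration of the interface (the option name below is just a
placeholder, not a proposal), usage might end up looking something like:

VACUUM (RING_BUFFER off) copytest_0;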

I think we should auto-enable that mode once the wraparound failsafe kicks in,
similar to how [auto]vacuum cost delays get disabled
(cf. lazy_check_wraparound_failsafe()). If things are bad enough that we're
soon going to shut down, we want to be aggressive.
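
For context, the failsafe kicks in during a VACUUM once the table's
relfrozenxid is older than vacuum_failsafe_age (or the multixact equivalent).
A rough way to see how close relations are to that point (illustration only,
not part of the proposal):

SELECT c.oid::regclass AS relation,
       age(c.relfrozenxid) AS xid_age,
       current_setting('vacuum_failsafe_age')::int AS failsafe_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 10;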

Greetings,

Andres Freund

[1] according to pg_test_fsync:
fdatasync 769.189 ops/sec 1300 usecs/op

[2]

For the s_b 128MB case:

ringbuffers enabled:

2023-01-11 10:24:58.726 PST [355353][client backend][2/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:26:19.488 PST [355353][client backend][2/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
pages: 0 removed, 424975 remain, 424975 scanned (100.00% of total)
tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
removable cutoff: 2981, which was 0 XIDs old when operation ended
new relfrozenxid: 2981, which is 102 XIDs ahead of previous value
frozen: 424975 pages from table (100.00% of total) had 6666700 tuples frozen
index scan needed: 424975 pages from table (100.00% of total) had 4325101 dead item identifiers removed
index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
I/O timings: read: 2284.292 ms, write: 4325.009 ms
avg read rate: 83.203 MB/s, avg write rate: 83.199 MB/s
buffer usage: 425044 hits, 860113 misses, 860074 dirtied
WAL usage: 1709902 records, 434990 full page images, 2273501683 bytes
system usage: CPU: user: 11.62 s, system: 11.86 s, elapsed: 80.76 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │ stats_reset │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│ 8092795 │ 443356 │ 2999358740 │ 1569 │ 28651 │ 27081 │ 1874.391 │ 59895.674 │ 2023-01-11 10:24:58.664859-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘

ringbuffers disabled:

2023-01-11 10:23:05.081 PST [355054][client backend][2/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:23:18.755 PST [355054][client backend][2/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
pages: 0 removed, 424979 remain, 424979 scanned (100.00% of total)
tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
removable cutoff: 2879, which was 0 XIDs old when operation ended
new relfrozenxid: 2879, which is 102 XIDs ahead of previous value
frozen: 424979 pages from table (100.00% of total) had 6666700 tuples frozen
index scan needed: 424979 pages from table (100.00% of total) had 4325176 dead item identifiers removed
index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
I/O timings: read: 1247.366 ms, write: 2888.756 ms
avg read rate: 491.485 MB/s, avg write rate: 491.395 MB/s
buffer usage: 424927 hits, 860242 misses, 860083 dirtied
WAL usage: 1709918 records, 434994 full page images, 2273503049 bytes
system usage: CPU: user: 5.42 s, system: 6.26 s, elapsed: 13.67 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │ stats_reset │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│ 8092963 │ 443362 │ 2999373996 │ 212190 │ 212333 │ 241 │ 1209.516 │ 2538.706 │ 2023-01-11 10:23:05.004783-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘

For the s_b 8GB case:

ringbuffers enabled:

2023-01-11 10:04:12.479 PST [352665][client backend][2/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:05:25.312 PST [352665][client backend][2/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
pages: 0 removed, 424977 remain, 424977 scanned (100.00% of total)
tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
removable cutoff: 2675, which was 0 XIDs old when operation ended
new relfrozenxid: 2675, which is 102 XIDs ahead of previous value
frozen: 424977 pages from table (100.00% of total) had 6666700 tuples frozen
index scan needed: 424977 pages from table (100.00% of total) had 4325066 dead item identifiers removed
index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
I/O timings: read: 2610.875 ms, write: 4177.842 ms
avg read rate: 81.688 MB/s, avg write rate: 87.515 MB/s
buffer usage: 523611 hits, 761552 misses, 815868 dirtied
WAL usage: 1709910 records, 434992 full page images, 2273502729 bytes
system usage: CPU: user: 11.00 s, system: 11.86 s, elapsed: 72.83 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │ stats_reset │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│ 8092707 │ 443358 │ 2999354090 │ 42259 │ 66227 │ 23976 │ 2050.963 │ 51989.099 │ 2023-01-11 10:04:12.404054-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘

ringbuffers disabled:

2023-01-11 10:08:48.414 PST [353287][client backend][3/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:08:57.956 PST [353287][client backend][3/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
pages: 0 removed, 424977 remain, 424977 scanned (100.00% of total)
tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
removable cutoff: 2777, which was 0 XIDs old when operation ended
new relfrozenxid: 2777, which is 102 XIDs ahead of previous value
frozen: 424977 pages from table (100.00% of total) had 6666700 tuples frozen
index scan needed: 424976 pages from table (100.00% of total) had 4325153 dead item identifiers removed
index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
I/O timings: read: 1040.230 ms, write: 0.000 ms
avg read rate: 312.016 MB/s, avg write rate: 356.242 MB/s
buffer usage: 904078 hits, 381084 misses, 435101 dirtied
WAL usage: 1709908 records, 434992 full page images, 2273499663 bytes
system usage: CPU: user: 5.57 s, system: 2.26 s, elapsed: 9.54 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │ stats_reset │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│ 8092933 │ 443358 │ 2999364596 │ 236354 │ 236398 │ 150 │ 1166.314 │ 1634.408 │ 2023-01-11 10:08:48.350328-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘
