Re: A concurrent VACUUM FULL?

From: "DINESH NAIR" <Dinesh_Nair(at)iitmpravartak(dot)net>
To: Erik Nordström <erik(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: A concurrent VACUUM FULL?
Date: 2025-06-30 15:23:13
Message-ID: PN4P287MB4381BAC1AFF24AC8A3CC6D449C46A@PN4P287MB4381.INDP287.PROD.OUTLOOK.COM
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Eric,

Currently the first suggested approach "Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2." sound good.

It would be great if we are able to perform concurrent reads and writes.
In OLTP environments will it lead to slowing of the queries or query performance issues !!!!

Thanks

Dinesh Nair

________________________________
From: Erik Nordström <erik(at)timescale(dot)com>
Sent: Monday, June 30, 2025 3:19 PM
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: A concurrent VACUUM FULL?

You don't often get email from erik(at)timescale(dot)com(dot) Learn why this is important<https://aka.ms/LearnAboutSenderIdentification>
Caution: This email was sent from an external source. Please verify the sender’s identity before clicking links or opening attachments.
Hi hackers,

I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it using a multi-transactional approach similar to concurrent reindexing and partition detach.

The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes), and then do the actual heap swap in a second TX2 transaction.

Having experimented a bit with this approach, I found that reindexing is an issue because that happens after the new heap has been swapped in. The current reindex during a heap swap effectively blocks reads so if one starts a new transaction after swapping heaps, it will block reads for a long time.

This made me think about two ways to handle this:

1. Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2.

2. Do a concurrent index rebuild after the heap swap.

Of the two approaches above, (2) seems easiest to implement, but the downside is that indexes would be invalid while indexes are rebuilt. Therefore, (1) seems to be the more desirable one because all the heavy lifting would be done in TX1 on the temporary heap.

Does anyone have a sense of whether approach (1) is feasible or whether there are any major blockers?

Is this worth pursuing at all or am I missing something?

Best regards,

Erik
--
Database Architect, Timescale

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Maxim Orlov 2025-06-30 15:49:20 Re: Issue with custom operator in simple case
Previous Message Sami Imseih 2025-06-30 15:19:10 Re: pg_get_multixact_members not documented