Re: Journal based VACUUM FULL

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Ryan David Sheasby <ryan27968(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Journal based VACUUM FULL
Date: 2019-02-21 15:27:06
Message-ID: 4fecdac5-131b-f0c7-06a1-2576734dec09@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/21/19 12:16 AM, Ryan David Sheasby wrote:
> I was reading on VACUUM and VACUUM FULL and saw that the current
> implementation of VACUUM FULL writes to an entirely new file and then
> switches to it, as opposed to rewriting the current file in-place. I
> assume the reason for this is safety in the event the database shuts
> down in the middle of the vacuum, the most you will lose is the progress
> of the vacuum and have to start from scratch but otherwise the database
> will retain its integrity and not become corrupted. This seems to be an
> effective but somewhat rudimentary system that could be improved. Most
> notably, the rewriting of almost the entire database takes up basically
> double the storage during the duration of the rewrite which can become
> expensive or even simply inconvenient in IaaS(and probably other)
> installations where the drive sizes are scaled on demand. Even if the
> VACUUM FULL doesn't need to run often, having to reallocate drive space
> for an effective duplication is not ideal. My suggestion is a journal
> based in-place rewrite of the database files.

Hi,

VACUUM FULL used to modify the table in-place in PostgreSQL 8.4 and
earlier but that solution was slow and did often cause plenty of index
bloat while moving the rows around in the table. Which is why PostgreSQL
9.0 switched it to rewiring the whole table and its indexes.

I have not heard many requests for bringing back the old behavior, but
I could easily have missed them. Either way I do not think there would
be much demand for an in-place VACUUM FULL unless the index bloat
problem is also solved.

Additionally I do not think that the project would want a whole new kind
of infrastructure just to solve this very narrow case. PostgreSQL
already has its own journal (the write-ahead log) which is used to
ensure crash safety, and I think any proposed solution for this would
need to use the WAL.

Andreas

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-02-21 15:33:36 Re: libpq host/hostaddr/conninfo inconsistencies
Previous Message Fabien COELHO 2019-02-21 15:18:16 Re: libpq host/hostaddr/conninfo inconsistencies