Re: Journal based VACUUM FULL

From: Ryan David Sheasby <ryan27968(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Journal based VACUUM FULL
Date: 2019-02-21 17:12:12
Message-ID: CANTTaes8UOKiWqEEqWAXyST0mJ6qVzjSUc8zqA_XVhjJ3khDoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for getting back to me. I had a small discussion with @sfrost on the
slack team and understand the issue better now. I must admit I didn't
realize that the scope of WAL extended to VACUUM operations which is why I
suggested a new journaling system. I realize now the issue is not safety(as
the WAL already sorts out that issue), but performance. I will rethink my
suggestion and let you know if I come up with a useful/performant way of
doing this.

*ThanksRyan Sheasby*

On Thu, Feb 21, 2019 at 5:27 PM Andreas Karlsson <andreas(at)proxel(dot)se> wrote:

> On 2/21/19 12:16 AM, Ryan David Sheasby wrote:
> > I was reading on VACUUM and VACUUM FULL and saw that the current
> > implementation of VACUUM FULL writes to an entirely new file and then
> > switches to it, as opposed to rewriting the current file in-place. I
> > assume the reason for this is safety in the event the database shuts
> > down in the middle of the vacuum, the most you will lose is the progress
> > of the vacuum and have to start from scratch but otherwise the database
> > will retain its integrity and not become corrupted. This seems to be an
> > effective but somewhat rudimentary system that could be improved. Most
> > notably, the rewriting of almost the entire database takes up basically
> > double the storage during the duration of the rewrite which can become
> > expensive or even simply inconvenient in IaaS(and probably other)
> > installations where the drive sizes are scaled on demand. Even if the
> > VACUUM FULL doesn't need to run often, having to reallocate drive space
> > for an effective duplication is not ideal. My suggestion is a journal
> > based in-place rewrite of the database files.
>
> Hi,
>
> VACUUM FULL used to modify the table in-place in PostgreSQL 8.4 and
> earlier but that solution was slow and did often cause plenty of index
> bloat while moving the rows around in the table. Which is why PostgreSQL
> 9.0 switched it to rewiring the whole table and its indexes.
>
> I have not heard many requests for bringing back the old behavior, but
> I could easily have missed them. Either way I do not think there would
> be much demand for an in-place VACUUM FULL unless the index bloat
> problem is also solved.
>
> Additionally I do not think that the project would want a whole new kind
> of infrastructure just to solve this very narrow case. PostgreSQL
> already has its own journal (the write-ahead log) which is used to
> ensure crash safety, and I think any proposed solution for this would
> need to use the WAL.
>
> Andreas
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-02-21 17:18:29 Re: list append syntax for postgresql.conf
Previous Message Peter Eisentraut 2019-02-21 17:11:27 Re: insensitive collations