I thought the drive behind full_page_writes = off was to reduce the
amount of data being written to pg_xlog, not to shrink the size of a
PITR log archive.
ISTM that if you want to shrink a PITR log archive you'd be able to
get good results by (b|g)zip'ing the WAL files in the archive. A
quick test on my laptop shows over a 4x reduction in size. Presumably
that'd be even larger if you increased the size of WAL segments.
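
A minimal sketch of that idea, assuming the archive step is free to call a small script. The function names and file paths are made up for illustration, and the ratio you get depends entirely on what is in the segments:

```python
import gzip
import shutil


def archive_compressed(src_path: str, dst_path: str) -> None:
    """Copy one WAL segment into the archive as a gzip file."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def restore_compressed(src_path: str, dst_path: str) -> None:
    """Decompress an archived segment back to its original bytes."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def compression_ratio(data: bytes) -> float:
    """Original size divided by gzip-compressed size."""
    return len(data) / len(gzip.compress(data))
```

Something along these lines could be called from the archive and restore commands; the WAL written to pg_xlog is untouched, so crash recovery behaves exactly as before.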
On Jan 29, 2007, at 2:15 AM, Koichi Suzuki wrote:
> This is a proposal for archive log compression keeping physical log
> in WAL.
> In PostgreSQL 8.2, the full_page_writes option came back to cut out
> log both from WAL and archive log. To deal with the partial write
> during the online backup, physical log is written only during the online
> backup.
> Although this dramatically reduces the log size, it can risk the crash
> recovery. If any page is inconsistent because of the fault, crash
> recovery doesn't work because full page images are necessary to recover
> the page in such a case. For critical use, especially in commercial use,
> we don't like to risk the crash recovery chance, while reducing the
> archive log size will be crucial too for larger databases. WAL size
> itself may be less critical, because the segments are reused cyclically.
> Here, I have a simple idea to reduce archive log size while keeping
> physical log in xlog:
> 1. Create new GUC: full_page_compress,
> 2. Turn on both the full_page_writes and full_page_compress: physical
> log will be written to WAL at the first write to a page after the
> checkpoint, just as conventional full_page_writes ON.
> 3. Unless physical log is written during the online backup, this can be
> removed from the archive log. One bit in XLR_BKP_BLOCK_MASK
> (XLR_BKP_REMOVABLE) is available to indicate this (out of four, only
> three of them are in use) and this mark can be set in XLogInsert().
> With both full_page_writes and full_page_compress on, both logical
> log and physical log will also be written to WAL with the
> XLR_BKP_REMOVABLE flag on. Having both physical and logical log in the
> same WAL is not harmful in the crash recovery. In the crash recovery,
> physical log is used if it's available. Logical log is used in the
> archive recovery, as the corresponding physical log will be removed.
> 4. The archive command (separate binary) removes physical logs if the
> XLR_BKP_REMOVABLE flag is on. Physical logs will be replaced by a
> minimal amount of information of very small size, which is used to
> restore the physical log to keep other log records' LSNs consistent.
> 5. The restore command (separate binary) restores removed physical log
> using the dummy record and restores LSN of other log records.
> 6. We need to rewrite redo functions so that they ignore the dummy
> record inserted in 5. The amount of code modification will be very
> small.
> As a result, size of the archive log becomes as small as the case with
> full_page_writes off, while the physical log is still available in the
> crash recovery, maintaining the crash recovery chance.
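
As a rough illustration of steps 3 to 6 of the quoted proposal, here is a toy model, not PostgreSQL code. The record layout, the `HEADER` size, the bit value given to `XLR_BKP_REMOVABLE`, and every function name are assumptions made up for the sketch. It only shows the bookkeeping: removable page images lose their payload in the archive, a dummy remembers the original length so later records keep their LSNs, and redo skips the dummies.

```python
from dataclasses import dataclass
from typing import List, Tuple

XLR_BKP_REMOVABLE = 0x01  # assumed bit value, for illustration only
HEADER = 8                # assumed fixed per-record header size in bytes


@dataclass(frozen=True)
class Record:
    kind: str            # "logical", "page_image" or "dummy"
    info: int            # flag bits
    length: int          # bytes the record occupies in the original WAL
    payload: bytes = b""


def make(kind: str, payload: bytes, removable: bool = False) -> Record:
    """Build a record; page images written outside a backup are removable."""
    info = XLR_BKP_REMOVABLE if removable else 0
    return Record(kind, info, HEADER + len(payload), payload)


def lsns(records: List[Record]) -> List[int]:
    """Start offset of each record, using the original lengths."""
    offsets, pos = [], 0
    for r in records:
        offsets.append(pos)
        pos += r.length
    return offsets


def archive_strip(records: List[Record]) -> List[Record]:
    """Archive side: swap removable page images for payload-free dummies."""
    out = []
    for r in records:
        if r.kind == "page_image" and r.info & XLR_BKP_REMOVABLE:
            out.append(Record("dummy", r.info, r.length))
        else:
            out.append(r)
    return out


def stored_size(records: List[Record]) -> int:
    """Bytes needed to store the records (header plus surviving payload)."""
    return sum(HEADER + len(r.payload) for r in records)


def redo(records: List[Record]) -> List[Tuple[int, str]]:
    """Archive recovery: replay everything except dummies."""
    return [(lsn, r.kind) for lsn, r in zip(lsns(records), records)
            if r.kind != "dummy"]
```

In this model a page image written during an online backup is created with `removable=False`, so it survives in the archive unchanged.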
> Comments, questions and any input are welcome.
> Koichi Suzuki, NTT Open Source Center
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)