Re: [REVIEW] Re: Compression of full-page-writes

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Rahila Syed <rahilasyed90(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-08-05 12:55:29
Message-ID: CAHGQGwHXvT4eYOZ7G-wcBa-s43KHb9O0XauqRxfH4R8ZT36jjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 23, 2014 at 5:21 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> 1. Need for compressing full page backups:
> There are good number of benchmarks done by various people on this list
> which clearly shows the need of the feature. Many people have already voiced
> their agreement on having this in core, even as a configurable parameter.

Yes!

> Having said that, IMHO we should go one step at a time. We are using pglz
> for compressing toast data for long, so we can continue to use the same for
> compressing full page images. We can simultaneously work on adding more
> algorithms to core and choose the right candidate for different scenarios
> such as toast or FPW based on test evidences. But that work can happen
> independent of this patch.

This gradual approach looks good to me. And, if the additional compression
algorithm like lz4 is always better than pglz for every scenarios, we can just
change the code so that the additional algorithm is always used. Which would
make the code simpler.

> 3. Compressing one block vs all blocks:
> Andres suggested that compressing all backup blocks in one go may give us
> better compression ratio. This is worth trying. I'm wondering what would the
> best way to do so without minimal changes to the xlog insertion code. Today,
> we add more rdata items for backup block header(s) and backup blocks
> themselves (if there is a "hole" then 2 per backup block) beyond what the
> caller has supplied. If we have to compress all the backup blocks together,
> then one approach is to copy the backup block headers and the blocks to a
> temp buffer, compress that and replace the rdata entries added previously
> with a single rdata.

Basically sounds reasonable. But, how does this logic work if there are
multiple rdata and only some of them are backup blocks?

If a "hole" is not copied to that temp buffer, ISTM that we should
change backup block header so that it contains the info for a
"hole", e.g., location that a "hole" starts. No?

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2014-08-05 13:50:49 Re: PostrgeSQL vs oracle doing 1 million sqrts am I doing it wrong?
Previous Message Michael Paquier 2014-08-05 12:50:21 Re: WAL format and API changes (9.5)