Re: [PATCHES] Full page writes improvement, code update

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Koichi Suzuki" <suzuki(dot)koichi(at)oss(dot)ntt(dot)co(dot)jp>
Cc: <pgsql-hackers(at)postgresql(dot)org>, <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PATCHES] Full page writes improvement, code update
Date: 2007-03-29 10:07:12
Message-ID: 1175162832.4386.534.camel@silverbirch.site
Lists: pgsql-hackers pgsql-patches

On Thu, 2007-03-29 at 17:50 +0900, Koichi Suzuki wrote:

> Not only full-page-writes are written as WAL record. In my proposal,
> both full-page-writes and logical log are written in a WAL record, which
> will make WAL size slightly bigger (five percent or so). If
> full_page_compress = off, only a full-page-write will be written in a
> WAL record. I thought someone will not be happy with this size growth.

OK, I see what you're doing now and agree with you that we do need a
parameter. Not sure about the name you've chosen though - it certainly
confused me until you explained.

A parameter called ..._compress indicates to me that it would reduce
something in size whereas what it actually does is increase the size of
WAL slightly. We should have a parameter name that indicates what it
actually does, otherwise some people will choose to use this parameter
even when they are not using archive_command with pg_compresslog.

Some possible names...

additional_wal_info = 'COMPRESS'
add_wal_info
wal_additional_info
wal_auxiliary_info
wal_extra_data
attach_wal_info
...
others?

I've got some ideas for the future for adding additional WAL info for
various purposes, so it might be useful to have a parameter that can
cater for multiple types of additional WAL data. Or maybe we go for
something more specific like

wal_add_compress_info = on
wal_add_XXXX_info ...
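
To make the "multiple types" idea concrete, here is a minimal sketch of how
a single list-valued parameter could be parsed into flag bits. Everything
in it is an assumption for illustration: the function name, the flag
macros, and the FUTURE_EXAMPLE kind are hypothetical, and only the
'COMPRESS' value comes from the names suggested above. It is not taken
from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits, one per kind of additional WAL data. */
#define WAL_INFO_COMPRESS        0x01	/* data needed by pg_compresslog */
#define WAL_INFO_FUTURE_EXAMPLE  0x02	/* placeholder for some later kind */

typedef struct
{
	const char *name;
	unsigned	bit;
} WalInfoKind;

static const WalInfoKind wal_info_kinds[] = {
	{"COMPRESS", WAL_INFO_COMPRESS},
	{"FUTURE_EXAMPLE", WAL_INFO_FUTURE_EXAMPLE},
};

/*
 * Parse a comma-separated value such as "COMPRESS" into flag bits.
 * Returns 0 and sets *flags on success, or -1 if any item is unknown
 * (including an empty item). Matching is case-sensitive and whitespace
 * is not skipped, to keep the sketch short.
 */
int
parse_wal_additional_info(const char *value, unsigned *flags)
{
	unsigned	result = 0;
	const char *p = value;

	while (*p != '\0')
	{
		size_t		len = strcspn(p, ",");	/* length of this item */
		size_t		nkinds = sizeof(wal_info_kinds) / sizeof(wal_info_kinds[0]);
		size_t		i;
		int			found = 0;

		for (i = 0; i < nkinds; i++)
		{
			if (strlen(wal_info_kinds[i].name) == len &&
				strncmp(p, wal_info_kinds[i].name, len) == 0)
			{
				result |= wal_info_kinds[i].bit;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;

		p += len;
		if (*p == ',')
			p++;				/* step over the separator */
	}

	*flags = result;
	return 0;
}
```

With a shape like this, a new kind of additional WAL data would only need
a new table entry rather than a new parameter, which is the trade-off
against the more specific wal_add_..._info style.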

> > In recovery.conf, I'd like to see a parameter such as
> >
> > dummy_backup_blocks = off (default) | on
> >
> > to explicitly indicate to the recovery process that backup blocks are
> > present, yet they are garbage and should be ignored. Having garbage data
> > within the system is potentially dangerous and I want to be told by the
> > user that they were expecting that and it's OK to ignore that data.
> > Otherwise I want to throw informative errors. Maybe it seems OK now, but
> > the next change to the system may have unintended consequences and it
> > may not be us making the change. "It's OK the Alien will never escape
> > from the lab" is the starting premise for many good sci-fi horrors and I
> > want to watch them, not be in one myself. :-)
> >
> > We can call it other things, of course. e.g.
> > ignore_dummy_blocks
> > decompressed_blocks
> > apply_backup_blocks
>
> So far, we don't need any modification to the recovery and redo
> functions. They ignore the dummy and apply logical logs. Also, if
> there are both full page writes and logical log, current recovery
> selects full page writes to apply.
>
> I agree to introduce this option if 8.3 code introduces any conflict to
> the current. Or, we could introduce this option for future safety. Do
> you think we should introduce this option?

Yes. You are skipping a correctness test and that should be by explicit
command only. It's no problem to include that as well, since you are
already having to specify pg_... decompress... but the recovery process
doesn't know whether or not you've done that.
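
As a rough sketch of the check I mean, under stated assumptions: the types
and the function name below are hypothetical, not the actual redo code, and
the sketch assumes recovery has some way to tell that a backup block is a
dummy. The flag corresponds to the dummy_backup_blocks setting proposed
earlier.

```c
#include <stdbool.h>

/* Hypothetical outcome of inspecting one WAL record during redo. */
typedef enum
{
	REDO_APPLY_BACKUP_BLOCK,	/* restore the full-page image */
	REDO_APPLY_LOGICAL,			/* replay the logical log data */
	REDO_ERROR					/* stop with an informative error */
} RedoAction;

/* Hypothetical summary of what a WAL record carries. */
typedef struct
{
	bool		has_backup_block;
	bool		backup_block_is_dummy;	/* assumed to be detectable */
	bool		has_logical_data;
} WalRecordInfo;

/*
 * Decide what to do with a record. A dummy backup block is skipped only
 * when the user has said so explicitly; otherwise it is an error.
 */
RedoAction
choose_redo_action(const WalRecordInfo *rec, bool dummy_backup_blocks)
{
	if (rec->has_backup_block && rec->backup_block_is_dummy)
	{
		if (!dummy_backup_blocks)
			return REDO_ERROR;	/* garbage data the user did not announce */
		return rec->has_logical_data ? REDO_APPLY_LOGICAL : REDO_ERROR;
	}

	/* With both present, the full page write takes precedence. */
	if (rec->has_backup_block)
		return REDO_APPLY_BACKUP_BLOCK;

	return rec->has_logical_data ? REDO_APPLY_LOGICAL : REDO_ERROR;
}
```

The point of the first branch is that ignoring the dummy is never the
silent default: with the setting off, a dummy block stops recovery instead
of being skipped.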

> Anyway, could you try to run pg_standby with pg_compresslog and
> pg_decompresslog?

After freeze, yes.

> ----
> Additional comment on page header removal:
>
> I found that it is not simple to keep page header in the compressed
> archive log. Because we eliminate unmarked full page writes and shift
> the rest of the WAL file data, it is not simple to keep page header as
> the page header in the compressed archive log. It is much simpler to
> remove page header as well and rebuild them. I'd like to keep current
> implementation in this point.

OK.

This is a good feature. Thanks for your patience with my comments.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com