Re: [HACKERS] Full page writes improvement, code update

From: Koichi Suzuki <suzuki(dot)koichi(at)oss(dot)ntt(dot)co(dot)jp>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Full page writes improvement, code update
Date: 2007-04-24 01:15:15
Message-ID: 462D5A23.9060706@oss.ntt.co.jp
Lists: pgsql-hackers, pgsql-patches
Hi,

Sorry, because of so many comments/questions, I'll write inline....

Josh Berkus wrote:
> Hackers,
> 
>> Writing lots of additional code simply to remove a parameter that
>> *might* be mis-interpreted doesn't sound useful to me, especially when
>> bugs may leak in that way. My take is that this is simple and useful
>> *and* we have it now; other ways don't yet exist, nor will they in time
>> for 8.3.
> 
> How about naming the parameter wal_compressable?  That would indicate pretty 
> clearly that the parameter is intended to be used with wal_compress and 
> nothing else.

Hmm, it sounds nicer.

> 
> However, I do agree with Andreas that anything which adds to WAL volume, even 
> 3%, seems like going in the wrong direction.  We already have higher log 
> output than any comparable database (higher than InnoDB by 3x) and we should 
> be looking for output to trim as well as compression.
> 
> So the relevant question is whether the patch in its current form provides 
> enough benefit to make it worthwhile for 8.3, or whether we should wait for 
> 8.4.  Questions:
> 

Before answering the questions below, I'd like to say that archive log 
optimization has to address points of view different from the current 
(up to 8.2) settings.

1) To deal with partial/inconsistent writes to the data files at crash 
recovery, we need full page writes at the first modification to each page 
after each checkpoint. This consumes much of the WAL space.

2) The full page writes in 1) are not necessary for archive recovery 
(PITR) and can be removed for this purpose. However, we need full page 
writes during hot backup to deal with partial writes by backup commands. 
This is implemented in 8.2.

3) To keep crash recovery possible and reduce the amount of archive 
log, removing unnecessary full page writes from the archive logs is a 
good choice. To do this, we need both the logical log and full page 
writes in WAL.
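
As a rough illustration of points 1) to 3), here is a toy Python sketch. It is not PostgreSQL's WAL format and not the patch's code: the WalRecord class, its fields, and compress_for_archive() are hypothetical names invented for this example. It only shows the idea that a record carrying both a logical change and a full page image can lose the image when it goes to the archive, unless it was written during a hot backup.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class WalRecord:
    """Toy WAL record (hypothetical layout, for illustration only)."""
    page: int                          # page touched by this record
    logical: bytes                     # small logical change, always present
    full_page: Optional[bytes] = None  # full page image, if one was attached
    during_backup: bool = False        # written while a hot backup was running


def compress_for_archive(records: List[WalRecord]) -> List[WalRecord]:
    """Drop full page images that archive recovery does not need.

    Images written during a hot backup are kept, because the backup
    command may have copied a partially written page.
    """
    archived = []
    for rec in records:
        if rec.full_page is not None and not rec.during_backup:
            archived.append(replace(rec, full_page=None))
        else:
            archived.append(rec)
    return archived


def total_size(records: List[WalRecord]) -> int:
    """Sum of logical payloads plus any full page images."""
    return sum(len(r.logical) + len(r.full_page or b"") for r in records)


page_image = b"\0" * 8192  # stand-in for an 8 kB page
wal = [
    WalRecord(1, b"set x=1", page_image),                      # first touch after checkpoint
    WalRecord(1, b"set x=2"),                                  # later change, no image
    WalRecord(2, b"set y=1", page_image, during_backup=True),  # image needed for the backup
]
archive = compress_for_archive(wal)
print(total_size(wal), total_size(archive))
```

The WAL itself keeps every image, so crash recovery is unaffected in this model; only the archived copy shrinks.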

I don't think there should be only one setting. It depends on how the 
database is operated. Leaving wal_add_optimization_info = off as the 
default does not bring any change in WAL and archive log handling. I 
understand some people may not be happy with an additional 3% or so 
increase in WAL size, especially people who don't need archive logs at 
all. So I prefer to leave the default off.

For users, I think this is simple enough:

1) For people happy with 8.2 settings:
    No change is needed to move to 8.3 and there's really no change.

2) For people who need to reduce archive log size but want to keep full 
page writes in WAL (to keep crash recovery possible):
    a) Add GUC parameter: wal_add_optimization_info=on
    b) Change archive command from "cp" to "pg_compresslog"
    c) Change restore command from "cp" to "pg_decompresslog"

Archive log can be stored and restored as done in older releases.
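
Steps a) to c) can be sketched as below. This assumes pg_compresslog and pg_decompresslog take the same source and destination arguments as "cp", which is how steps b) and c) read but is not spelled out in this message. The helper swap_copy_command() and the /mnt/archive path are hypothetical; %p and %f are the usual archive_command/restore_command placeholders.

```python
def swap_copy_command(command: str, replacement: str) -> str:
    """Replace a leading 'cp' program name, keeping the arguments as they are."""
    program, sep, args = command.partition(" ")
    if program != "cp":
        raise ValueError("expected a command starting with 'cp': %r" % command)
    return replacement + sep + args


# 8.2-style commands (the archive directory is a made-up example path).
old_archive_command = "cp %p /mnt/archive/%f"
old_restore_command = "cp /mnt/archive/%f %p"

settings = {
    "wal_add_optimization_info": "on",  # step a)
    "archive_command": swap_copy_command(old_archive_command, "pg_compresslog"),    # step b)
    "restore_command": swap_copy_command(old_restore_command, "pg_decompresslog"),  # step c)
}

for name, value in settings.items():
    print("%s = '%s'" % (name, value))
```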

> 1) is there any throughput benefit for platforms with fast CPU but constrained 
> I/O (e.g. 2-drive webservers)?  Any penalty for servers with plentiful I/O?

I've only run benchmarks with the archive process running, because 
wal_add_optimization_info=on does not make sense if we don't archive 
WAL. In this situation, total I/O decreases because writes to the 
archive log decrease. Because of the 3% or so increase in WAL size, WAL 
writes will increase, but the decrease in archive writes makes up for it.
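
The trade-off can be checked with a little arithmetic. The sketch below is only a model: it counts bytes written to WAL plus bytes written to the archive, takes the 3% overhead from above, and uses an archive ratio of 0.5 purely as an assumed example value (no measured ratio is given in this message). The function names are hypothetical.

```python
def total_write_volume(wal_bytes: float, wal_overhead: float, archive_ratio: float) -> float:
    """Bytes written to WAL plus bytes written to the archive.

    wal_overhead  : fractional growth of WAL (0.03 for "3% or so")
    archive_ratio : archived size divided by WAL size (1.0 means plain cp)
    """
    wal_written = wal_bytes * (1.0 + wal_overhead)
    archive_written = wal_written * archive_ratio
    return wal_written + archive_written


def break_even_archive_ratio(wal_overhead: float) -> float:
    """Archive ratio at which total writes equal the plain-cp baseline.

    Baseline is 2 * W; with the option it is W * (1 + o) * (1 + r),
    so the two are equal when r = 2 / (1 + o) - 1.
    """
    return 2.0 / (1.0 + wal_overhead) - 1.0


baseline = total_write_volume(100.0, 0.0, 1.0)      # WAL copied as-is
with_option = total_write_volume(100.0, 0.03, 0.5)  # 0.5 is an assumed ratio
print(baseline, with_option, break_even_archive_ratio(0.03))
```

Under this model, total writes drop as soon as the archived copy is more than about 6% smaller than the WAL it came from.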

> 
> 2) Will this patch make attempts to reduce WAL volume in the future 
> significantly harder?

Yes, I'd like to continue working to reduce the WAL size. It's still 
an issue when the database grows to several hundred gigabytes. Anyway, 
I think WAL size reduction has to be done in XLogInsert() or 
XLogWrite(). We need much more discussion for this. The issue will be 
how to keep crash recovery possible after inconsistent writes (with 
full_page_writes=off, we have to give it up). On the other hand we have 
to keep examining each WAL record.

> 
> 3) How is this better than command-line compression for log-shipping?  e.g. 
> why do we need it in the database?

I don't fully understand what command-line compression means. Simon 
suggested that this patch can be used with log-shipping and I agree. 
Compared with gzip or other general-purpose compression, the 
compression ratio, CPU usage and I/O of pg_compresslog are all 
considerably better.

Please let me know if you meant something different.

Regards;

-- 
-------------
Koichi Suzuki
