
Re: [HACKERS] Full page writes improvement, code update

From: Koichi Suzuki <suzuki(dot)koichi(at)oss(dot)ntt(dot)co(dot)jp>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Full page writes improvement, code update
Date: 2007-04-27 00:50:13
Lists: pgsql-hackers, pgsql-patches

Josh Berkus wrote:
> Koichi, Andreas,
>> 1) To deal with partial/inconsistent writes to the data files at crash
>> recovery, we need full page writes at the first modification to pages
>> after each checkpoint.   This consumes a lot of WAL space.
> We need to find a way around this someday.  Other DBs don't do this; it may be 
> because they're less durable, or because they fixed the problem.

Maybe both.   Fixing the problem may require some means of detecting 
partial/inconsistent writes to the data files, which may need 
additional CPU resources.

>> I don't think there should be only one setting.   It depends on how
>> the database is operated.   Leaving wal_add_optimization_info = off as
>> the default does not bring any change in WAL and archive log handling.
>> I understand some people may not be happy with an additional 3% or so
>> increase in WAL size, especially people who don't need archive logs at
>> all.   So I prefer to leave the default off.
> Except that, is there any reason to turn this off if we are archiving?  Maybe 
> it should just be slaved to archive_command ... if we're not using PITR, it's 
> off, if we are, it's on.

Hmm, this sounds like it would work.  On the other hand, there are 
existing users who are happy with the current archiving and would not 
like to change their archive command to pg_compresslog; for them, the 
archive log size would increase a bit.  I'd like to hear some more 
opinions on this.

>>> 1) is there any throughput benefit for platforms with fast CPU but
>>> constrained I/O (e.g. 2-drive webservers)?  Any penalty for servers with
>>> plentiful I/O?
>> I've only run benchmarks with the archive process running, because
>> wal_add_optimization_info=on does not make sense if we don't archive
>> WAL.   In this situation, total I/O decreases because writes to the
>> archive log decrease.   Because of the 3% or so increase in WAL size,
>> there will be an increase in WAL writes, but the decrease in archive
>> writes makes up for it.
> Yeah, I was just looking for a way to make this a performance feature.  I see 
> now that it can't be.  ;-)

As for performance, I tested the patch against 8.3 HEAD with pgbench. 
The two cases were as follows:
Case1. Archiver: cp command, wal_add_optimization_info = off
Case2. Archiver: pg_compresslog, wal_add_optimization_info = on
DB Size: 1.65GB, Total transactions: 1,000,000

Throughput was:
Case1: 632.69 TPS
Case2: 653.10 TPS ... 3% gain.

Archive Log Size:
Case1: 1.92GB
Case2: 0.57GB (about 30% of Case1)... Before compression, the size 
was 1.92GB.  Because this is based on the number of WAL segment files, 
there may be up to 16MB of error in the measurement.  Taking this into 
account, the increase in WAL I/O will be less than 1%.
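
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the percentages from the figures quoted above. The variable names are made up for illustration, and it assumes 1GB = 1024MB; it only restates the reported numbers and is not a new measurement.

```python
# Figures as reported in this message (not re-measured).
tps_case1 = 632.69        # Case1: cp archiver, wal_add_optimization_info = off
tps_case2 = 653.10        # Case2: pg_compresslog, wal_add_optimization_info = on
archive_case1_gb = 1.92   # archive log size, Case1
archive_case2_gb = 0.57   # archive log size, Case2 (after pg_compresslog)
wal_segment_mb = 16       # measurement granularity: one WAL segment file

# Relative throughput gain of Case2 over Case1, in percent.
tps_gain_pct = (tps_case2 / tps_case1 - 1.0) * 100.0

# Case2 archive size as a percentage of Case1.
archive_ratio_pct = archive_case2_gb / archive_case1_gb * 100.0

# Worst-case measurement error: one segment relative to the 1.92GB of WAL
# (assumes 1GB = 1024MB).
max_error_pct = wal_segment_mb / (archive_case1_gb * 1024.0) * 100.0

print(f"throughput gain:   {tps_gain_pct:.2f}%")
print(f"archive size:      {archive_ratio_pct:.2f}% of Case1")
print(f"measurement error: at most {max_error_pct:.2f}%")
```

The last figure is where the "less than 1%" bound comes from: the WAL size before compression matched Case1 to within one segment file.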

>>> 3) How is this better than command-line compression for log-shipping? 
>>> e.g. why do we need it in the database?
>> I don't fully understand what command-line compression means.   Simon
>> suggested that this patch can be used with log-shipping and I agree.
>> Compared with gzip or other general-purpose compression, the
>> compression ratio, CPU usage and I/O of pg_compresslog are all
>> considerably better.
> OK, that answered my question.
>> This is why I don't like Josh's suggested name of wal_compressable
>> either.
>> WAL is compressible either way, only pg_compresslog would need to be
>> more complex if you don't turn off the full page optimization. I think a
>> good name would tell that you are turning off an optimization.
>> (thus my wal_fullpage_optimization on/off)
> Well, as a PG hacker I find the name wal_fullpage_optimization quite baffling 
> and I think our general user base will find it even more so.  Now that I have 
> Koichi's explanation of the problem, I vote for simply slaving this to the 
> PITR settings and not having a separate option at all.

Could I have a more specific suggestion on this?


Koichi Suzuki
