Re: [HACKERS] Full page writes improvement, code update

From: Koichi Suzuki <suzuki(dot)koichi(at)oss(dot)ntt(dot)co(dot)jp>
To: Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Full page writes improvement, code update
Date: 2007-04-23 00:17:21
Message-ID: 462BFB11.7000409@oss.ntt.co.jp
Lists: pgsql-hackers pgsql-patches

Hi,

I don't insist on the name or the default of the GUC parameter. I'm
afraid wal_fullpage_optimization = on (default) would cause some
confusion, because the default behavior of WAL itself would then become
a bit different.

I'd like to have some more opinions on this.

Zeugswetter Andreas ADI SD wrote:
>> With DBT-2 benchmark, I've already compared the amount of WAL. The
>> result was as follows:
>>
>> Amount of WAL after 60min. run of DBT-2 benchmark
>> wal_add_optimization_info = off (default) 3.13GB
>
> how about wal_fullpage_optimization = on (default)
>
>> wal_add_optimization_info = on (new case) 3.17GB -> can be
>> optimized to 0.31GB by pg_compresslog.
>>
>> So the difference will be around a couple of percent. I think this is
>> a very good figure.
>>
>> For information,
>> DB Size: 12.35GB (120WH)
>> Checkpoint timeout: 60min. Checkpoint occurred only once in the run.
>
> Unfortunately I think DBT-2 is not a good benchmark to test the disabled
> wal optimization.
> The test should contain some larger rows (maybe some updates on large
> toasted values), and maybe more frequent checkpoints. Actually the poor
> ratio between full pages and normal WAL content in this benchmark is
> strange to begin with.
> Tom fixed a bug recently, and it would be nice to see the new ratio.
>
> Have you read Tom's comment on not really having to be able to
> reconstruct all record types from the full page image? I think that
> sounded very promising (e.g. start out with only heap insert/update).
>
> Then:
> - we would not need the wal optimization switch (the full page flag
> would always be added depending only on backup)
> - pg_compresslog would only remove such "full page" images where it
> knows how to reconstruct a "normal" WAL record from
> - with time and effort pg_compresslog would be able to compress [nearly]
> all record types' full images (no change in backend)
>
>> I don't think replacing LSN works fine. For full recovery to
>> the current time, we need both archive log and WAL.
>> Replacing LSN will make archive log LSN inconsistent with
>> WAL's LSN and the recovery will not work.
>
> WAL recovery would have had to be modified (decouple LSN from WAL
> position during recovery).
> An "archive log" would have been a valid WAL (with appropriate LSN
> advance records).
>
>> Reconstruction to regular WAL is proposed as
>> pg_decompresslog. We should be careful enough not to make
>> redo routines confused with the dummy full page writes, as
>> Simon suggested. So far, it works fine.
>
> Yes, Tom didn't like "LSN replacing" either. I withdraw my concern
> regarding pg_decompresslog.
>
> Your work in this area is extremely valuable and I hope my comments are
> not discouraging.
>
> Thank you
> Andreas
>
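
For reference, the DBT-2 figures quoted above can be checked with a short
calculation. This is only a sketch: the variable and function names are
made up for illustration and are not part of the patch. The three sizes
are the ones reported in the quoted benchmark result.

```python
# WAL volume after the 60-minute DBT-2 run, in GB (figures quoted above).
wal_default_gb = 3.13      # wal_add_optimization_info = off (default)
wal_with_info_gb = 3.17    # wal_add_optimization_info = on
wal_compressed_gb = 0.31   # the "on" case after pg_compresslog


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Extra WAL written because of the added optimization info.
overhead_pct = percent_change(wal_default_gb, wal_with_info_gb)

# How much of the archived log pg_compresslog removes.
reduction_pct = -percent_change(wal_with_info_gb, wal_compressed_gb)
compression_ratio = wal_with_info_gb / wal_compressed_gb

print(f"overhead of added info: {overhead_pct:.1f}%")
print(f"reduction by pg_compresslog: {reduction_pct:.1f}%")
print(f"compression ratio: {compression_ratio:.1f}x")
```

With these inputs the overhead comes out at a little over one percent, and
the compressed archive is roughly a tenth of the original size.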

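Andreas's suggestion, that pg_compresslog remove only those full page
images for which it knows how to reconstruct a normal WAL record, could be
pictured with the toy model below. Everything in it is an assumption made
for illustration: the WalRecord class, the record type names and the byte
sizes are hypothetical, and this is not how pg_compresslog or the backend
actually represents WAL. It only shows the selection rule.

```python
from dataclasses import dataclass, replace
from typing import List, Set

# Hypothetical record type names. Following the suggestion above, start
# out with only heap insert/update as reconstructible.
RECONSTRUCTIBLE: Set[str] = {"heap_insert", "heap_update"}


@dataclass(frozen=True)
class WalRecord:
    rtype: str                 # hypothetical record type name
    payload_bytes: int         # size of the "normal" record content
    full_page_bytes: int = 0   # size of attached full page image, 0 if none


def compress(records: List[WalRecord], reconstructible: Set[str]) -> List[WalRecord]:
    """Drop the full page image only where a normal record can stand in.

    Simplification: a real tool would have to rebuild the logical record
    from the page image; here the payload is assumed to be present already.
    """
    out = []
    for rec in records:
        if rec.full_page_bytes and rec.rtype in reconstructible:
            out.append(replace(rec, full_page_bytes=0))
        else:
            out.append(rec)  # unknown type: keep the image untouched
    return out


def total_bytes(records: List[WalRecord]) -> int:
    return sum(r.payload_bytes + r.full_page_bytes for r in records)


# 8192 is used as an example page size; payload sizes are made up.
log = [
    WalRecord("heap_insert", 100, 8192),
    WalRecord("heap_update", 150, 8192),
    WalRecord("btree_insert", 60, 8192),   # not reconstructible yet
    WalRecord("heap_insert", 100),         # no full page image attached
]
compressed = compress(log, RECONSTRUCTIBLE)
print(total_bytes(log), "->", total_bytes(compressed))
```

Adding a record type to RECONSTRUCTIBLE is the only change needed to
compress more of the log in this model, which mirrors the point that the
tool could be extended over time with no change in the backend.
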
--
-------------
Koichi Suzuki
