Re: [HACKERS] Full page writes improvement, code update

From: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
To: "Koichi Suzuki" <suzuki(dot)koichi(at)oss(dot)ntt(dot)co(dot)jp>
Cc: "Hannu Krosing" <hannu(at)skype(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] Full page writes improvement, code update
Date: 2007-04-20 08:16:15
Message-ID: E1539E0ED7043848906A8FF995BDA57901F3FC4B@m0143.s-mxs.net
Lists: pgsql-hackers pgsql-patches


> With DBT-2 benchmark, I've already compared the amount of WAL. The
> result was as follows:
>
> Amount of WAL after 60min. run of DBT-2 benchmark
> wal_add_optimization_info = off (default) 3.13GB

How about wal_fullpage_optimization = on (default) instead?

> wal_add_optimization_info = on (new case) 3.17GB -> can be
> optimized to 0.31GB by pg_compresslog.
>
> So the difference will be around a couple of percents. I think this is
> very good figure.
>
> For information,
> DB Size: 12.35GB (120WH)
> Checkpoint timeout: 60min. Checkpoint occurred only once in the run.
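
For reference, the percentages implied by the quoted figures can be checked with a few lines of Python. This is only arithmetic on the three sizes given above; the variable and function names are made up for illustration.

```python
# WAL sizes in GB, taken from the quoted DBT-2 results (60 minute run).
wal_off_gb = 3.13      # wal_add_optimization_info = off
wal_on_gb = 3.17       # wal_add_optimization_info = on
compressed_gb = 0.31   # after pg_compresslog

def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Extra WAL written because of the added optimization info.
overhead_pct = percent_change(wal_off_gb, wal_on_gb)

# Share of the archive log removed by pg_compresslog.
reduction_pct = -percent_change(wal_on_gb, compressed_gb)

print(f"overhead of extra info: {overhead_pct:.1f}%")
print(f"archive size reduction: {reduction_pct:.1f}%")
```

With these numbers the overhead comes out at about 1.3% and the archive shrinks by about 90%.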

Unfortunately I think DBT-2 is not a good benchmark for testing the case
where the WAL optimization is disabled.
The test should contain some larger rows (maybe some updates on large
toasted values), and maybe more frequent checkpoints. Actually, the poor
ratio between full pages and normal WAL content in this benchmark is
strange to begin with.
Tom fixed a bug recently, and it would be nice to see the new ratio.

Have you read Tom's comment on not really having to be able to
reconstruct all record types from the full page image? I think that
sounded very promising (e.g. start out with only heap insert/update).

Then:
- we would not need the wal optimization switch (the full page flag
would always be added depending only on backup)
- pg_compresslog would only remove those "full page" images from which it
knows how to reconstruct a "normal" WAL record
- with time and effort pg_compresslog would be able to compress [nearly]
all record types' full images (no change in backend)
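
To make the idea in the list above concrete, here is a toy sketch in Python. It is not pg_compresslog and does not use the real WAL record format: the WalRecord class, its fields, the record type names and the RECONSTRUCTIBLE set are all hypothetical, chosen only to show the "strip the full page image only where we know how to rebuild a normal record" rule.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Iterator

# Hypothetical: record types the tool already knows how to rebuild.
# Starting with heap insert/update only, as suggested above.
RECONSTRUCTIBLE = {"heap_insert", "heap_update"}

@dataclass(frozen=True)
class WalRecord:
    rec_type: str             # made-up type name, not a real rmgr id
    logical_bytes: int        # size of the "normal" record content
    full_page_bytes: int = 0  # 0 means no full page image attached

def compress(records: Iterable[WalRecord]) -> Iterator[WalRecord]:
    """Drop full page images only for record types we can reconstruct."""
    for rec in records:
        if rec.full_page_bytes and rec.rec_type in RECONSTRUCTIBLE:
            yield replace(rec, full_page_bytes=0)
        else:
            # Unknown type: keep the full page image untouched.
            yield rec

def total_bytes(records: Iterable[WalRecord]) -> int:
    return sum(r.logical_bytes + r.full_page_bytes for r in records)

sample = [
    WalRecord("heap_insert", 100, 8192),
    WalRecord("btree_split", 200, 8192),
    WalRecord("heap_update", 150),
]
compressed = list(compress(sample))
print(total_bytes(sample), "->", total_bytes(compressed))
```

Extending the set of known types later changes only the archiving tool, which is the "no change in backend" point.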

> I don't think replacing LSN works fine. For full recovery to
> the current time, we need both archive log and WAL.
> Replacing LSN will make archive log LSN inconsistent with
> WAL's LSN and the recovery will not work.

WAL recovery would have had to be modified (decouple LSN from WAL
position during recovery).
An "archive log" would have been a valid WAL (with appropriate LSN
advance records).

> Reconstruction to regular WAL is proposed as
> pg_decompresslog. We should be careful enough not to make
> redo routines confused with the dummy full page writes, as
> Simon suggested. So far, it works fine.

Yes, Tom didn't like "LSN replacing" either. I withdraw my concern
regarding pg_decompresslog.

Your work in this area is extremely valuable and I hope my comments are
not discouraging.

Thank you
Andreas
