Re: Reviewers Guide to Deferred Transactions/Transaction Guarantee

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Reviewers Guide to Deferred Transactions/Transaction Guarantee
Date: 2007-04-27 00:50:17
Message-ID: 200704270050.l3R0oH316679@momjian.us
Lists: pgsql-hackers pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Simon Riggs wrote:
> transaction_guarantee.v11.patch
> - keep current, cleanup, more comments and docs
>
> Brief Performance Analysis
> --------------------------
>
> I've tested 3 scenarios:
> 1. normal
> 2. wal_writer_delay = 100ms
> 3. wal_writer_delay = 100ms and transaction_guarantee = off
>
> On my laptop, with a scale=1 pgbench database with 1 connection I
> consistently get around 85 tps in mode (1), with a slight performance
> drop in mode (2). In mode (3) I get anywhere from 200 to 900 tps,
> depending upon how well cached everything is, with 700 tps being fairly
> typical. fsync = off gives around 900 tps.
>
> Also good speedups with multiple session tests.
>
> make installcheck passes in 120 sec in mode (3), though 155 sec in mode
> (1) and 158 sec in mode (2).
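
For scale, the figures quoted above work out as follows. This is only arithmetic on the numbers in the message (85 tps, 200 to 900 tps, 155 sec and 120 sec); the variable names are illustrative and nothing here is a new measurement.

```python
# Figures quoted in the message above; nothing here is a new measurement.
baseline_tps = 85                      # mode (1), normal
mode3_tps = {"low": 200, "typical": 700, "high": 900}

# Throughput of mode (3) relative to mode (1).
speedup = {k: round(v / baseline_tps, 1) for k, v in mode3_tps.items()}

# make installcheck wall-clock time, mode (1) versus mode (3).
installcheck_mode1_s = 155
installcheck_mode3_s = 120
time_saved_pct = round(100 * (1 - installcheck_mode3_s / installcheck_mode1_s), 1)

print(speedup)
print(time_saved_pct)
```

So the typical single-connection case is roughly an 8x gain in throughput, while the regression suite, which is not commit-bound, finishes a bit over a fifth faster.
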
>
> Basic Implementation
> --------------------
>
> xact.c
> xact.h
>
> The basic implementation simply records the LSN of the xlog commit
> record in a shared memory area, the deferred fsync cache.
>
> ipci.c
>
> The cache is protected by an LWLock called DeferredFsyncLock.
>
> lwlock.h
>
> A WALWriter process wakes up regularly to perform a background flush of
> WAL up to the point of the highest LSN in the deferred fsync cache.
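
To make the mechanism concrete, here is a toy model of the cache and one wake-up of the background writer. It is a sketch under assumptions, not the patch's code: the class, method names and the `flush_wal_upto` callback are hypothetical, a `threading.Lock` merely stands in for DeferredFsyncLock, and dropping flushed entries from the cache is my guess at the bookkeeping.

```python
import threading


class DeferredFsyncCache:
    """Toy model: commit LSNs whose WAL flush has been deferred."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for DeferredFsyncLock
        self._commit_lsn = {}           # xid -> LSN of its commit record

    def record_commit(self, xid, lsn):
        # Called at commit time instead of waiting for the WAL flush.
        with self._lock:
            self._commit_lsn[xid] = lsn

    def highest_lsn(self):
        with self._lock:
            return max(self._commit_lsn.values(), default=None)

    def forget_flushed(self, flushed_lsn):
        # Assumption: entries at or below the flushed LSN are no longer needed.
        with self._lock:
            self._commit_lsn = {x: l for x, l in self._commit_lsn.items()
                                if l > flushed_lsn}

    def pending(self):
        with self._lock:
            return dict(self._commit_lsn)


def wal_writer_cycle(cache, flush_wal_upto):
    """One wake-up of the writer: flush up to the highest cached LSN."""
    target = cache.highest_lsn()
    if target is None:
        return None                     # nothing deferred, nothing to do
    flush_wal_upto(target)              # hypothetical callback doing the real flush
    cache.forget_flushed(target)
    return target
```

A commit that arrives between `highest_lsn()` and `forget_flushed()` with a larger LSN simply stays in the cache until the next cycle.
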
>
> walwriter.c
> walwriter.h
> postmaster.c
>
> WALWriter can be enabled only at server start.
> (All above same as March 11 version)
>
> Correctness
> -----------
>
> postgres.c
>
> Only certain code paths can execute transaction_guarantee = off
> transactions, though the main code paths for OLTP allow it.
>
> xlog.c
>
> CreateCheckpoint() must protect against starting a checkpoint when
> commits are not yet flushed, so an additional flush must occur here.
>
> vacuum.c
>
> VACUUM FULL cannot move tuples until their states are all known, so this
> command triggers a background flush also.
>
> clog.c
> clog.h
> slru.c
> slru.h
>
> Changes to Clog and SLRU enforce the basic rule of WAL-before-data.
> Without them, the record of a commit in clog might reach disk before
> the WAL for that commit has been flushed. This is implemented by
> storing an LSN for each clog page.
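
The per-page LSN rule can be sketched as below. This is a simplified model with hypothetical names (`ClogPage`, `Wal`, `write_clog_page`), assuming the page LSN is the highest commit-record LSN reflected on that page; the real SLRU code is considerably more involved.

```python
class ClogPage:
    """Toy model of one clog page plus the LSN stored for it."""

    def __init__(self):
        self.committed = set()  # xids marked committed on this page
        self.lsn = 0            # highest commit-record LSN reflected on the page

    def set_committed(self, xid, commit_lsn):
        self.committed.add(xid)
        self.lsn = max(self.lsn, commit_lsn)


class Wal:
    """Minimal stand-in for the WAL: tracks only how far it has been flushed."""

    def __init__(self):
        self.flushed_upto = 0

    def flush(self, lsn):
        self.flushed_upto = max(self.flushed_upto, lsn)


def write_clog_page(page, wal, write_to_disk):
    # WAL-before-data: commit records must be durable before the page is.
    if wal.flushed_upto < page.lsn:
        wal.flush(page.lsn)
    write_to_disk(page)
```
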
>
> transam.c
> transam.h
> twophase.c
> xact.c
>
> The above files have API changes that allow the LSN at transaction
> commit to be passed through to the Clog.
>
> tqual.c
> tqual.h
> multixact.c
> multixact.h
>
> Visibility hint bits must also not be set before the transaction is
> flushed, so other changes are required to ensure we store the LSN of
> each transaction, not just the maximum LSN. Changes to tqual.c appear
> extensive, though this is just refactoring to allow us to make
> additional function calls before setting bits - there are no functional
> changes to any HeapTupleSatisfies... functions.
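
A minimal sketch of why a per-transaction LSN is needed for hint bits. The function name and the dict shape are hypothetical, and I am assuming that a transaction absent from the deferred record had its commit flushed in the normal way.

```python
def may_set_hint_bit(xid, deferred_commit_lsn, wal_flushed_upto):
    """Return True if the commit hint bit for xid may be set now.

    deferred_commit_lsn: dict of xid -> commit LSN for commits whose flush
    was deferred (an assumed shape for the per-transaction record).
    """
    commit_lsn = deferred_commit_lsn.get(xid)
    if commit_lsn is None:
        # Assumption: not in the deferred record means already flushed.
        return True
    return commit_lsn <= wal_flushed_upto
```

With only the maximum LSN to go on, every deferred transaction would have to wait for the newest commit to be flushed; with one LSN per transaction, older commits become eligible as soon as the flush pointer passes them.
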
>
> xact.c
>
> Contains the module for the Deferred Transaction functions and in
> particular the deferred transaction cache. This could be a separate
> module, since there is only a slight link with the other xact.c code.
>
> User Interface
> --------------
>
> guc.c
> postgresql.conf.sample
> guc_tables.h
>
> New parameters have been added, with a new parameter grouping of
> WAL_COMMITS created to control the various commit parameters.
>
> Performance Tuning
> ------------------
>
> The WALWriter wakes up every wal_writer_delay milliseconds. There are two
> protections against mis-setting this parameter.
>
> pmsignal.h
>
> The WALWriter will also be woken by a signal if the DF cache has nearly
> filled and flushing would be desirable.
>
> The WALWriter will also loop without any delay if the number of
> transactions committed while it was writing WAL is above a threshold
> value.
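
The two protections can be sketched as a pair of small decisions. Everything named here is hypothetical: the message does not give the threshold value or say what counts as "nearly filled", so `busy_threshold` and `nearly_full_fraction` are placeholders.

```python
def next_sleep_ms(commits_during_write, busy_threshold, wal_writer_delay_ms):
    """How long the writer sleeps before its next cycle."""
    if commits_during_write > busy_threshold:
        return 0                      # busy: loop again without delay
    return wal_writer_delay_ms        # otherwise the normal timed wake-up


def should_signal_writer(entries_used, capacity, nearly_full_fraction=0.75):
    """True when a committing backend should wake the writer early."""
    return entries_used >= capacity * nearly_full_fraction
```

Either way, a wal_writer_delay that is set too high only costs latency until the cache fills or the commit rate picks up.
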
>
> Docs
> ----
> The fsync parameter has been removed from postgresql.conf.sample and the
> docs, though it still exists in this patch to allow performance testing
> during Beta. It is suggested that fsync=off should mean the same thing as
> transaction_guarantee = off, wal_writer_delay = 100ms, if it is
> specified in postgresql.conf or on the server command line.
>
> A new section in wal.sgml will describe this in more detail, later.
>
> Open Questions
> --------------
>
> 1. Should the DFC use a standard hash table? Custom code allows both
> additional speed and the ability to signal when it fills.
>
> 2. Should tqual.c update the LSN of a heap page with the LSN of the
> transaction commit that it can read from the DF cache?
>
> 3. Should the WALWriter also do the wal_buffers half-full write at the
> start of XLogInsert() ?
>
> 4. The recent changes to remove CheckpointStartLock haven't changed the
> code path for deferred transactions, so a similar solution might be
> possible there also.
>
> 5. Is it correct to do WAL-before-flush for clog only, or should this
> be multixact also?
>
> All of the above are fairly minor changes.
>
> Any other thoughts/comments/tests welcome.
>
> --
> Simon Riggs
> EnterpriseDB http://www.enterprisedb.com
>

[ Attachment, skipping... ]

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
