Re: Reviewers Guide to Deferred Transactions/Transaction Guarantee

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Reviewers Guide to Deferred Transactions/Transaction Guarantee
Date: 2007-04-27 00:50:17
Message-ID: 200704270050.l3R0oH316679@momjian.us
Lists: pgsql-hackers, pgsql-patches
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Simon Riggs wrote:
> transaction_guarantee.v11.patch 
> - keep current, cleanup, more comments and docs
> 
> Brief Performance Analysis
> --------------------------
> 
> I've tested 3 scenarios:
> 1. normal
> 2. wal_writer_delay = 100ms
> 3. wal_writer_delay = 100ms and transaction_guarantee = off
> 
> On my laptop, with a scale=1 pgbench database and 1 connection, I
> consistently get around 85 tps in mode (1), with a slight performance
> drop in mode (2). In mode (3) I get anywhere from 200 to 900 tps,
> depending upon how well cached everything is, with 700 tps being fairly
> typical. fsync = off gives around 900 tps.
> 
> Also good speedups with multiple session tests.
> 
> make installcheck passes in 120 sec in mode (3), though 155 sec in mode
> (1) and 158 sec in mode (2).
> 
> Basic Implementation
> --------------------
> 
> xact.c
> xact.h
> 
> The basic implementation simply records the LSN of the xlog commit
> record in a shared memory area, the deferred fsync cache. 
> 
> ipci.c
> 
> The cache is protected by an LWlock called DeferredFsyncLock.
> 
> lwlock.h
> 
> A WALWriter process wakes up regularly to perform a background flush of
> WAL up to the point of the highest LSN in the deferred fsync cache.
> 
> walwriter.c
> walwriter.h
> postmaster.c
> 
> WALWriter can be enabled only at server start.
> (All above same as March 11 version)
> 
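To make the description above concrete for reviewers, here is a minimal
sketch of a deferred fsync cache. It is not code from the patch: every
name (Lsn, DeferredFsyncCache, dfc_*) and the capacity are made up for
illustration, and the real cache lives in shared memory under
DeferredFsyncLock, which is only mentioned in comments here. What a
committer does when the cache is full is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; the patch's real structures may differ. */
typedef uint64_t Lsn;              /* stand-in for PostgreSQL's XLogRecPtr */

#define DFC_CAPACITY 8             /* illustrative size only */

typedef struct DeferredFsyncCache
{
    Lsn    commit_lsn[DFC_CAPACITY];   /* LSNs of not-yet-flushed commits */
    size_t used;
} DeferredFsyncCache;

/*
 * Record the LSN of a commit record.  Returns false when the cache is
 * full; this sketch assumes the caller would then flush WAL itself.
 * In the server this would run while holding DeferredFsyncLock.
 */
static bool
dfc_record_commit(DeferredFsyncCache *c, Lsn lsn)
{
    assert(c != NULL);
    if (c->used == DFC_CAPACITY)
        return false;
    c->commit_lsn[c->used++] = lsn;
    return true;
}

/* Highest LSN in the cache (0 if empty): the point a background flush must reach. */
static Lsn
dfc_flush_target(const DeferredFsyncCache *c)
{
    Lsn max = 0;

    for (size_t i = 0; i < c->used; i++)
        if (c->commit_lsn[i] > max)
            max = c->commit_lsn[i];
    return max;
}

/* Once WAL is flushed up to 'flushed', forget entries at or below it. */
static void
dfc_forget_flushed(DeferredFsyncCache *c, Lsn flushed)
{
    size_t kept = 0;

    for (size_t i = 0; i < c->used; i++)
        if (c->commit_lsn[i] > flushed)
            c->commit_lsn[kept++] = c->commit_lsn[i];
    c->used = kept;
}
```

A background writer built on this would call dfc_flush_target(), flush
WAL to that point, then call dfc_forget_flushed() with the flushed LSN.
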
> Correctness
> -----------
> 
> postgres.c
> 
> Only certain code paths can execute transaction_guarantee = off
> transactions, though the main code paths for OLTP allow it.
> 
> xlog.c 
> 
> CreateCheckpoint() must protect against starting a checkpoint when
> commits are not yet flushed, so an additional flush must occur here.
> 
> vacuum.c 
> 
> VACUUM FULL cannot move tuples until their states are all known, so this
> command triggers a background flush also.
> 
> clog.c
> clog.h
> slru.c
> slru.h
> 
> Changes to Clog and SLRU enforce the basic rule of WAL-before-data.
> Without them, the clog record of a commit might reach disk before the
> WAL for that commit has been flushed. This is implemented by storing an
> LSN for each clog page.
> 
> transam.c
> transam.h
> twophase.c
> xact.c
> 
> The above files have API changes that allow the LSN at transaction
> commit to be passed through to the Clog.
> 
> tqual.c
> tqual.h
> multixact.c
> multixact.h
> 
> Visibility hint bits must also not be set before the transaction is
> flushed, so other changes are required to ensure we store the LSN of
> each transaction, not just the maximum LSN. Changes to tqual.c appear
> extensive, though this is just refactoring to allow us to make
> additional function calls before setting bits - there are no functional
> changes to any HeapTupleSatisfies... functions.
> 
> xact.c
> 
> Contains the module for the Deferred Transaction functions and in
> particular the deferred transaction cache. This could be a separate
> module, since there is only a slight link with the other xact.c code. 
> 
> User Interface
> --------------
> 
> guc.c
> postgresql.conf.sample
> guc_table.h
> 
> New parameters have been added, with a new parameter grouping of
> WAL_COMMITS created to control the various commit parameters.
> 
> Performance Tuning
> ------------------
> 
> The WALWriter wakes up every wal_writer_delay milliseconds. There are two
> protections against mis-setting this parameter.
> 
> pmsignal.h
> 
> The WALWriter will also be woken by a signal if the DF cache has nearly
> filled and flushing would be desirable.
> 
> The WALWriter will also loop without any delay if the number of
> transactions committed while it was writing WAL is above a threshold
> value.
> 
> Docs
> ----
> The fsync parameter has been removed from postgresql.conf.sample and the
> docs, though it still exists in this patch to allow performance testing
> during Beta. It is suggested that fsync=off should mean the same thing as
> transaction_guarantee = off, wal_writer_delay = 100ms, if it is
> specified in postgresql.conf or on the server command line.
> 
> A new section in wal.sgml will describe this in more detail, later.
> 
> Open Questions
> --------------
> 
> 1. Should the DFC use a standard hash table? Custom code allows both
> additional speed and the ability to signal when it fills.
> 
> 2. Should tqual.c update the LSN of a heap page with the LSN of the
> transaction commit that it can read from the DF cache?
> 
> 3. Should the WALWriter also do the wal_buffers half-full write at the
> start of XLogInsert() ?
> 
> 4. The recent changes to remove CheckpointStartLock haven't changed the
> code path for deferred transactions, so a similar solution might be
> possible there also.
> 
> 5. Is it correct to do WAL-before-flush for clog only, or should this
> be multixact also?
> 
> All of the above are fairly minor changes.
> 
> Any other thoughts/comments/tests welcome.
> 
> -- 
>   Simon Riggs             
>   EnterpriseDB   http://www.enterprisedb.com
> 

[ Attachment, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>        message can get through to the mailing list cleanly

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
