Re: Reworking WAL locking

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Paul van den Bogaard <Paul(dot)Vandenbogaard(at)Sun(dot)COM>
Subject: Re: Reworking WAL locking
Date: 2008-03-23 00:05:16
Message-ID: 200803230005.m2N05Gq26940@momjian.us
Lists: pgsql-hackers


Added to TODO:

* Improve WAL concurrency by increasing lock granularity

http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php

---------------------------------------------------------------------------

Simon Riggs wrote:
>
> Paul van den Bogaard (Sun) suggested to me that we could use more than
> two WAL locks to improve concurrency. I think it's possible to introduce
> such a scheme with some ease. All modifications would be within xlog.c.
>
> The scheme below requires an extra LWLock per WAL buffer.
>
> Locking within XLogInsert() would look like this:
>
> Calculate length of data to be inserted.
> Calculate initial CRC
>
> LWLockAcquire(WALInsertLock, LW_EXCLUSIVE)
>
> Reserve space to write into.
> LSN = current Insert pointer
> Move pointer forward by length of data to be inserted, acquiring
> WALWriteLock if required to ensure space is available.
>
> LWLockAcquire(LSNGetWALPageLockId(LSN), LW_SHARED);
>
> Note that we don't lock every page, just the first one of the set we
> want, but we hold it until all page writes are complete.
>
> LWLockRelease(WALInsertLock);
>
> finish calculating CRC
> write xlog into reserved space
>
> LWLockRelease(LSNGetWALPageLockId(LSN));
>
> XLogWrite() will then try to get a conditional LW_EXCLUSIVE lock
> sequentially on each page it plans to write. It keeps going until it
> fails to get the lock, then writes. Callers of XLogWrite will never be
> able to pass a backend currently performing the wal buffer fill.
>
> We write a whole page at a time.
>
> Next time, we do a regular lock wait on the same page, so that we always
> get a page eventually.
>
> This requires us to get 2 locks for an XLogInsert rather than just one.
> However the second lock is always acquired with zero-wait time when the
> wal_buffers are sensibly sized. Overall this should reduce wait time for
> the WALInsertLock, since it seems likely that each actual filling of WAL
> buffers will affect different cache lines and can very likely be
> performed in parallel.
>
> Sounds good to me.
>
> Any objections/comments before this can be tried out?
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
