Re: Analysis of ganged WAL writes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Curtis Faith <curtis(at)galtair(dot)com>, Pgsql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Analysis of ganged WAL writes
Date: 2002-10-07 17:42:53
Message-ID: 24856.1034012573@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> That says that the best possible throughput on this test scenario is 5
> transactions per disk rotation --- the CPU is just not capable of doing
> more. I am actually getting about 4 xact/rotation for 10 or more
> clients (in fact it seems to reach that plateau at 8 clients, and be
> close to it at 7).

After further thought I understand why it takes 8 clients to reach full
throughput in this scenario. Assume that we have enough CPU oomph so
that we can process four transactions, but not five, in the time needed
for one revolution of the WAL disk. If we have five active clients
then the behavior will be like this:

1. Backend A becomes ready to commit. It locks WALWriteLock and issues
a write/flush that will only cover its own commit record. Assume that
it has to wait one full disk revolution for the write to complete (this
will be the steady-state situation).

2. While A is waiting, there is enough time for B, C, D, and E to run
their transactions and become ready to commit. All eventually block on
WALWriteLock.

3. When A finishes its write and releases WALWriteLock, B will acquire
the lock and initiate a write that (with my patch) will cover C, D, and
E's commit records as well as its own.

4. While B is waiting for the disk to spin, A receives a new transaction
from its client, processes it, and becomes ready to commit. It blocks
on WALWriteLock.

5. When B releases the lock, C, D, E acquire it and quickly fall
through, seeing that they need do no work. Then A acquires the lock.
GOTO step 1.

So with five active threads, we alternate between committing one
transaction and four transactions on odd and even disk revolutions.

It's pretty easy to see that with six or seven active threads, we
will alternate between committing two or three transactions and
committing four. Only when we get to eight threads do we have enough
backends to ensure that four transactions are available to commit on
every disk revolution. This must be so because the backends that are
released at the end of any given disk revolution will not be able to
participate in the next group commit, if there is already at least
one backend ready to commit.

So this solution isn't perfect; it would still be nice to have a way to
delay initiation of the WAL write until "just before" the disk is ready
to accept it. I dunno any good way to do that, though.

I went ahead and committed the patch for 7.3, since it's simple and does
offer some performance improvement. But maybe we can think of something
better later on...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2002-10-07 17:48:00 Re: [HACKERS] Hot Backup
Previous Message Jan Wieck 2002-10-07 17:42:03 Re: Parallel Executors [was RE: Threaded Sorting]