Re: Group Commit

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Group Commit
Date: 2007-05-17 17:21:19
Message-ID: 200705171721.l4HHLJY25455@momjian.us
Lists: pgsql-hackers


This is not ready for 8.3.

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> It's been known for years that commit_delay isn't very good at giving us
> group commit behavior. I did some experiments with this simple test
> case: "BEGIN; INSERT INTO test VALUES (1); COMMIT;", with different
> numbers of concurrent clients and with and without commit_delay.
>
> Summary for the impatient:
> 1. Current behavior sucks.
> 2. commit_delay doesn't help with # of clients < ~10. It does help with
> higher numbers, but it still sucks.
> 3. I'm working on a patch.
>
>
> I added logging to show how many commit records are flushed on each
> fsync. The output with otherwise unpatched PG head looks like this, with
> 5 clients:
>
> LOG: Flushed 4 out of 5 commits
> LOG: Flushed 1 out of 5 commits
> LOG: Flushed 4 out of 5 commits
> LOG: Flushed 1 out of 5 commits
> LOG: Flushed 4 out of 5 commits
> LOG: Flushed 1 out of 5 commits
> LOG: Flushed 4 out of 5 commits
> LOG: Flushed 1 out of 5 commits
> LOG: Flushed 3 out of 5 commits
> LOG: Flushed 2 out of 5 commits
> LOG: Flushed 3 out of 5 commits
> LOG: Flushed 2 out of 5 commits
> LOG: Flushed 3 out of 5 commits
> LOG: Flushed 2 out of 5 commits
> LOG: Flushed 3 out of 5 commits
> ...
>
> Here's what's happening:
>
> 1. Client 1 issues fsync (A)
> 2. Clients 2-5 write their commit records, and try to fsync, but they
> have to wait for fsync (A) to finish.
> 3. fsync (A) finishes, freeing client 1.
> 4. One of clients 2-5 starts the next fsync (B), which will flush
> commits of clients 2-5 to disk
> 5. Client 1 begins new transaction, inserts commit record and tries to
> fsync. Needs to wait for previous fsync (B) to finish
> 6. fsync (B) finishes, freeing clients 2-5
> 7. Client 1 issues fsync (C)
> 8. ...
>
> The 2-3-2-3 pattern can be explained by a similar unfortunate
> "resonance", but with two clients in place of client 1 in the above,
> possibly running on separate cores (the test was run on a dual-core laptop).
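
The alternation described above can be reproduced with a toy model. This is
only a sketch, not PostgreSQL code. It assumes that only one fsync runs at a
time, that an fsync flushes exactly the commit records written before it
started, and that a freed client writes its next commit record before the
following fsync finishes. The function name and parameters are made up for
illustration.

```python
def simulate_flush_pattern(n_clients, lead_clients, n_fsyncs):
    """Return how many commit records each successive fsync flushes.

    lead_clients is the number of clients whose records are covered by
    the very first fsync (1 in the walkthrough above). This is a toy
    model under the assumptions stated in the text, not a measurement.
    """
    in_flight = lead_clients            # records covered by the running fsync
    waiting = n_clients - lead_clients  # records written while it runs
    flushed = []
    for _ in range(n_fsyncs):
        flushed.append(in_flight)
        # The fsync finishes: the waiting records go into the next fsync,
        # while the clients just freed commit again and become the waiters.
        in_flight, waiting = waiting, in_flight
    return flushed


if __name__ == "__main__":
    print(simulate_flush_pattern(5, 1, 4))
    print(simulate_flush_pattern(5, 2, 4))
```

With 5 clients and one lead client the model alternates between 1 and 4
flushed commits; with two lead clients it alternates between 2 and 3, which
are the two patterns seen in the log.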
>
> I also drew a diagram illustrating the above, attached.
>
> I wrote a quick & dirty patch for this that I'm going to refine further,
> but wanted to get the results out for others to look at first. I'm not
> posting the patch yet, but it basically adds some synchronization to the
> WAL flushes. It introduces a counter of inserted but not yet flushed
> commit records. Instead of the commit_delay, the counter is checked. If
> it's smaller than NBackends, the process waits until count reaches
> NBackends, or a timeout expires. There are two significant differences from
> commit_delay here:
> 1. Instead of waiting for commit_delay to expire, processes are woken
> and fsync is started immediately when we know there's no more commit
> records coming that we should wait for. Even though commit_delay is
> given in microseconds, the real granularity of the wait can be as high
> as 10 ms, which is in the same ball park as the fsync itself.
> 2. commit_delay is not used when there are fewer than commit_siblings
> non-idle backends in the system. With very short transactions, it's
> worthwhile to wait even if that's the case, because a client can begin
> and finish a transaction in much shorter time than it takes to fsync.
> This is why commit_delay doesn't work at all in my test case
> with 2 clients.
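
Since the patch itself was not posted, the following is only a guess at the
shape of the synchronization described above, written in Python rather than
C. It assumes a shared counter of inserted but unflushed commit records, a
wait that ends as soon as the counter reaches the number of backends, and a
timeout as the fallback. The class GroupCommitGate, its attributes, and the
list append standing in for fsync are all hypothetical.

```python
import threading


class GroupCommitGate:
    """Toy model: wait for n_backends commit records or a timeout, then flush."""

    def __init__(self, n_backends, timeout_s):
        self.n_backends = n_backends
        self.timeout_s = timeout_s
        self.cond = threading.Condition()
        self.unflushed = 0      # commit records inserted but not yet flushed
        self.generation = 0     # incremented once per flush
        self.flush_sizes = []   # records covered by each simulated fsync

    def _flush(self):
        # Stand-in for the real fsync; the caller must hold self.cond.
        self.flush_sizes.append(self.unflushed)
        self.unflushed = 0
        self.generation += 1
        self.cond.notify_all()  # wake every backend whose record is now flushed

    def commit(self):
        with self.cond:
            self.unflushed += 1
            my_generation = self.generation
            if self.unflushed >= self.n_backends:
                # No more commit records can be coming: flush immediately.
                self._flush()
                return
            # Otherwise wait until another backend flushes our record,
            # or give up after the timeout and flush on our own.
            flushed = self.cond.wait_for(
                lambda: self.generation != my_generation,
                timeout=self.timeout_s,
            )
            if not flushed:
                self._flush()
```

A real implementation would not hold a lock across the fsync, and would need
to count only backends that can actually commit soon. The sketch only shows
the wake-up rule: the last expected committer triggers the flush at once
instead of everyone sleeping for a fixed delay.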
>
> Here's a spreadsheet with the results of the tests I ran:
> http://community.enterprisedb.com/groupcommit-comparison.ods
>
> It contains a graph that shows that the patch works very well for this
> test case. It's not very good for real life as it is, though. An obvious
> flaw is that if you have a longer-running transaction, effect 1 above goes
> away. Instead of waiting for NBackends commit records, we should try to
> guess the number of transactions that are likely to finish in a
> reasonably short time. I'm thinking of keeping a running average of
> commits per second, or # of transactions that finish while an fsync is
> taking place.
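
One way the running-average idea could look is sketched below. This is an
assumption-laden illustration, not anything from the patch: it uses an
exponential moving average (one possible kind of running average) of the
number of commits that arrived during each fsync, and uses the rounded
average, capped at the number of backends, as the count to wait for. The
class name, the alpha parameter, and the method names are invented here.

```python
class CommitRateEstimator:
    """Toy estimator of how many commits tend to arrive during one fsync."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # weight given to the newest observation
        self.avg = None     # no observations yet

    def observe(self, commits_during_fsync):
        # Exponential moving average; the first sample seeds the average.
        if self.avg is None:
            self.avg = float(commits_during_fsync)
        else:
            self.avg += self.alpha * (commits_during_fsync - self.avg)

    def target(self, n_backends):
        # Number of commit records to wait for before flushing:
        # at least 1, and never more than the number of backends.
        if self.avg is None:
            return 1
        return max(1, min(n_backends, round(self.avg)))
```

A long-running transaction would then stop inflating the wait target once the
observed number of commits per fsync drops, at the cost of reacting with some
lag that depends on alpha.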
>
> Any thoughts?
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com


--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

  • Group Commit at 2007-03-26 10:39:16 from Heikki Linnakangas
