Skip site navigation (1) Skip section navigation (2)

Re: Separating bgwriter and checkpointer

From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Separating bgwriter and checkpointer
Date: 2011-10-05 04:10:09
Message-ID: CAHHcrepQNCN3HoXSWAOtsj3HJKp5NoepB2ky3dFgVA4JCXh-Qw@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
2011/10/4 Simon Riggs <simon(at)2ndquadrant(dot)com>:
> The problem is the *same* one I fixed in v2, yet now I see I managed
> to somehow exclude that fix from the earlier patch. Slap. Anyway,
> fixed again now.

Ah ok! I started reviewing the v4 patch version, this is my comments:

Submission review
===============

1. The patch applies cleanly to current master (fa56a0c3e) but isn't
in context diff format;

Feature test
==========

1. Since I patched and make installed it, I can see the expected
processes: writer and checkpointer;

2. I did the following tests with the following results:

2.1 Running a long time pgbench didn't emit any assertion failure or
crash and the checkpoint works as before patch:

  LOG:  checkpoint starting: xlog
  LOG:  checkpoint complete: wrote 300 buffers (9.8%); 0 transaction
log file(s) added, 0 removed, 0 recycled; write=26.103 s, sync=6.492
s, total=34.000 s; sync files=13, longest=4.684 s, average=0.499 s
  LOG:  checkpoint starting: time
  LOG:  checkpoint complete: wrote 257 buffers (8.4%); 0 transaction
log file(s) added, 0 removed, 3 recycled; write=21.819 s, sync=9.610
s, total=32.076 s; sync files=7, longest=6.452 s, average=1.372 s

2.2 Forcing a checkpoint when filesystem has enough free space works
fine while pgbench is running:

  LOG:  checkpoint starting: immediate force wait
  LOG:  checkpoint complete: wrote 1605 buffers (52.2%); 0 transaction
log file(s) added, 0 removed, 2 recycled; write=0.344 s, sync=22.750
s, total=23.700 s; sync files=10, longest=15.586 s, average=2.275 s

2.3 Forcing a checkpoint when filesystem are full, works as expected:

  LOG:  checkpoint starting: immediate force wait time
  ERROR:  could not write to file "pg_xlog/xlogtemp.4380": Não há
espaço disponível no dispositivo
  ERROR:  checkpoint request failed
  HINT:  Consult recent messages in the server log for details.
  STATEMENT:  CHECKPOINT ;
  ...
  ERROR:  could not extend file "base/16384/16405": wrote only 4096 of
8192 bytes at block 10
  HINT:  Check free disk space.
  STATEMENT:  INSERT INTO pgbench_history (tid, bid, aid, delta,
mtime) VALUES (69, 3, 609672, -3063,  CURRENT_TIMESTAMP);
  PANIC:  could not write to file "pg_xlog/xlogtemp.4528": Não há
espaço disponível no dispositivo
  STATEMENT:  END;
  LOG:  server process (PID 4528) was terminated by signal 6: Aborted

Then I freed some space and started it again and the server ran properly:

   LOG:  database system was shut down at 2011-10-05 00:46:33 BRT
   LOG:  database system is ready to accept connections
   LOG:  autovacuum launcher started
   ...
   LOG:  checkpoint starting: immediate force wait
   LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction
log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s,
total=0.181 s; sync files=0, longest=0.000 s, average=0.000 s

2.2 Running a pgbench and interrupting postmaster a few seconds later,
seems to work as expected, returning the output:

... cut ...
LOG:  statement: SELECT abalance FROM pgbench_accounts WHERE aid = 148253;
^CLOG:  statement: UPDATE pgbench_tellers SET tbalance = tbalance +
934 WHERE tid = 85;
DEBUG:  postmaster received signal 2
LOG:  received fast shutdown request
LOG:  aborting any active transactions
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
... cut ...
LOG:  disconnection: session time: 0:00:14.917 user=guedes
database=bench host=[local]
LOG:  disconnection: session time: 0:00:14.773 user=guedes
database=bench host=[local]
DEBUG:  server process (PID 1910) exited with exit code 1
DEBUG:  server process (PID 1941) exited with exit code 1
LOG:  shutting down
LOG:  checkpoint starting: shutdown immediate
DEBUG:  SlruScanDirectory invoking callback on pg_multixact/offsets/0000
DEBUG:  SlruScanDirectory invoking callback on pg_multixact/members/0000
DEBUG:  checkpoint sync: number=1 file=base/16384/16398 time=1075.227 msec
DEBUG:  checkpoint sync: number=2 file=base/16384/16406 time=16.832 msec
DEBUG:  checkpoint sync: number=3 file=base/16384/16397 time=12306.204 msec
DEBUG:  checkpoint sync: number=4 file=base/16384/16397.1 time=2122.141 msec
DEBUG:  checkpoint sync: number=5 file=base/16384/16406_fsm time=32.278 msec
DEBUG:  checkpoint sync: number=6 file=base/16384/16385_fsm time=11.248 msec
DEBUG:  checkpoint sync: number=7 file=base/16384/16388 time=11.083 msec
DEBUG:  checkpoint sync: number=8 file=base/16384/11712 time=11.314 msec
DEBUG:  checkpoint sync: number=9 file=base/16384/16397_vm time=11.103 msec
DEBUG:  checkpoint sync: number=10 file=base/16384/16385 time=19.308 msec
DEBUG:  attempting to remove WAL segments older than log file
000000010000000000000000
DEBUG:  SlruScanDirectory invoking callback on pg_subtrans/0000
LOG:  checkpoint complete: wrote 1659 buffers (54.0%); 0 transaction
log file(s) added, 0 removed, 0 recycled; write=0.025 s, sync=15.617
s, total=15.898 s; sync files=10, longest=12.306 s, average=1.561 s
LOG:  database system is shut down

Then I started the server again and it ran properly.


Well, all the tests was running with the default postgresql.conf in my
laptop but I'll setup a more "real world" environment to test for
performance regression. Until now I couldn't notice any significant
difference in TPS before and after patch in a small environment. I'll
post something soon.


Best regards,
-- 
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br

In response to

Responses

pgsql-hackers by date

Next:From: davegDate: 2011-10-05 04:12:22
Subject: timezone buglet?
Previous:From: Greg StarkDate: 2011-10-05 02:55:28
Subject: Re: Inlining comparators as a performance optimisation

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group