Proposed patch: Smooth replication during VACUUM FULL

From: Gabriele Bartolini <gabriele(dot)bartolini(at)2ndQuadrant(dot)it>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposed patch: Smooth replication during VACUUM FULL
Date: 2011-04-30 18:19:36
Message-ID: 4DBC52B8.6080202@2ndQuadrant.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi guys,

I have noticed that during VACUUM FULL on reasonably big tables,
replication lag climbs. In order to smooth down the replication lag, I
propose the attached patch which enables vacuum delay for VACUUM FULL.

Please find attached the patch and below more information on this
specific issue.

Cheers,
Gabriele

== Scenario

I have setup a simple SyncRep scenario with one master and one standby
on the same server.
On the master I have setup vacuum_cost_delay = 10 milliseconds.

I have created a scale 50 pgbench database, which produces a 640MB
pgbench_accounts table (about 82k pages). I have then launched a 60
seconds pgbench activity with 4 concurrent clients with the goal to make
some changes to the pgbench table (approximately 1800 changes on my laptop).

== Problem observed

Replication lag climbs during VACUUM FULL.

== Proposed change

Enable vacuum delay for VACUUM FULL (and CLUSTER).

== Test

I have then launched a VACUUM FULL operation on the pgbench_accounts
table and measured the lag in bytes every 5 seconds, by calculating the
difference between the current location and the sent location.

Here is a table with lag values. The first column (sec) is the sampling
time (every 5 seconds for the sake of simplicity here), the second
column (mlag) is the master lag on the current HEAD instance, the third
column (mlagpatch) is the lag measured on the patched Postgres instance.

sec | mlag | mlagpatch
-----+-----------+-----------
0 | 1896424 | 0
5 | 15654912 | 4055040
10 | 8019968 | 13893632
15 | 16850944 | 4177920
20 | 10969088 | 21102592
25 | 11468800 | 2277376
30 | 7995392 | 13893632
35 | 14811136 | 20660224
40 | 6127616 | 0
45 | 6914048 | 5136384
50 | 5996544 | 13500416
55 | 14155776 | 9043968
60 | 23298048 | 11722752
65 | 15400960 | 18202624
70 | 17858560 | 28049408
75 | 8560640 | 34865152
80 | 19628032 | 33161216
85 | 25526272 | 39976960
90 | 23183360 | 23683072
95 | 23265280 | 303104
100 | 24346624 | 3710976
105 | 24813568 | 0
110 | 32587776 | 7651328
115 | 42827776 | 12369920
120 | 50167808 | 14991360
125 | 60260352 | 3850240
130 | 62750720 | 5160960
135 | 68255744 | 9355264
140 | 60653568 | 14336000
145 | 68780032 | 16564224
150 | 74342400 | 5398528
155 | 84639744 | 11321344
160 | 92741632 | 16302080
165 | 70123520 | 20234240
170 | 13606912 | 23248896
175 | 20586496 | 29278208
180 | 16482304 | 1900544
185 | 0 | 0

As you can see, replication lag on HEAD's PostgreSQL reaches 92MB (160
seconds) before starting to decrease (when the operation terminates).
The test result is consistent with the expected behaviour of cost-based
vacuum delay.

--
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gabriele(dot)bartolini(at)2ndQuadrant(dot)it | www.2ndQuadrant.it

Attachment Content-Type Size
vacuum_full_delay.patch text/plain 416 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-04-30 18:22:38 CLUSTER vs toast vacuuming: there's still a problem
Previous Message Tom Lane 2011-04-30 17:39:37 Re: Re: [COMMITTERS] pgsql: Update docs to say you need fsync to make sync rep work fast.