Re: Speedup twophase transactions

From: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Speedup twophase transactions
Date: 2017-01-23 11:26:41
Message-ID: BECC988A-DB74-48D5-B5D5-A54551A6242A@postgrespro.ru
Lists: pgsql-hackers


> On 27 Dec 2016, at 07:31, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>
> I think that it would be a good idea to actually test that in pure
> recovery time, aka no client, and just use a base backup and make it
> recover X prepared transactions that have created Y checkpoints after
> dropping cache (or restarting server).

I ran tests with the following setup:

* Start postgres initialised with pgbench
* Start pg_receivexlog
* Take a base backup
* Perform 1.5M transactions
* Stop everything and apply the WAL files stored by pg_receivexlog to the base backup.
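
For anyone who wants to repeat this, the steps above can be sketched as a command plan. This is only an illustration under stated assumptions, not the script actually used: the directory names, the "postgres" database name and the single-client pgbench run are placeholders, and the 2PC runs would additionally need a custom pgbench script that is not shown here. The helper functions are hypothetical; nothing is executed, the commands are only printed.

```python
import shlex


def recovery_test_plan(pgdata="/tmp/pgdata", backup_dir="/tmp/backup",
                       wal_dir="/tmp/wal_archive", transactions=1_500_000):
    """Return the benchmark steps as (description, argv) pairs.

    All paths and the database name are assumptions made for illustration.
    """
    return [
        ("start postgres", ["pg_ctl", "-D", pgdata, "start"]),
        ("initialise pgbench tables", ["pgbench", "-i", "postgres"]),
        ("stream WAL into a directory", ["pg_receivexlog", "-D", wal_dir]),
        ("take a base backup", ["pg_basebackup", "-D", backup_dir]),
        # Assumption: one client; pgbench -t is the per-client transaction count.
        ("run the workload", ["pgbench", "-t", str(transactions), "postgres"]),
        ("stop everything", ["pg_ctl", "-D", pgdata, "stop"]),
        ("replay WAL on the base backup", ["pg_ctl", "-D", backup_dir, "start"]),
    ]


def recovery_conf(wal_dir):
    """Contents of a recovery.conf placed in the base backup before the last step."""
    return f"restore_command = 'cp {wal_dir}/%f \"%p\"'\n"


if __name__ == "__main__":
    for description, argv in recovery_test_plan():
        print(f"# {description}")
        print(" ".join(shlex.quote(arg) for arg in argv))
    print(recovery_conf("/tmp/wal_archive"), end="")
```

pg_receivexlog has to keep running in the background while the workload runs, so in a real script it would be started as a separate process and stopped together with the server.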

All tests were performed on a laptop with an NVMe SSD.
number of transactions: 1.5M
start segment: 0x4

-master non-2pc:
last segment: 0x1b
average recovery time per 16 wal files: 11.8s
average total recovery time: 17.0s

-master 2pc:
last segment: 0x44
average recovery time per 16 wal files: 142s
average total recovery time: 568s

-patched 2pc (previous patch):
last segment: 0x44
average recovery time per 16 wal files: 5.3s
average total recovery time: 21.2s

-patched2 2pc (dlist_push_tail changed to dlist_push_head):
last segment: 0x44
average recovery time per 16 wal files: 5.2s
average total recovery time: 20.8s

So skipping unnecessary fsyncs gave us about a 27x speedup in 2PC recovery (568s down to 21.2s total) even on an SSD (on an HDD the difference should be larger).
Pushing to the list's head did not yield a measurable difference, but it still seems conceptually better.
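
The arithmetic behind these figures can be checked with a short script. It is a minimal sketch that uses only the numbers reported above; the one assumption is that the number of replayed segments is the last segment minus the start segment, which is what makes the per-16-file averages line up with the totals. The function names are made up for this example.

```python
START_SEGMENT = 0x4

# Figures copied from the results above: last segment, average seconds
# per 16 WAL files, average total recovery seconds.
RUNS = {
    "master non-2pc": (0x1B, 11.8, 17.0),
    "master 2pc": (0x44, 142.0, 568.0),
    "patched 2pc": (0x44, 5.3, 21.2),
    "patched2 2pc": (0x44, 5.2, 20.8),
}


def segments_replayed(last_segment, start_segment=START_SEGMENT):
    # Assumption: segment count is the plain difference of segment numbers.
    return last_segment - start_segment


def estimated_total(last_segment, seconds_per_16):
    """Total recovery time implied by the per-16-file average."""
    return seconds_per_16 * segments_replayed(last_segment) / 16


def speedup(baseline_seconds, candidate_seconds):
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    for name, (last, per_16, total) in RUNS.items():
        print(f"{name}: {segments_replayed(last)} segments, "
              f"estimated {estimated_total(last, per_16):.1f}s, reported {total}s")
    master_total = RUNS["master 2pc"][2]
    for name in ("patched 2pc", "patched2 2pc"):
        print(f"{name} vs master 2pc: {speedup(master_total, RUNS[name][2]):.1f}x")
```

The 2PC runs replay 64 segments against 23 for the non-2PC run, so the patched 2PC recovery handles almost three times as much WAL in roughly the same wall-clock time.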

PS:
I ran into a situation where pg_basebackup hangs until a checkpoint happens (automatic or user-issued).
Is that expected behaviour?

--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
twophase_recovery_list_2.diff application/octet-stream 20.1 KB
master-non2pc.svg image/svg+xml 131.1 KB
patched-2pc.svg image/svg+xml 223.8 KB
patched2-2pc.svg image/svg+xml 173.0 KB
master-2pc.svg image/svg+xml 130.9 KB
