From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication/backup defaults
Date: 2017-01-03 22:56:00
Message-ID: d34ce5b5-131f-66ce-f7c5-eb406dbe026f@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 12/31/2016 04:00 PM, Magnus Hagander wrote:
> Cycling back to this topic again, but this time at the beginning of a CF.
>
> Here's an actual patch to change:
>
>
> wal_level=replica
> max_wal_senders=10
> max_replication_slots=20
>
> Based on feedback from last year
> (https://www.postgresql.org/message-id/CABUevEwfV7zDutescm2PHGvsJdYA0RWHFMTRGhwrJPGgSbzZDQ%40mail.gmail.com):
>
>
> There were requests for benchmarks of performance difference. Tomas
> has promised to run a couple of benchmarks on his standard
> benchmarking setups to give numbers on that. Thanks Tomas, please
> pipe in with your results when you have them!
>

As promised, I'm running some benchmarks, and I have some early results
to report. And perhaps we can discuss whether we need to test some
additional workloads.

I'm 100% on board with the idea that we should switch to a wal_level that
allows taking backups or setting up a streaming replica, as long as it
does not cause a severe performance regression in common workloads.
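
(Just to illustrate what that buys us - with wal_level=replica and a
non-zero number of wal senders available out of the box, creating a base
backup or a standby should boil down to a single command, something like

    pg_basebackup -D /path/to/standby -X stream -R

assuming pg_hba.conf permits the replication connection. The exact flags
are just an example, of course.)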

It would be trivial to construct workloads demonstrating the optimizations
in wal_level=minimal (e.g. initial loads doing CREATE TABLE + COPY +
CREATE INDEX in a single transaction), but that would be mostly irrelevant
here, I guess.

Instead, I've decided to run the regular pgbench TPC-B-like workload at a
bunch of different scales, and measure throughput + some xlog stats for
each of the three wal_level options.
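
To give an idea what each run looks like, it's essentially just plain
pgbench, something along these lines (the actual scales, client counts
and durations are in the scripts linked below, so treat this only as a
sketch):

    pgbench -i -s 100 test
    pgbench -c 16 -j 8 -T 3600 test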

Note: I tweaked the code a bit to allow archiving even with "minimal" WAL
level, so that I can compute WAL stats on the archived segments (instead
of keeping all the segments in the data directory).
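
The stats are then computed from the archived segments, roughly like this
(the archive location and segment names are just placeholders - the actual
archive_command and paths are in the configs linked below):

    # archive_mode = on, archive_command = 'cp %p /mnt/archive/%f'
    pg_xlogdump --stats=record /mnt/archive/000000010000000000000001 \
                               /mnt/archive/000000010000000000000010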

As usual, I'm running it on two machines - a small old one (an i5-2500k
box with 4 cores and 8GB of RAM) and a new one (2x e5-2620v4 with 16/32
cores and 64GB of RAM). Both machines have SSD-based storage.

The clusters on both machines were reasonably tuned, see 'settings.log'
for each run. The tests are fairly long, covering multiple checkpoints
etc. In other words, the results should be fairly stable.

The scripts/results/stats/configs are available here:

* https://bitbucket.org/tvondra/wal-levels-e2620-v4/src
* https://bitbucket.org/tvondra/wal-levels-i5/src

So far I only have results for the smallest data sets (scale 50 on the i5
and 100 on the e5), which easily fit into shared_buffers in both cases.
The numbers look like this:

  (tps)           minimal   replica   standby
  --------------------------------------------
  i5-2500k           5884      5896      5873
  e5-2620v4         23968     24393     24259

So the performance penalty of the replica/standby WAL levels on this
workload is pretty much non-existent - on the larger machine those levels
are actually a tad faster than 'minimal', but the difference is within 2%
(so it might easily be noise).

I'll push results for the larger data sets once those tests complete
(possibly tomorrow).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
