Re: Replication/backup defaults

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication/backup defaults
Date: 2017-01-05 13:23:54
Message-ID: CABUevEwjKYAcvtbEkz2T_xRuSyQibs=8bcAHTXrRuMZEcgfkCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 5, 2017 at 12:44 AM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On 01/03/2017 11:56 PM, Tomas Vondra wrote:
>
>> Hi,
>>
>> ...
>
>> I'll push results for larger ones once those tests complete (possibly
>> tomorrow).
>>
>>
> I just pushed additional results (from the additional scales) to the git
> repositories. On the larger (16/32-cores) machine with 2x e5-2620, the
> results look like this
>
> scale minimal replica logical
> -----------------------------------------------------
> 100 23968 24393 24393
> 1000 23412 23656 23794
> 10000 5283 5320 5197
>
> and on the smaller one (i5-2500k with 4 cores) I got this:
>
> scale minimal replica logical
> -----------------------------------------------------
> 50 5884 5896 5873
> 400 5324 5342 5478
> 1000 5341 5439 5425
>
> The scales were chosen so that the smallest one fits into shared buffers,
> the medium exceeds shared buffers but still fits into RAM, and the largest
> scale exceeds RAM.
>
> The results seem to confirm that for this workload (regular pgbench),
> there's very little difference between the different WAL levels. Actually,
> the 'replica' seems a tad faster than 'minimal', but the difference may be
> easily due to noise.
>
> I've also looked at the amount of WAL actually produced, by doing pgbench
> runs throttled to the same throughput, and counting the number of archived
> WAL segments & running pg_xlogdump. Interestingly enough, those two metrics
> differ quite a bit - for example for scale 1000 (on the 32-core machine),
> the 2h runs produced these number of WAL segments:
>
> minimal: 5515 (88.2GB)
> replica: 5587 (89.4GB)
> logical: 6058 (96.9GB)
>
> so 'replica' adds ~1.3% and 'logical' ~9.8%. But per pg_xlogdump, the WAL
> amounts are only 73.3GB, 73.9GB and 74.4GB - a difference of only ~1.5%
> between minimal and logical. The values are also much lower than raw WAL
> size, so I assume it's because pg_xlogdump ignores some extra overhead,
> present in the segments. Moreover, the sequential nature of WAL writes
> means even the +10% is not a big deal (unless it results in saturating the
> bandwidth, but running on >90% is a bad idea anyway).
>

If you are using log archiving, it also means your log archive grows by 10%
(well, 8% assuming it was 9.8% on top of 0, not on top of replica).

>
> My conclusion from these results is that using 'wal_level=replica' by
> default seems fine. Perhaps even wal_level=logical would be OK, but that's
> probably a too big step for 10.0.
>

I think it sounds like 'replica' is the safe default.

If we can make it possible to go replica<->logical without a restart, that
makes it easy enough to increase it if necessary, and the default still
applies to most people (most people take backups, most people probably
don't do logical replication).

> Any ideas how to construct a plausible workload where the differences are
> significantly larger? Running the tests on non-SSD storage might also be
> useful.
>
>
It's easy enough to construct a benchmark specifically to show the
difference, but of any actual "normal workload" for it. Typically the
optimization applies to things like bulk loading, which typically never
done alone and does not lend itself to that type of benchmarking very
easily.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Pedersen 2017-01-05 13:29:10 Re: pageinspect: Hash index support
Previous Message Pavan Deolasee 2017-01-05 12:59:17 Re: rewrite HeapSatisfiesHOTAndKey