Re: Questionaire: Common WAL write rates on busy servers.

From: Jerry Sievers <gsievers19(at)comcast(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Questionaire: Common WAL write rates on busy servers.
Date: 2017-04-25 18:56:28
Message-ID: 87inlsco4j.fsf@jsievers.enova.com
Lists: pgsql-general pgsql-performance

Andres Freund <andres(at)anarazel(dot)de> writes:

> Hi,
>
> I've lately seen more and more installations where the generation of
> write-ahead-log (WAL) is one of the primary bottlenecks. I'm curious
> whether that's primarily a "sampling error" of mine, or whether that's
> indeed more common.
>
> The primary reason I'm curious is that I'm pondering a few potential
> optimizations, and would like to have some guidance which are more and
> which are less important.

I have a busy warehouse spitting out about 400k segments/week
(400,000 x 16MB segments is roughly 6.4TB, i.e. ~10MB/second sustained) :-)

We have resorted to a rather complex batch/parallel compressor/shipper
to keep up with the volume.
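Something along these lines, purely for illustration -- our real tool is
considerably hairier, and the archive directory, rsync target and worker
count below are made up for the example:

  import glob, gzip, os, shutil, subprocess
  from concurrent.futures import ThreadPoolExecutor

  ARCHIVE_DIR = "/pgdata/wal_archive"        # where archive_command drops segments (assumption)
  REMOTE = "backup-host:/srv/wal_archive/"   # rsync target (assumption)

  def compress_and_ship(path):
      # gzip the 16MB segment, push it to the backup host, then clean up locally
      gz = path + ".gz"
      with open(path, "rb") as src, gzip.open(gz, "wb") as dst:
          shutil.copyfileobj(src, dst)
      subprocess.check_call(["rsync", "-a", gz, REMOTE])
      os.remove(gz)
      os.remove(path)

  segments = [p for p in sorted(glob.glob(os.path.join(ARCHIVE_DIR, "*")))
              if not p.endswith(".gz")]
  # compress/ship several segments at once to keep up with the generation rate
  with ThreadPoolExecutor(max_workers=8) as pool:
      list(pool.map(compress_and_ship, segments))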

>
> Questions (answer as many you can comfortably answer):
> - How many MB/s, segments/s do you see on busier servers?

Our busiest system averages 10MB/second but is very bursty. Assume it's
many times that during high-churn periods.
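
For anyone wanting to measure this on their own 9.x master, a rough
sketch (placeholder connection string, pre-10 function names):

  import time
  import psycopg2

  conn = psycopg2.connect("dbname=postgres")   # placeholder connection
  cur = conn.cursor()

  cur.execute("SELECT pg_current_xlog_location()")
  start = cur.fetchone()[0]
  time.sleep(60)                               # sample over one minute
  cur.execute("SELECT pg_xlog_location_diff(pg_current_xlog_location(), %s)",
              (start,))
  wal_bytes = cur.fetchone()[0]
  print("WAL rate: %.1f MB/s" % (float(wal_bytes) / 60 / 1024 / 1024))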

> - What generates the bulk of WAL on your servers (9.5+ can use
> pg_xlogdump --stats to compute that)?

Simply warehouse incremental loading and/or full table delete/truncate
and reload, plus derived data being created. Many of the transient
tables are on NVMe and unlogged.
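
If it's useful, a minimal sketch of driving pg_xlogdump --stats over a
range of segments; the pg_xlog path below is an assumption, and on our
boxes the bulk of the volume would show up under the heap resource
managers:

  import os, subprocess

  WAL_DIR = "/pgdata/9.5/main/pg_xlog"         # assumption
  # WAL segment file names are 24 hex characters
  segs = sorted(f for f in os.listdir(WAL_DIR)
                if len(f) == 24 and all(c in "0123456789ABCDEF" for c in f))
  # summarize record counts and volume per resource manager over the range
  subprocess.check_call(["pg_xlogdump", "-p", WAL_DIR, "--stats",
                         segs[0], segs[-1]])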

> - Are you seeing WAL writes being a bottleneck?
> - What kind of backup methods are you using and is the WAL volume a
> problem?

I do not know whether basic local WAL writing itself is a problem or
not, but as mentioned, we are scarcely able to keep up with the
archiving needed to make backups and PITR possible.

> - What kind of replication are you using and is the WAL volume a
> problem?

There are 2 streamers, both feeding directly from the master. We use a
fairly large wal_keep_segments of 30k (roughly 480GB of retained WAL) to
help avoid the streamers falling behind and then having to resort to
remote archive fetching.

It does appear that since streaming WAL reception and application, as
well as of course remote fetching, are single-threaded, this is a
bottleneck as well. That is, a totally unloaded and well-outfitted
(hardware-wise) streamer can barely keep up with the master.

> - What are your settings for wal_compression, max_wal_size (9.5+) /
> checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?

checkpoint_timeout | 5min
max_wal_size | 4GB
wal_buffers | 16MB
wal_compression | off

> - Could you quickly describe your workload?

Warehouse, with user self-service report creation/storage allowed in
the same system.

>
> Feel free to add any information you think is pertinent ;)

Great idea!! Thanks

>
> Greetings,
>
> Andres Freund

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres(dot)consulting(at)comcast(dot)net
p: 312.241.7800
