Re: [HACKERS] WAL logging problem in 9.4.3?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: noah(at)leadboat(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, 9erthalion6(at)gmail(dot)com, andrew(dot)dunstan(at)2ndquadrant(dot)com, hlinnaka(at)iki(dot)fi, michael(at)paquier(dot)xyz
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?
Date: 2019-12-03 11:51:46
Message-ID: 20191203.205146.1521643852457054060.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Thu, 28 Nov 2019 17:23:19 -0500, Noah Misch <noah(at)leadboat(dot)com> wrote in
> On Thu, Nov 28, 2019 at 09:35:08PM +0900, Kyotaro Horiguchi wrote:
> > I measured the performance with the latest patch set.
> >
> > > 1. Determine $DDL_COUNT, a number of DDL transactions that take about one
> > > minute when done via syncs.
> > > 2. Start "pgbench -rP1 --progress-timestamp -T180 -c10 -j10".
> > > 3. Wait 10s.
> > > 4. Start one DDL backend that runs $DDL_COUNT transactions.
> > > 5. Save DDL start timestamp, DDL end timestamp, and pgbench output.
>
> If you have the raw data requested in (5), please share them here so folks
> have the option to reproduce your graphs and calculations.

Sorry, I forgot to attach the scripts. The raw data was vanished into
unstable connection and the steps was quite crude. I prioritized on
showing some numbers at the time. I revised the scripts into more
automated way and will take numbers again.

> > > 2. Start server with wal_level = replica (all other variables are not
> > > changed) then run the attached ./bench.sh
> >
> > The bench.sh attachment was missing; please attach it. Please give the output
> > of this command:
> >
> > select name, setting from pg_settings where setting <> boot_val;

(I intentionally show all the results..)
=# select name, setting from pg_settings where setting<> boot_val;
name | setting
----------------------------+--------------------
application_name | psql
archive_command | (disabled)
client_encoding | UTF8
data_directory_mode | 0700
default_text_search_config | pg_catalog.english
lc_collate | en_US.UTF-8
lc_ctype | en_US.UTF-8
lc_messages | en_US.UTF-8
lc_monetary | en_US.UTF-8
lc_numeric | en_US.UTF-8
lc_time | en_US.UTF-8
log_checkpoints | on
log_file_mode | 0600
log_timezone | Asia/Tokyo
max_stack_depth | 2048
max_wal_senders | 0
max_wal_size | 10240
server_encoding | UTF8
shared_buffers | 16384
TimeZone | Asia/Tokyo
unix_socket_permissions | 0777
wal_buffers | 512
wal_level | minimal
(23 rows)

The result for "replica" setting in the benchmark script are used as
base numbers (or the denominator of the percentages).

> > 3. Restart server with wal_level = replica then run the bench.sh
> > twice.
>
> I assume this is wal_level=minimal, not wal_level=replica.

Oops! It's wrong I ran once with replica, then twice with minimal.

Anyway, I revised the benchmarking scripts and attached them. The
parameters written in benchmain.sh were decided as ./bench2.pl 5
<count> <pages> s with wal_level=minimal server takes around 60
seconds.

I'll send the complete data tomorrow (in JST). The attached f.txt is
the result of preliminary test only with pages=100 and 250 (with HDD).

The attached files are:
benchmain.sh - main script
bench2.sh - run a benchmark with a single set of parameters
bench1.pl - behchmark client program
summarize.pl - script to summarize benchmain.sh's output
f.txt.gz - result only for pages=100, DDL count = 2200 (not 2250)

How to run:

$ /..unpatched_path../initdb -D <unpatched_datadir>
(wal_level=replica, max_wal_senders=0, log_checkpoints=yes, max_wal_size=10GB)
$ /..patched_path../initdb -D <patched_datadir>
(wal_level=minimal, max_wal_senders=0, log_checkpoints=yes, max_wal_size=10GB)
$./benchmain.sh > <result_file> # output raw data
$./summarize.pl [-v] < <result_file> # show summary

With the attached f.txt, summarize.pl gives the following output.
WAL wins with the that pages.

$ cat f.txt | ./summarize.pl
## params: wal_level=replica mode=none pages=100 count=353 scale=20
(% are relative to "before")
before: tps 262.3 (100.0%), lat 39.840 ms (100.0%) (29 samples)
during: tps 120.7 ( 46.0%), lat 112.508 ms (282.4%) (35 samples)
after: tps 106.3 ( 40.5%), lat 163.492 ms (410.4%) (86 samples)
DDL time: 34883 ms ( 100.0% relative to mode=none)
## params: wal_level=minimal mode=sync pages=100 count=353 scale=20
(% are relative to "before")
before: tps 226.3 (100.0%), lat 48.091 ms (100.0%) (29 samples)
during: tps 83.0 ( 36.7%), lat 184.942 ms (384.6%) (100 samples)
after: tps 82.6 ( 36.5%), lat 196.863 ms (409.4%) (21 samples)
DDL time: 99239 ms ( 284.5% relative to mode=none)
## params: wal_level=minimal mode=WAL pages=100 count=353 scale=20
(% are relative to "before")
before: tps 240.3 (100.0%), lat 44.686 ms (100.0%) (29 samples)
during: tps 129.6 ( 53.9%), lat 113.585 ms (254.2%) (31 samples)
after: tps 124.5 ( 51.8%), lat 141.992 ms (317.8%) (90 samples)
DDL time: 30392 ms ( 87.1% relative to mode=none)
## params: wal_level=replica mode=none pages=250 count=258 scale=20
(% are relative to "before")
before: tps 266.3 (100.0%), lat 45.884 ms (100.0%) (29 samples)
during: tps 87.9 ( 33.0%), lat 148.433 ms (323.5%) (54 samples)
after: tps 105.6 ( 39.6%), lat 153.216 ms (333.9%) (67 samples)
DDL time: 53176 ms ( 100.0% relative to mode=none)
## params: wal_level=minimal mode=sync pages=250 count=258 scale=20
(% are relative to "before")
before: tps 225.1 (100.0%), lat 47.705 ms (100.0%) (29 samples)
during: tps 93.7 ( 41.6%), lat 143.231 ms (300.2%) (83 samples)
after: tps 93.8 ( 41.7%), lat 186.097 ms (390.1%) (38 samples)
DDL time: 82104 ms ( 154.4% relative to mode=none)
## params: wal_level=minimal mode=WAL pages=250 count=258 scale=20
(% are relative to "before")
before: tps 230.2 (100.0%), lat 48.472 ms (100.0%) (29 samples)
during: tps 90.3 ( 39.2%), lat 183.365 ms (378.3%) (48 samples)
after: tps 123.9 ( 53.8%), lat 131.129 ms (270.5%) (73 samples)
DDL time: 47660 ms ( 89.6% relative to mode=none)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
f.txt.gz application/octet-stream 17.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2019-12-03 12:06:09 Allow relocatable extension to use @extschema@?
Previous Message Mahendra Singh 2019-12-03 11:51:23 Re: [HACKERS] Block level parallel vacuum