Re: shared-memory based stats collector

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tomas(dot)vondra(at)2ndquadrant(dot)com
Cc: andres(at)anarazel(dot)de, a(dot)zakirov(at)postgrespro(dot)ru, alvherre(at)2ndquadrant(dot)com, ah(at)cybertec(dot)at, magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2019-05-17 05:27:22
Message-ID: 20190517.142722.139901807.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Wed, 10 Apr 2019 11:13:27 +0200, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <20190410091327(dot)fpnvjbuu74dzxizl(at)development>
> While reviewing the patch I've always had issue with evaluating how it
> behaves for various scenarios / workloads. The reviews generally did one
> specific benchmark, but I find that unsatisfactory. I wonder whether if
> we could develop a small set of more comprehensive workloads for this
> patch (i.e. different numbers of objects, access patterns, ...).

Indeed. I'm having the same difficulty with the catcache pruning
work, but I might have found a clue for that.

I took performance numbers after some amendment and polishing of
the patch.

I expected operf to work, but it doesn't show meaningful
information with an -O2 binary. gprof slows the binary to about
one third of its speed. But just running pgbench gave me rather
stable numbers (unlike the catcache work..).

The numbers are tps for a 300 minutes run, and the ratio of
patched to master, in percent.

Cases [A-D]1 run only stats-updater clients.

     master-O2     patched       patched/master-O2 (%)
A1:  13431.603208  13457.968950  100.1963
B1:  72284.737474  72535.169437  100.3465
C1:     19.963315     20.037411  100.3712
D1:    193.027074    196.651603  101.8777
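
The last column appears to be the patched tps divided by the master tps, expressed as a percentage. A minimal Python sketch that recomputes it from the two tps columns above (the function and variable names are illustrative only and are not part of the attached scripts):

```python
# tps values copied from the [A-D]1 table: (master-O2, patched).
RESULTS = {
    "A1": (13431.603208, 13457.968950),
    "B1": (72284.737474, 72535.169437),
    "C1": (19.963315, 20.037411),
    "D1": (193.027074, 196.651603),
}


def ratio_percent(master_tps, patched_tps):
    """Return patched throughput as a percentage of master throughput."""
    return patched_tps / master_tps * 100.0


def ratios(results):
    """Map each case name to its patched/master ratio in percent."""
    return {case: ratio_percent(m, p) for case, (m, p) in results.items()}


if __name__ == "__main__":
    for case, pct in ratios(RESULTS).items():
        print(f"{case}: {pct:.4f}")
```

A value above 100 means the patched build was faster than master for that case.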

The [A-D]2 tests add a stats-reader client.

     master-O2                  patched                    patched/master-O2 (%)
     updater / reader           updater / reader           updater / reader
A2:  12929.418503 / 512.784200  13066.150297 / 584.686889  101.0575 / 114.0220
B2:  71673.804812 /  20.102687  71916.823242 /  22.109251  100.3391 / 109.9816
C2:     16.066719 / 485.788495     16.487942 / 577.930340  102.6217 / 118.9675
D2:    189.563306 /  36.252532    193.817075 /  44.661707  102.2440 / 123.1961

Case A1 is the simplest case: 1 client repeatedly updates the stats
of pgbench_accounts (at scale 1, but that doesn't matter).

Case B1 is A1 from 100 concurrent clients.

Case C1 is a massive(?) number of stats updates: concretely, select
sum() on a partitioned table with 1000 children, from 1 client.

Case D1 does C1 from 97 concurrent clients.

Cases A2-D2 run a single stats-referencing client while the A1-D1
workloads are running, respectively. (select sum(seq_scan) from pg_stat_user_tables)

Perhaps the numbers will get worse with many referencing clients,
but I think that's not realistic.

I'll run tests with many databases (~100?) and with expanded tabstat
entry cases.

The attached files are:

v19-0001-sequential-scan-for-dshash.patch:
v19-0002-Add-conditional-lock-feature-to-dshash.patch:
v19-0003-Make-archiver-process-an-auxiliary-process.patch:
v19-0005-Remove-the-GUC-stats_temp_directory.patch:

These are unchanged since v18 except for rebasing.

v19-0004-Shared-memory-based-stats-collector.patch:

Rebased. Fixed several bugs. Improved performance in some
cases. Made structs/code tidier. Added/rewrote comments.

run.sh   : main test script
gencr.pl : generator of the script that creates the partitioned table
           (perl gencr.pl | psql postgres to run)
tr.sql   : stats-updater client script used by run.sh
ref.sql  : stats-reader client script used by run.sh
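
For readers without the attachments, the following is a rough Python sketch of the kind of SQL a generator like gencr.pl could pipe into psql to build the 1000-child partitioned table used in cases C1 and D1. It is not the attached Perl script: the table name `p`, the column `a`, and the range-partitioning layout are all assumptions made for illustration.

```python
def partition_ddl(parent="p", n_children=1000):
    """Yield SQL statements that create a range-partitioned table.

    The parent name, the single int column "a", and the one-value-wide
    ranges are hypothetical; the real gencr.pl may differ.
    """
    yield f"CREATE TABLE {parent} (a int) PARTITION BY RANGE (a);"
    for i in range(n_children):
        # Each child covers the half-open range [i, i + 1).
        yield (
            f"CREATE TABLE {parent}_{i} PARTITION OF {parent} "
            f"FOR VALUES FROM ({i}) TO ({i + 1});"
        )


if __name__ == "__main__":
    for stmt in partition_ddl():
        print(stmt)
```

Running it as `python gen.py | psql postgres` (with a hypothetical file name gen.py) would create the tables; a query such as `select sum(a) from p` then touches every child, so each execution updates the stats of all 1000 partitions.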

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                 Content-Type  Size
v19-0001-sequential-scan-for-dshash.patch                  text/x-patch   10.6 KB
v19-0002-Add-conditional-lock-feature-to-dshash.patch      text/x-patch    5.0 KB
v19-0003-Make-archiver-process-an-auxiliary-process.patch  text/x-patch   11.9 KB
v19-0004-Shared-memory-based-stats-collector.patch         text/x-patch  211.9 KB
v19-0005-Remove-the-GUC-stats_temp_directory.patch         text/x-patch   10.0 KB
