Re: shared-memory based stats collector

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2018-07-05 11:33:52
Message-ID: 20180705.203352.69805633.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Thu, 05 Jul 2018 12:04:23 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20180705(dot)120423(dot)49626073(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> UNDO logs seems a bit promising. If we looking stats in a long
> transaction, the required memory for UNDO information easily
> reaches to the same amount with the whole-image snapshot. But I
> expect that it is not so common.
>
> I'll consider that apart from the current patch.

Done as a PoC. (Sorry for the format since filterdiff genearates
a garbage from the patch..)

The attached v3-0008 is that. PoC of UNDO logging of server
stats. It records undo logs only for table stats if any
transaction started acess to stats data. So the logging is rarely
performed. The undo logs are used at the first acess to each
relation's stats then cached. autovacuum and vacuum doesn't
request undoing since they just don't need it.

# v3-0007 is a trivial fix for v3-0003, which will be merged.

I see several arguable points on this feature.

- The undo logs are stored in a ring buffer with a fixed size,
currently 1000 entries. If it is filled up, the consistency
will be broken. Undo log is recorded just once after the
latest undo-recording transaction comes. It is likely to be
read in rather short-lived transactions and it's likely that
there's no more than several such transactions
simultaneously. It's possible to provide dynamic resizing
feature but it doesn't seem worth the complexity.

- Undoing runs linear search on the ring buffer. It is done at
the first time when the stats for every relation is
accessed. It can be (a bit) slow when many log entriess
resides. (Concurrent vacuum can generate many undo log
entries.)

- Undo logs for other stats doesn't seem to me to be needed,
but..

A=>: select relname, seq_scan from pg_stat_user_tables where relname = 't1';
relname | seq_scan
t1 | 0

A=> select relname, seq_scan from pg_stat_user_tables where relname = 't2';
relname | seq_scan
t2 | 0

A=> BEGIN;

-- This gives effect because no stats access has been made
B=> select * from t1;
B=> select * from t2;

A=> select relname, seq_scan from pg_stat_user_tables where relname = 't1';
relname | seq_scan
t1 | 1

-- This has no effect because undo logging is now working
B=> select * from t1;
B=> select * from t2;

<repeat two times>

-- This is the second time in this xact to request for t1,
-- just returns cached result.
A=> select relname, seq_scan from pg_stat_user_tables where relname = 't1';
relname | seq_scan
t1 | 1

-- This is the first time in this xact to request for t2. The
-- result is undoed one.
A=> select relname, seq_scan from pg_stat_user_tables where relname = 't2';
relname | seq_scan
t2 | 1
A=> COMMIT;

A=> select relname, seq_scan from pg_stat_user_tables where relname = 't1';
relname | seq_scan
t1 | 4
A=> select relname, seq_scan from pg_stat_user_tables where relname = 't2';
relname | seq_scan
t2 | 4

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v3-0007-Fix-of-v3-0003-dshash-based-staas-collector.patch text/x-patch 710 bytes
v3-0008-PoC-implement-of-undo-log-implement.patch text/x-patch 23.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-07-05 11:52:35 Re: Generating partitioning tuple conversion maps faster
Previous Message Etsuro Fujita 2018-07-05 11:20:27 Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.