Re: [HACKERS] More stats about skipped vacuums

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robertmhaas(at)gmail(dot)com
Cc: michael(dot)paquier(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, sawada(dot)mshk(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] More stats about skipped vacuums
Date: 2017-12-11 11:15:23
Message-ID: 20171211.201523.24172046.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 27 Nov 2017 13:51:22 -0500, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+Tgmob2tuqvEZfHV2kLC-xobsZxDWGdc1WmjLg5+iOPLa0NHg(at)mail(dot)gmail(dot)com>
> On Mon, Nov 27, 2017 at 1:49 AM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Hmmm. Okay, we must make stats collector more effeicient if we
> > want to have additional counters with smaller significance in the
> > table stats. Currently sizeof(PgStat_StatTabEntry) is 168
> > bytes. The whole of the patchset increases it to 232 bytes. Thus
> > the size of a stat file for a database with 10000 tables
> > increases from about 1.7MB to 2.4MB. DSM and shared dynahash is
> > not dynamically expandable so placing stats on shared hash
> > doesn't seem effective. Stats as a regular table could work but
> > it seems too-much.
>
> dshash, which is already committed, is both DSM-based and dynamically
> expandable.

Yes, I forgot about that. We can just copy memory blocks to take
a snapshot of stats.

> > Is it acceptable that adding a new section containing this new
> > counters, which is just loaded as a byte sequence and parsing
> > (and filling the corresponding hash) is postponed until a counter
> > in the section is really requested? The new counters need to be
> > shown in a separate stats view (maybe named pg_stat_vacuum).
>
> Still makes the stats file bigger.

I considered dshash for pgstat.c and the attached is a *PoC*
patch, which is not full-fledged and just working under a not so
concurrent situation.

- Made stats collector an auxiliary process. A crash of stats
collector leads to a whole-server restarting.

- dshash lacks capability of sequential scan so added it.

- Also added snapshot function to dshash. It just copies
unerlying DSA segments into local memory but currently it
doesn't aquire dshash-level locks at all. I tried the same
thing with resize but it leads to very quick exhaustion of
LWLocks. An LWLock for the whole dshash would be required. (and
it is also useful to resize() and sequential scan.

- The current dshash doesn't shrink at all. Such a feature also
will be required. (A server restart causes a shrink of hashes
in the same way as before but bloat dshash requires copying
more than necessary size of memory on takeing a snapshot.)

The size of DSA is about 1MB at minimum. Copying entry-by-entry
into (non-ds) hash might be better than copying underlying DSA as
a whole, and DSA/DSHASH snapshot brings a kind of dirty..

Does anyone give me opinions or suggestions?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Simple-implement-of-local-shapshot-of-dshash.patch text/x-patch 9.5 KB
0002-Add-seqscan-on-dshash.patch text/x-patch 3.6 KB
0003-Change-stats-collector-to-an-axiliary-process.patch text/x-patch 9.8 KB
0004-Change-stats-sharing-method.patch text/x-patch 80.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-12-11 12:28:10 Re: Mention ordered datums in PartitionBoundInfoData comment
Previous Message Kyotaro HORIGUCHI 2017-12-11 11:07:53 Re: [HACKERS] asynchronous execution