Re: shared-memory based stats collector

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: a(dot)zakirov(at)postgrespro(dot)ru
Cc: alvherre(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, tomas(dot)vondra(at)2ndquadrant(dot)com, ah(at)cybertec(dot)at, magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2019-02-25 04:52:14
Message-ID: 20190225.135214.163727209.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Fri, 22 Feb 2019 17:19:56 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20190222(dot)171956(dot)98584931(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> > It seems to me that an autovacuum process terminates because of
> > segfault.
> >
> > Segfault occurs within get_pgstat_tabentry_relid(). If I'm not
> > mistaken, somehow 'dbentry' hasn't valid pointer anymore.

do_autovacuum does the following:

dbentry = pgstat_fetch_stat_dbentry() -- create cached dbentry
StartTransactionCommand() -- starts transaction
pgstat_vacuum_stat() -- blows away the cached dbentry.
shared = pgstat_fetch_stat_dbentry()

This was harmless previously, but the pgstat_* functions now blow away
the local cache at the first call after transaction start. As a
result, dbentry becomes invalid. The reason I didn't see the same
crash is that the second pgstat_fetch_stat_dbentry() accidentally
zeroes out the once-invalidated dbentry.

It is fixed by moving StartTransactionCommand() to before the first
pgstat_fetch_stat_dbentry() call, which looks better anyway and
avoids this problem.
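
To illustrate the ordering problem and the fix, here is a minimal
self-contained toy model in C. It is not the patch code: every toy_*
name, the ToyDbEntry struct and the generation counter are made up for
this sketch. The only behavior taken from the description above is that
the first fetch after a transaction start throws away the local cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached per-database stats entry. */
typedef struct ToyDbEntry
{
    unsigned    databaseid;
} ToyDbEntry;

static ToyDbEntry *toy_cache = NULL;        /* backend-local cache */
static bool toy_reset_pending = false;      /* set at transaction start */
static int  toy_generation = 0;             /* bumped on every rebuild */

/* Models StartTransactionCommand(): the next fetch drops the cache. */
static void
toy_start_transaction(void)
{
    toy_reset_pending = true;
}

/* Models pgstat_fetch_stat_dbentry() in this toy. */
static ToyDbEntry *
toy_fetch_dbentry(unsigned dbid)
{
    if (toy_reset_pending)
    {
        free(toy_cache);        /* pointers handed out earlier now dangle */
        toy_cache = NULL;
        toy_reset_pending = false;
    }
    if (toy_cache == NULL)
    {
        toy_cache = calloc(1, sizeof(ToyDbEntry));
        if (toy_cache == NULL)
            abort();
        toy_cache->databaseid = dbid;
        toy_generation++;
    }
    return toy_cache;
}

/* Models pgstat_vacuum_stat(): just another call that touches the cache. */
static void
toy_vacuum_stat(void)
{
    (void) toy_fetch_dbentry(1);
}

/* Old order: fetch, start transaction, vacuum. Returns true if the
 * pointer fetched first still belongs to the current cache generation. */
bool
toy_old_order_keeps_entry(void)
{
    ToyDbEntry *dbentry = toy_fetch_dbentry(1);
    int         gen = toy_generation;

    toy_start_transaction();
    toy_vacuum_stat();          /* rebuilds the cache; dbentry is stale */
    (void) dbentry;             /* must not be dereferenced here */
    return gen == toy_generation;
}

/* Fixed order: start transaction first, then fetch, then vacuum. */
bool
toy_fixed_order_keeps_entry(void)
{
    ToyDbEntry *dbentry;
    int         gen;

    toy_start_transaction();
    dbentry = toy_fetch_dbentry(1);
    gen = toy_generation;
    toy_vacuum_stat();          /* no reset pending, cache is untouched */
    return gen == toy_generation && dbentry->databaseid == 1;
}
```

In this toy, toy_old_order_keeps_entry() reports that the first pointer
went stale, while toy_fixed_order_keeps_entry() reports that it stays
usable.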

me> I found another problem commit_ts test reliably fails by dshash
me> corruption in startup process. I've not found why and will
me> investigate it, too.

It was a rather stupid mistake: pgstat_reset_all() releases an entry
within the sequential scan loop, which violates the protocol of
dshash_seq_next().
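
The kind of protocol violation meant here can be sketched with a toy
iterator in C. This is not the dshash code from the patch: the Toy*
types and toy_* functions are invented, and the sketch assumes that the
scan itself owns the lock on the entry it last returned and drops it
when advancing, so the caller must not release that entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NENTRIES 3

/* Hypothetical table entry with a flag standing in for its lock. */
typedef struct ToyEntry
{
    int         key;
    bool        locked;
} ToyEntry;

typedef struct ToyTable
{
    ToyEntry    entries[TOY_NENTRIES];
} ToyTable;

/* Hypothetical scan state; "corrupted" records a protocol violation. */
typedef struct ToySeqStatus
{
    ToyTable   *table;
    int         pos;
    ToyEntry   *current;
    bool        corrupted;
} ToySeqStatus;

static void
toy_seq_init(ToySeqStatus *st, ToyTable *table)
{
    st->table = table;
    st->pos = 0;
    st->current = NULL;
    st->corrupted = false;
}

/* Caller-side release, meant for entries obtained outside a scan. */
static void
toy_release_lock(ToyEntry *entry)
{
    entry->locked = false;
}

/* Advance the scan: drop the lock on the previous entry, lock the next. */
static ToyEntry *
toy_seq_next(ToySeqStatus *st)
{
    if (st->current != NULL)
    {
        /* The scan expects to still hold the lock it took earlier. */
        if (!st->current->locked)
            st->corrupted = true;
        st->current->locked = false;
        st->current = NULL;
    }
    if (st->pos >= TOY_NENTRIES)
        return NULL;
    st->current = &st->table->entries[st->pos++];
    st->current->locked = true;
    return st->current;
}

/* Runs a full scan; returns true if the scan state stayed consistent. */
bool
toy_scan_is_clean(bool release_inside_loop)
{
    ToyTable    table = {0};
    ToySeqStatus st;
    ToyEntry   *entry;

    toy_seq_init(&st, &table);
    while ((entry = toy_seq_next(&st)) != NULL)
    {
        if (release_inside_loop)
            toy_release_lock(entry);    /* the mistake described above */
    }
    return !st.corrupted;
}
```

In this toy, a scan that leaves the release to toy_seq_next() stays
consistent, while releasing inside the loop trips the check on the next
call.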

Both of the above are fixed in the attached v17.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                 Content-Type  Size
v17-0001-sequential-scan-for-dshash.patch                  text/x-patch  10.6 KB
v17-0002-Add-conditional-lock-feature-to-dshash.patch      text/x-patch  5.0 KB
v17-0003-Make-archiver-process-an-auxiliary-process.patch  text/x-patch  11.9 KB
v17-0004-Allow-dsm-to-use-on-postmaster.patch              text/x-patch  1.2 KB
v17-0005-Shared-memory-based-stats-collector.patch         text/x-patch  209.6 KB
v17-0006-Remove-the-GUC-stats_temp_directory.patch         text/x-patch  10.0 KB
