Re: shared-memory based stats collector

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tomas(dot)vondra(at)2ndquadrant(dot)com
Cc: alvherre(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, ah(at)cybertec(dot)at, magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2019-01-21 12:19:07
Message-ID: 20190121.211907.59625409.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you very much for reviewing this and sorry for the absense.

At Sun, 20 Jan 2019 18:13:04 +0100, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <b760035b-1941-38bb-5e84-c2fbc63fef6b(at)2ndquadrant(dot)com>
>
> Hi,
>
> The patch needs rebasing, as it got broken by 285d8e1205, and there's
> some other minor bitrot.

The most affected part was 0006 because of file splitting, but
actually only the follwing four (actually three) commits
affected.

42e2a58071 Fix typos in documentation and for one wait event
97c39498e5 Update copyright for 2019
578b229718 Remove WITH OIDS support, change oid catalog column visibility.
(125f551c8b Leave SIGTTIN/SIGTTOU signal handling alone in postmaster child processes.)

The last one is not relevant because stats collector is no longer
a process.

This contains the EXEC_BACKEND related bug pointed by

https://www.postgresql.org/message-id/854d6d91-f2f3-e391-f0fc-064db51b391e@2ndquadrant.com

> On 11/27/18 4:40 PM, Tomas Vondra wrote:
> > On 11/27/18 9:59 AM, Kyotaro HORIGUCHI wrote:
> >>>
> >>> ...>>
> >>> For the main workload there's pretty much no difference, but for
> >>> selects from the stats catalogs there's ~20% drop in throughput.
> >>> In absolute numbers this means drop from ~670tps to ~550tps. I
> >>> haven't investigated this, but I suppose this is due to dshash
> >>> seqscan being more expensive than reading the data from file.
> >>
> >> Thanks for finding that. The three seqscan loops in
> >> pgstat_vacuum_stat cannot take such a long time, I think. I'll
> >> investigate it.
> >>
> >
> > OK. I'm not sure this is related to pgstat_vacuum_stat - the
> > slowdown happens while querying the catalogs, so why would that
> > trigger vacuum of the stats? I may be missing something, of course.
> >
> > FWIW, the "query statistics" test simply does this:
> >
> >   SELECT * FROM pg_stat_all_tables;
> >   SELECT * FROM pg_stat_all_indexes;
> >   SELECT * FROM pg_stat_user_indexes;
> >   SELECT * FROM pg_stat_user_tables;
> >   SELECT * FROM pg_stat_sys_tables;
> >   SELECT * FROM pg_stat_sys_indexes;
> >
> > and the slowdown happened even it was running on it's own (nothing
> > else running on the instance). Which mostly rules out concurrency
> > issues with the hash table locking etc.
> >
>
> Did you have time to investigate the slowdown?

It seems to me that the slowdown comes from local caching in
snapshot_statentry in several ways.

It searches local hash (HTAB), then shared hash (dshash) if not
found and copies the found entry into local hash (action A). *If*
the second reference in a transaction comes, HTAB returns the
result (action B). But it mostly takes action A in frequent-short
transactions. It can be reduced to the update interval of shared
stats, but it would be shorter if many backends runs.

Another bottle neck found in pgstat_fetch_stat_tabentry. It calls
pgstat_fetch_stat_dbentry() too often. It can be largely reduced.

A quick (and dirty) fix of the aboves reduced the slowdown
roughly by half. (59tps(master)->48tps(current)->54tps(the fix))

I'll reconsider the referer side of the stats.

I didn't merge the suggested two pairs of commits. I'll do that
after adressing the slowdown issue.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v11-0001-sequential-scan-for-dshash.patch text/x-patch 10.6 KB
v11-0002-Add-conditional-lock-feature-to-dshash.patch text/x-patch 5.0 KB
v11-0003-Make-archiver-process-an-auxiliary-process.patch text/x-patch 12.0 KB
v11-0004-Shared-memory-based-stats-collector.patch text/x-patch 206.9 KB
v11-0005-Remove-the-GUC-stats_temp_directory.patch text/x-patch 6.5 KB
v11-0006-Split-out-backend-status-monitor-part-from-pgstat.patch text/x-patch 174.4 KB
v11-0007-Documentation-update.patch text/x-patch 3.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message navneet nikku 2019-01-21 12:43:03 Procedures with out parameter(Postgresql 11)
Previous Message Etsuro Fujita 2019-01-21 12:09:38 Re: [HACKERS] advanced partition matching algorithm for partition-wise join