|From:||Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>|
|Cc:||alvherre(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, ah(at)cybertec(dot)at, magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org|
|Subject:||Re: shared-memory based stats collector|
|Views:||Raw Message | Whole Thread | Download mbox|
Thank you very much for reviewing this and sorry for the absense.
At Sun, 20 Jan 2019 18:13:04 +0100, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <b760035b-1941-38bb-5e84-c2fbc63fef6b(at)2ndquadrant(dot)com>
> The patch needs rebasing, as it got broken by 285d8e1205, and there's
> some other minor bitrot.
The most affected part was 0006 because of file splitting, but
actually only the follwing four (actually three) commits
42e2a58071 Fix typos in documentation and for one wait event
97c39498e5 Update copyright for 2019
578b229718 Remove WITH OIDS support, change oid catalog column visibility.
(125f551c8b Leave SIGTTIN/SIGTTOU signal handling alone in postmaster child processes.)
The last one is not relevant because stats collector is no longer
This contains the EXEC_BACKEND related bug pointed by
> On 11/27/18 4:40 PM, Tomas Vondra wrote:
> > On 11/27/18 9:59 AM, Kyotaro HORIGUCHI wrote:
> >>> ...>>
> >>> For the main workload there's pretty much no difference, but for
> >>> selects from the stats catalogs there's ~20% drop in throughput.
> >>> In absolute numbers this means drop from ~670tps to ~550tps. I
> >>> haven't investigated this, but I suppose this is due to dshash
> >>> seqscan being more expensive than reading the data from file.
> >> Thanks for finding that. The three seqscan loops in
> >> pgstat_vacuum_stat cannot take such a long time, I think. I'll
> >> investigate it.
> > OK. I'm not sure this is related to pgstat_vacuum_stat - the
> > slowdown happens while querying the catalogs, so why would that
> > trigger vacuum of the stats? I may be missing something, of course.
> > FWIW, the "query statistics" test simply does this:
> > SELECT * FROM pg_stat_all_tables;
> > SELECT * FROM pg_stat_all_indexes;
> > SELECT * FROM pg_stat_user_indexes;
> > SELECT * FROM pg_stat_user_tables;
> > SELECT * FROM pg_stat_sys_tables;
> > SELECT * FROM pg_stat_sys_indexes;
> > and the slowdown happened even it was running on it's own (nothing
> > else running on the instance). Which mostly rules out concurrency
> > issues with the hash table locking etc.
> Did you have time to investigate the slowdown?
It seems to me that the slowdown comes from local caching in
snapshot_statentry in several ways.
It searches local hash (HTAB), then shared hash (dshash) if not
found and copies the found entry into local hash (action A). *If*
the second reference in a transaction comes, HTAB returns the
result (action B). But it mostly takes action A in frequent-short
transactions. It can be reduced to the update interval of shared
stats, but it would be shorter if many backends runs.
Another bottle neck found in pgstat_fetch_stat_tabentry. It calls
pgstat_fetch_stat_dbentry() too often. It can be largely reduced.
A quick (and dirty) fix of the aboves reduced the slowdown
roughly by half. (59tps(master)->48tps(current)->54tps(the fix))
I'll reconsider the referer side of the stats.
I didn't merge the suggested two pairs of commits. I'll do that
after adressing the slowdown issue.
NTT Open Source Software Center
|Next Message||navneet nikku||2019-01-21 12:43:03||Procedures with out parameter(Postgresql 11)|
|Previous Message||Etsuro Fujita||2019-01-21 12:09:38||Re: [HACKERS] advanced partition matching algorithm for partition-wise join|