Re: shared-memory based stats collector

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, andres(at)anarazel(dot)de
Cc: magnus(at)hagander(dot)net, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2018-07-10 12:52:13
Message-ID: 2a8de8bb-dbb8-3b21-9364-6a1b45916731@2ndquadrant.com
Lists: pgsql-hackers

On 07/10/2018 02:07 PM, Kyotaro HORIGUCHI wrote:
> Hello. Thanks for the opinions.
>
> At Fri, 6 Jul 2018 13:10:36 -0700, Andres Freund <andres(at)anarazel(dot)de> wrote in <20180706201036(dot)awheoi6tk556x6aj(at)alap3(dot)anarazel(dot)de>
>> Hi,
>>
>> On 2018-07-06 22:03:12 +0200, Magnus Hagander wrote:
>>> *If* we can provide the snapshots view of them without too much overhead I
>>> think it's worth looking into that while *also* providing a lower overhead
>>> interface for those that don't care about it.
>>
>> I don't see how that's possible without adding significant amounts of
>> complexity and probably memory / cpu overhead. The current stats already
>> are quite inconsistent (often outdated, partially updated, messages
>> dropped when busy) - I don't see what we really gain by building
>> something MVCC like in the "new" stats subsystem.
>>
>>
>>> If it ends up that keeping the snapshots becomes too much overhead in
>>> either performance or code maintenance, then I agree we can probably drop that.
>>> But we should at least properly investigate the cost.
>>
>> I don't think it's worthwhile to do more than think a bit about it. There
>> are fairly obvious tradeoffs in complexity here. Trying to get there seems
>> like a good way to make the feature too big.
>
> Agreed.
>
> Well, if we accept losing some consistency in exchange for better
> performance and a smaller footprint, relaxing the consistency of
> database stats can reduce the footprint further, especially on a
> cluster with very many databases. Backends are interested only in
> the database they reside in, and vacuum doesn't cache stats at all.
> A possible problem is that vacuum and the stats collector can get
> into a race condition. I'm not sure, but I suppose that is no worse
> than being caught in I/O congestion.
>

As someone who regularly analyzes stats collected from user systems, I
think there's certainly some value in keeping the snapshots reasonably
consistent. But I agree it doesn't need to be perfect, and some level of
inconsistency is acceptable (and the amount of complexity/overhead
needed to maintain perfect consistency seems rather excessive here).

There's one more reason why attempts to keep stats snapshots "perfectly"
consistent are likely doomed to fail - the messages are sent over UDP,
which guarantees neither delivery nor ordering. So there's always some level of
possible inconsistency even with "perfectly consistent" snapshots.
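
To make the effect of lost messages concrete, here is a minimal sketch
in Python. It is a toy simulation, not the PostgreSQL stats collector
code: the function name simulate(), the drop_rate parameter and the
model in which each message is dropped independently are all
assumptions made purely for illustration.

```python
import random


def simulate(n_messages, drop_rate, seed=0):
    """Send n_messages counter increments over a lossy channel.

    Returns (sent_total, received_total). Each message is dropped
    independently with probability drop_rate. This is an assumed,
    simplified stand-in for an unreliable transport such as UDP.
    """
    rng = random.Random(seed)
    sent_total = 0
    received_total = 0
    for _ in range(n_messages):
        delta = 1  # hypothetical increment, e.g. one tuple inserted
        sent_total += delta
        # The sender never learns whether the message arrived, so only
        # the receiving side's total is affected by a drop.
        if rng.random() >= drop_rate:
            received_total += delta
    return sent_total, received_total


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.1):
        sent, received = simulate(10_000, rate)
        print(f"drop_rate={rate}: sent={sent} received={received}")
```

With any non-zero drop rate the receiving side's total falls behind
what the senders reported, however carefully a snapshot of the
receiving side is taken afterwards.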

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
