Re: shared-memory based stats collector

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, gkokolatos(at)protonmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2021-04-02 17:27:23
Message-ID: 20210402172723.jty4ryngererwlpc@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-04-02 15:34:54 +0900, Kyotaro Horiguchi wrote:
> At Thu, 1 Apr 2021 19:44:25 -0700, Andres Freund <andres(at)anarazel(dot)de> wrote in
> > Hi,
> >
> > I spent quite a bit more time working on the patch. There's are large
> > changes:
> >
> > - postmaster/pgstats.c (which is an incorrect location now that it's not
> > a subprocess anymore) is split into utils/activity/pgstat.c and
> > utils/activity/pgstat_kind.c. I don't love the _kind name, but I
> > couldn't come up with anything better.
>
> The place was not changed to keep footprint smaller. I agree that the
> old place is not appropriate. pgstat_kind... How about changin
> pgstat.c to pgstat_core.c and pgstat_kind.c to pgstat.c?

I don't really like that split over what I chose.

> > - A stat's entry's lwlock, refcount, .. are moved into the dshash
> > entry. There is no need for them to be separate anymore. Also allows
> > to avoid doing some dsa lookups while holding dshash locks.
> >
> > - The dshash entries are not deleted until the refcount has reached
> > 0. That's an important building block to avoid constantly re-creating
> > stats when flushing pending stats for a dropped object.
>
> Does that mean the entries for a dropped object is actually dropped by
> the backend that has been flushd stats of the dropped object at exit?
> Sounds nice.

It's marked as dropped after the commit of the transaction that dropped
the object. The memory is freed when then subsequently the last
reference goes away.

> > I know of one nontrivial issue that can lead to dropped stats being
> > revived:
> >
> > Within a transaction, a functions can be called even when another
> > transaction that dropped that function has already committed. I added a
> > spec test reproducing the issue:
> >
> > # FIXME: this shows the bug that stats will be revived, because the
> > # shared stats in s2 is only referenced *after* the DROP FUNCTION
> > # committed. That's only possible because there is no locking (and
> > # thus no stats invalidation) around function calls.
> > permutation
> > "s1_track_funcs_all" "s2_track_funcs_none"
> > "s1_func_call" "s2_begin" "s2_func_call" "s1_func_drop" "s2_track_funcs_all" "s2_func_call" "s2_commit" "s2_ff" "s1_func_stats" "s2_func_stats"
> >
> > I think the best fix here would be to associate an xid with the dropped
> > stats object, and only delete the dshash entry once there's no alive
> > with a horizon from before that xid...
>
> I'm not sure how we do that avoiding a full scan on dshash..

I think it's quite possible. Have a linked list of "to be dropped stats"
or such. A bit annoying because of needing to deal with dsa pointers,
but not too hard either.

> > > > - the replication slot stuff isn't quite right in my branch
> > >
> > > Ah, yeah. As I mentioned above I think it should be in the unified
> > > stats and should have a special means of shotcut. And the global
> > > stats also should be the same.
> >
> > The problem is that I use indexes for addressing, but that they can
> > change between restarts. I think we can fix that fairly easily, by
> > mapping names to indices once, pgstat_restore_stats(). At the point we
> > call pgstat_restore_stats() StartupReplicationSlots() already was
> > executed, so we can just inquire at that point...
>
> Does that mean the saved replslot stats is keyed by their names?

I was thinking we'd key them by name only at startup, where their
indices are not known.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2021-04-02 17:29:38 Re: Parallel Full Hash Join
Previous Message John Naylor 2021-04-02 17:21:59 Re: Force lookahead in COPY FROM parsing