Re: Recovery conflict monitoring

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Recovery conflict monitoring
Date: 2011-01-03 09:03:36
Message-ID: AANLkTi=R67s7UGeuMXKT9=eukY9A00A9w+CPwXFnOdhp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 3, 2011 at 00:23, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Mon, 2010-12-27 at 14:39 +0100, Magnus Hagander wrote:
>> On Thu, Dec 23, 2010 at 13:09, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> > This patch adds counters and views to monitor hot standby generated
>> > recovery conflicts. It extends the pg_stat_database view with one
>> > column with the total number of conflicts, and also creates a new view
>> > pg_stat_database_conflicts that contains a breakdown of exactly what
>> > caused the conflicts.
>> >
>> > Documentation still pending, but comments meanwhile is of course appreciated ;)
>>
>> Heikki pointed out over IM that it's pointless to count stats caused
>> by recovery conflict with drop database - since we drop the stats
>> record as soon as it arrives anyway. Here's an updated patch that
>> removes that, and also adds some documentation.
>
> I like the patch, well inspired, code in the right places AFAICS. No
> code comments at all.

Thanks for reviewing!

> Couple of thoughts:
>
> * are we safe to issue stats immediately before issuing FATAL? Won't
> some of them get lost?

They shouldn't - not more than other stats messages. Those are often
flushed from on_shmem_exit() which I think runs even later.

> * Not clear what I'd do with database level information, except worry a
> lot. Maybe an option to count conflicts per user would be better, since
> at least we'd know exactly who was affected by those. Just an idea.

Depends on the usage scenario. In a lot of dedicated environments you
really only have one database - but there are many environments where
you do have multiple and it's quite useful to see them separately. And
you can of course very easily sum() them up for a total count, since
it's a view... Better keep the detail than throw it away, even if that
part isn't useful in *all* cases...

Grouping by user would potentially be helpful - I agree. However, that
goes for most pgstat counters ("number of seqscans", "tuples read" etc
are interesting per user as well in some cases). So doing that right
would mean adding per-user tracking all across pgstats in some smart
way - something we don't do now at all. So I see that as a separate
issue.

> * Would it better to have a log_standby_conflicts that allowed the
> opportunity to log the conflicting SQL, duration until cancelation etc?

Logging is useful to figure out why you have a certain scenario, yes.
But absolutely not as a *replacement* for the statistics counters, but
as an addition. Just like we have (the now incorrectly named)
pg_stat_bgwriter view *and* log_checkpoints... Different usecases for
the same basic information.

> I'd rather have what you have than nothing at all though... the new
> hot_standby_feedback mode should be acting to reduce these, so it would
> be useful to have this patch enabled for testing that feature.

It will help reduce it, but not take it away, right? Plus, it's an
optional feature...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2011-01-03 09:15:34 Re: contrib/snapshot
Previous Message Jim Nasby 2011-01-03 09:02:25 Re: page compression