Re: Background Processes and reporting

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Vladimir Borodin <root(at)simply(dot)name>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Background Processes and reporting
Date: 2016-03-14 20:16:43
Message-ID: CA+Tgmob=RB+GLg-Dk8TV368jGeRZt+k2a4G8i-KVkHupG2qHsQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 14, 2016 at 3:54 PM, Vladimir Borodin <root(at)simply(dot)name> wrote:
> 5. Show extra information about wait event (i.e. exclusive or shared mode
> for LWLocks, relation/forknum/blknum for I/O operations, etc.).

I doubt that this is a good idea. Everybody will pay the cost of it,
and who will get a benefit? We haven't had any wait monitoring at all
in PostgreSQL for years and years and years and it's only just now
getting to the top of our list of things to fix. So I have a hard
time believing that now we suddenly need this level of detail. The
very good thing about the committed implementation is that it requires
*no* synchronization, and anything more than a 4-byte integer will
require some (probably an st_changecount type protocol). I continue to believe
that a feature that is on for everyone and dirt cheap is going to be
more valuable than anything that is expensive enough to require an
"off" switch.
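
[Editor's illustration, not part of the original message.] The contrast between a single 4-byte store and a change-count protocol can be sketched roughly as follows. This is only a minimal sketch using C11 atomics: the type and function names (WaitSlot, WaitDetail, report_cheap, report_detailed, read_detailed) are hypothetical and do not reflect PostgreSQL's actual structures or barrier macros. It assumes a single writer per slot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical multi-field wait record; NOT PostgreSQL's real layout. */
typedef struct WaitDetail {
    uint32_t event;
    uint32_t relnode;
    uint32_t forknum;
    uint32_t blocknum;
} WaitDetail;

typedef struct WaitSlot {
    _Atomic uint32_t wait_event;   /* cheap path: one 4-byte value */
    _Atomic uint32_t changecount;  /* odd while a detailed write is in progress */
    WaitDetail detail;             /* too wide to store atomically */
} WaitSlot;

/* Cheap path: a single aligned 4-byte store, no handshake with readers. */
static void report_cheap(WaitSlot *s, uint32_t ev)
{
    atomic_store_explicit(&s->wait_event, ev, memory_order_relaxed);
}

/* Detailed path: bump the counter before and after the multi-word write. */
static void report_detailed(WaitSlot *s, const WaitDetail *d)
{
    /* Single-writer assumption: no other write may be in progress. */
    assert((atomic_load(&s->changecount) & 1) == 0);

    atomic_fetch_add(&s->changecount, 1);        /* now odd: write started */
    atomic_thread_fence(memory_order_seq_cst);
    s->detail = *d;
    atomic_thread_fence(memory_order_seq_cst);
    atomic_fetch_add(&s->changecount, 1);        /* even again: write done */
}

/* Reader: retry until the counter is even and unchanged across the copy. */
static WaitDetail read_detailed(WaitSlot *s)
{
    WaitDetail copy;
    uint32_t before, after;

    do {
        before = atomic_load(&s->changecount);
        atomic_thread_fence(memory_order_seq_cst);
        copy = s->detail;                        /* may be torn; checked below */
        atomic_thread_fence(memory_order_seq_cst);
        after = atomic_load(&s->changecount);
    } while (before != after || (before & 1));

    return copy;
}
```

The cheap path costs one store. The detailed path adds two atomic increments and two fences on every wait, plus a retry loop on the reader side, which is the kind of extra work the paragraph above is concerned about in hot code paths.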

> I have already shown [0, 1] the overhead of measuring timings in Linux on
> representative workload. AFAIK, these tests were the only ones that showed
> any numbers. All other statements about terrible performance have been and
> remain unconfirmed.

Of course, those numbers are substantial regressions which would
likely make it impractical to turn this on on a heavily-loaded
production system. On the other hand, the patch actually committed is
turned on by default and Amit posted numbers showing no performance
change at all.

> As for the size of such information it of course should be configurable.
> I.e. in Oracle there is a GUC for the size of ring buffer to store history
> of sampling with extra information about each wait event.

That's a reasonable idea, although not one I'm very excited about.

> Ok, doing it in short steps seems to be a good plan. Any objections against
> giving people an ability to turn on some feature (e.g. the notorious measuring
> of timings) even if it causes some performance degradation? Of course, it should
> be turned off by default.

I am not totally opposed to that, but I think a feature that causes a
10% performance hit when you turn it on will be mostly useless. The
people who need it won't be able to risk turning it on.

> If anything, I’m not from PostgresPro and I’m not «accusing you». But to be
> honest current committed implementation has been tested exactly on one
> machine with two workloads. And I think, it is somehow unfair to demand more
> from others. Although it doesn’t mean that testing on exactly one machine
> with only one OS is enough, of course. I suppose, you should ask the authors
> to test it on some representative hardware and workload but if authors don’t
> have them, it would be nice to help them with that.

I'm not necessarily opposed to that, but this thread has a lot more
heat than light, and some of the other threads on this topic have had
the same problem. There seems to be tremendous resistance to the idea
that recording timestamps is going to be expensive even though there
are many previous threads on pgsql-hackers about many different
features showing that this is true. Somehow, I've got to justify a
position which has been taken by many people many times before on this
very same mailing list. That strikes me as 100% backwards.
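
[Editor's illustration, not part of the original message.] One rough way to see what a timestamp costs on a given machine is to time a tight loop of clock reads. The sketch below assumes a POSIX system with clock_gettime(); the helper names now_ns and ns_per_clock_call are hypothetical. It measures only the cost of the clock call itself, not the effect on a real workload, and the result varies widely with hardware and clock source. PostgreSQL also ships the pg_test_timing utility for this purpose.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read the monotonic clock and return it as nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Average cost, in nanoseconds, of one clock read over `iters` calls. */
static double ns_per_clock_call(int iters)
{
    volatile int64_t sink = 0;   /* keeps the compiler from dropping the calls */
    int64_t start = now_ns();

    for (int i = 0; i < iters; i++)
        sink = now_ns();

    int64_t end = now_ns();

    (void) sink;
    return (double) (end - start) / iters;
}
```

Recording a wait duration needs two such reads per wait (one at the start, one at the end), so the per-call figure has to be doubled and then multiplied by the number of waits per second to estimate the added load.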

Similarly, the position that a wait-reporting interface that does not
require synchronization will be a lot cheaper than one that does
require synchronization has been questioned repeatedly. I'm not very
interested in spending a lot of time defending that proposition or
producing benchmarking results to support it, and I don't think I
should have to. We wouldn't have so many patches floating around that
aimed to reduce locking if synchronization overhead didn't cost, and
what is being proposed is to stick those into low-level code paths
that are sometimes highly trafficked.

> Also it would be really interesting to hear your opinion about the initial
> Andres’s question. Any thoughts about changing current committed
> implementation?

I'm a little vague on specifically what Andres has in mind. I tend to
think that there's not much point in allowing
pg_stat_get_progress_info('checkpointer') because we can just have a
dedicated view for that sort of thing, cf. pg_stat_bgwriter, which
seems better. Exposing the wait events from background processes
might be worth doing, but I don't think we want to add a bunch of
dummy lines to pg_stat_activity.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
