Re: Patch to implement pg_current_logfile() function

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: Gilles Darold <gilles(dot)darold(at)dalibo(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to implement pg_current_logfile() function
Date: 2016-12-08 16:27:34
Message-ID: CA+Tgmob-qbH+5kyZwvsO1BO5kqnDTVpDQ6WnMs7RxTpveHfNUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 7, 2016 at 11:02 PM, Karl O. Pinc <kop(at)meme(dot)com> wrote:
> I read this and knew that I hadn't finished review, but didn't
> immediately respond because I thought the patch had to be
> marked "ready for committer" on commitfest.postgresql.org
> before the committers would spend a lot of time on it.

Generally that's true, but I was trying to be helpful and trying also
to move this toward some kind of conclusion.

> I've been introducing some complication, but I hope it's necessary.
> To keep the review process simple my plan has been to submit
> multiple patches. One with the basics and more on top of that
> that introduce complication that can be considered and accepted or
> rejected. So I send emails with multiple patches, some that I think
> need to go into the core patch and others to be kept separate.
> But, although I try, this does not seem to have been communicated
> because I keep getting emails back that contain only a single patch.
>
> Maybe something's wrong with my review process but I don't know
> how to fix it.

It is, of course, not my job to decide who is at fault in whatever may
or may not have gone wrong during the reviewing process. It is not
only not my job, but would be unproductive, since the goal here is for
people to contribute more, not less. I have done enough review in
this community to have experienced the problem of people who say that
they took your suggestions when in fact they didn't, or who randomly
de-improve things in future patch versions that were more correct in
earlier versions. I agree that on occasions when that happens, it is
indeed very frustrating. Whether that's happened in this case, I
don't know. I do think that the blizzard of small patches that you've
submitted in some of your emails may possibly be a bit overwhelming.
We shouldn't really need a stack of a dozen or more patches to
implement one small feature. Declarative partitioning only had 7.
Why does this need more than twice that number?

> The extreme case is the attached "cleanup_rotate" patch.
> (Which applies to v14 of this patch.) It's nothing but
> a little bit of tiding up of the master branch, but does
> touch code that's already being modified so it seems
> like the committers would want to look at it at the same
> time as when reviewing the pg_current_logfile patch.
> Once you've looked at the pg_current_logfile patch
> you've already looked at and modified code in the function
> the "cleanup_rotate" patch touches.

I took a look at that patch just now and I guess I don't really see
the point. I mean, it will save a few cycles, and that's not nothing,
but it's not much, either. I don't see a reason to complicate the
basic feature patch by entangling it with such a change - it just
makes it harder to get the actual feature committed. And I kind of
wonder if the time it will take to review and commit that patch is
even worth it. There must be hundreds of places in our code base
where you could do stuff like that, but only a few of those save
enough cycles to be worth bothering with. To be fair, if that patch
were submitted independently of the rest of this and nobody objected,
I'd probably commit it; I mean, why not? But this thread is now over
100 emails long without reaching a happy conclusion, and one good way
to expedite the path to a happy conclusion is to drop all of the
nonessential bits. And that change certainly qualifies as
nonessential.

> FYI. The point of the "retry_current_logfiles"
> patch series is to handle the case
> where logfile_writename gets a ENFILE or EMFILE.
> When this happens the current_logfiles file can
> "skip" and not contain the name(s) of the current
> log file for an entire log rotation. To miss, say,
> a month of logs because the logs only rotate monthly
> and your log processing is based on reading the
> current_logfiles file sounds like a problem.

I think if you are trying to enumerate the names of your logfiles by
any sort of polling mechanism, rather than by seeing what files show
up in your logging directory, you are doing it wrong. So, I don't see
that this matters at all.

> The point of the code is to report not just the real error, but what
> effect the real error has -- that the current_logfiles file did not
> get updated in a timely fashion. Maybe this isn't necessary, especially
> if the write of current_logfiles gets retried on failure. Or maybe
> the right thing to do is to give logfile_open() an argument that
> let's it elaborate it's error message. Any guidance here would
> be appreciated.

Generally speaking, I would say that it's the user's job to decide
what the impact of an error is; our job is only to tell them what
happened. There are occasional exceptions; for example:

rhaas=# select ablance from pgbench_accounts;
ERROR: column "ablance" does not exist
LINE 1: select ablance from pgbench_accounts;
^
HINT: Perhaps you meant to reference the column "pgbench_accounts.abalance".

The HINT -- which you'll note is part of the same ereport() rather
than being separately logged -- has been judged a helpful response to
a particularly common kind of mistake. But in general it's not
necessary to elaborate on possible reasons for the error; it suffices
to report it. One of the important reasons for this is that our
guesses about what probably caused the problem can easily turn out to
be WRONG, and a misleading HINT is worse than no hint because it
confuses some users and annoys others. For example, I'm still on the
warpath about this:

rhaas=# select 1;
FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

The hint is essentially alleging that the server crashed even though,
as you can see from the FATAL, it really was just shut down cleanly.
So users panic even though an experienced user can see that the hint
is obviously garbage in this particular case; that sucks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-12-08 16:29:34 Re: Declarative partitioning - another take
Previous Message Dmitry Ivanov 2016-12-08 16:13:44 Re: Declarative partitioning - another take