Re: pg_receivewal makes a bad daemon

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_receivewal makes a bad daemon
Date: 2021-05-07 10:03:36
Message-ID: CABUevEx-Qp_Mm6HYeh24dEBPhHFsM1ON--wsovTdKCFMoAVahg@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 5, 2021 at 7:12 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, May 5, 2021 at 12:34 PM Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> > Is this really a problem we should fix ourselves? Most daemon-managers
> > today will happily be configured to automatically restart a daemon on
> > failure with a single setting since a long time now. E.g. in systemd
> > (which most linuxen uses now) you just set Restart=on-failure (or
> > maybe even Restart=always) and something like RestartSec=10.
> >
> > That said, it wouldn't cover an fsync() error -- they will always
> > restart. The way to handle that is for the operator to capture the
> > error message perhaps, and just "deal with it"?
>
> Maybe, but if that's really a non-problem, why does postgres itself
> restart, and have facilities to write and rotate log files? I feel
> like this argument boils down to "a manual transmission ought to be
> good enough for anyone, let's not have automatics." But over the years
> people have found that automatics are a lot easier to drive. It may be
> true that if you know just how to configure your system's daemon
> manager, you can make all of this work, but it's not like we document
> how to do any of that, and it's probably not the same on every
> platform - Windows in particular - and, really, why should people have
> to do this much work? If I want to run postgres in the background I
> can just type 'pg_ctl start'. I could even put 'pg_ctl start' in my
> crontab to make sure it gets restarted within a few minutes even if
> the postmaster dies. If I want to keep pg_receivewal running all the
> time ... I need a whole pile of extra mechanism to work around its
> inherent fragility. Documenting how that's typically done on modern
> systems, as you propose further on, would be great, but I can't do it,
> because I don't know how to make it work. Hence the thread.

If PostgreSQL were being built today, I'm not sure we would have built
that functionality, TBH.

The vast majority of people are not interested in manually starting
postgres and then putting in a crontab to "restart it if it fails".
That's not how anybody runs a server and hasn't for a long time.

It might be interesting for us as developers, but not to the vast
majority of our users. Most of those get their startup scripts from
our packagers -- so maybe we should encourage packagers to provide it,
like they do for PostgreSQL itself. But I doubt that adding log
rotation and other independent functionality to pg_receivexyz would
help more than a handful of our users.

In relation to the other thread about pid 1 handling and containers --
if anything, I bet a larger portion of our users would be interested
in running pg_receivewal in a dedicated container, than would want to
start it manually and verify it's running using crontab... By a large
margin.

It is true that Windows is a special case in this. But it is, I'd say,
equally true that adding something akin to "pg_ctl start" for
pg_receivewal would be equally useless on Windows.

We can certainly build and add such functionality. But my feeling is
that it's going to be added complexity for very little practical gain.
Much of the server world has moved to "we don't want every single
daemon to implement it its own way, ever so slightly differently".

I like your car analogy though. But I'd consider it more like "we used
to have to mix the right amount of oil into the gasoline manually. But
modern engines don't really require us to do that anymore, so most
people have stopped, only those who want very special cars do". Or
something along those lines. (Reality is probably somewhere in between,
and I suck at car analogies.)

> > Also, all the above also apply to pg_recvlogical, right? So if we do
> > want to invent our own daemon-init-system, we should probably do one
> > more generic that can handle both.
>
> Yeah. And I'm not really 100% convinced that trying to patch this
> functionality into pg_receive{wal,logical} is the best way forward ...

It does in a lot of ways amount to basically a daemon-init system. It
might be easier to just vendor one of the existing ones :) Or more
realistically, suggest they use something that's already on their
system. On linux that'll be systemd, on *bsd it'll probably be
something like supervisord, on mac it'll be launchd. But this is
really more a function of the operating system/distribution.
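
For illustration, a minimal systemd unit along those lines might look
like the following. This is only a sketch, not something any package
ships: the unit file name, binary path, archive directory and user are
assumptions, and it relies on the postgres user being able to connect
over the local socket without a password. The Restart= and RestartSec=
values are the ones mentioned earlier in the thread.

```ini
# /etc/systemd/system/pg_receivewal.service  (hypothetical file name)
[Unit]
Description=Stream PostgreSQL WAL with pg_receivewal
After=network-online.target
Wants=network-online.target

[Service]
# Assumed OS user; needs replication access to the server.
User=postgres
# Assumed binary path and target directory; adjust to the local install.
ExecStart=/usr/bin/pg_receivewal --directory=/var/lib/pgsql/wal_archive --no-password
# Restart automatically on a non-zero exit, after waiting 10 seconds.
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Output goes to the journal by default, so log capture and rotation are
handled by systemd rather than by pg_receivewal itself.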

Windows is again the one that stands out. But PostgreSQL *already*
does a pretty weak job of solving that problem on Windows, so
duplicating that is not that strong a win.

> but I'm not entirely convinced that it isn't, either. I think one of
> the basic problems with trying to deploy PostgreSQL in 2021 is that it
> needs so much supporting infrastructure and so much babysitting.
> archive_command has to be a complicated, almost magical program we
> don't provide, and we don't even tell you in the documentation that
> you need it. If you don't want to use that, you can stream with
> pg_receivewal instead, but now you need a complicated daemon-runner
> mechanism that we don't provide or document the need for. You also
> probably need a connection pooler that we don't provide, a failover
> manager that we don't provide, and backup management software that we
> don't provide. And the interfaces that those tools have to work with
> are so awkward and primitive that even the tool authors can't always
> get it right. So I'm sort of unimpressed by any arguments that boil
> down to "what we have is good enough" or "that's the job of some other
> piece of software". Too many things are the job of some piece of
> software that doesn't really exist, or is only available on certain
> platforms, or that has some other problem that makes it not usable for
> everyone. People want to be able to download and use PostgreSQL
> without needing a whole library of other bits and pieces from around
> the Internet.

I definitely don't think what we have is good enough, and I agree with
your general description of the problem.

I just don't think turning a simple tool into a more complicated
daemon is going to help with that in any material way. You still
need some sort of *backup management* on that side, otherwise your
pg_receivewal will now be the one that fills your disk along with the
outputs of your pg_basebackups. So we'd be better off providing that
management tool, which could then drive the lower level tools as
necessary.

Or perhaps the better solution in that case would be to actually
bless one of the existing solutions out there by making it the
official one.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
