Re: [HACKERS] log_destination=file

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] log_destination=file
Date: 2018-01-22 19:52:51
Message-ID: CA+TgmoaS4fzh5u15BFHcouXYtPH9NGzWZVW44sG3VkwuG6qKow@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> Finally found myself back at this one, because I still think this is a
> problem we definitely need to address (whether with this file or not).
>
> The funneling into a single process is definitely an issue.
>
> But we don't really solve that problem today with logging to stderr, do we?
> Because somebody has to pick up the log as it came from stderr. Yes, you get
> more overhead when sending the log to devnull, but that isn't really a
> realistic scenario. The question is what to do when you actually want to
> collect that much logging that quickly.

I think it depends on where the bottleneck is. If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you can
bring the CPU cycles of many processes to bear rather than just one.
If you're limited by the rate at which you can lay the file down on
disk, then turning off the logging collector doesn't help, but I don't
think that's the main problem. Now, of course, if you're writing the
file to disk faster than a single process could do all those writes,
then you're probably also going to need multiple processes to keep up
with reading it, parsing it, etc. But that's not a problem for
PostgreSQL core unless we decide to start shipping an in-core log
analyzer.
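
To make the first case concrete, here is a rough sketch of "everyone
writes" with no collector in between. It is not PostgreSQL code:
run_direct_writers and its arguments are made up for illustration, and
it assumes the shared destination is a regular file opened with
O_APPEND, so that each write() lands at the current end of the file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork nprocs writers that each append lines_each
 * lines straight to the same file, with no collector process in between.
 * Returns the final file size in bytes, or -1 on failure.
 */
long
run_direct_writers(const char *path, int nprocs, int lines_each)
{
    int started = 0;
    int failed = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);

    if (fd < 0)
        return -1;

    for (int i = 0; i < nprocs; i++)
    {
        pid_t pid = fork();

        if (pid < 0)
        {
            failed = 1;
            break;
        }
        if (pid == 0)
        {
            /* Child: one write() per line, using its own CPU time. */
            char line[32];

            for (int n = 0; n < lines_each; n++)
            {
                int len = snprintf(line, sizeof line, "%02d:%06d\n", i, n);

                if (len < 0 || write(fd, line, (size_t) len) != len)
                    _exit(1);
            }
            _exit(0);
        }
        started++;
    }

    /* Parent: wait for every writer that was actually started. */
    for (int i = 0; i < started; i++)
    {
        int status;

        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    long size = (long) lseek(fd, 0, SEEK_END);

    close(fd);
    return failed ? -1 : size;
}
```

Timing this against a variant where the children write into a pipe
drained by one reader would be one way to see which of the two
bottlenecks you are actually hitting.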

> If each backend could actually log to *its own file*, then things would get
> sped up. But we can't do that today. Unless you use the hooks and build it
> yourself.

That seems like a useful thing to support in core.
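
Just to show the shape of it, a minimal sketch of per-backend log
files. Nothing like this exists in core today; the function names and
the backend-<pid>.log naming scheme are assumptions of mine, not a
proposal for the actual layout.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "<dir>/backend-<pid>.log" in buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
int
backend_log_path(char *buf, size_t len, const char *dir, pid_t pid)
{
    int n = snprintf(buf, len, "%s/backend-%ld.log", dir, (long) pid);

    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: open (creating if needed) this process's own log
 * file in append mode. Returns a file descriptor, or -1 on failure.
 */
int
open_backend_log(const char *dir, pid_t pid)
{
    char path[1024];

    if (backend_log_path(path, sizeof path, dir, pid) < 0)
        return -1;
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
}
```

Each backend would call something like open_backend_log(log_dir,
getpid()) once at startup and write its messages to the returned
descriptor, so no two processes contend for the same file.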

> Per the thread referenced, using the hooks to handle the
> very-high-rate-logging case seems to be the conclusion. But is that still
> the conclusion, or do we feel we need to also have a native solution?
>
> And if the conclusion is that hooks is the way to go for that, then is the
> slowdown of this patch actually a relevant problem to it?

I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks. I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1". That's just not friendly.

I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).
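
A minimal sketch of that dup dance, under the assumption that the write
end of the collector's pipe is available as a plain descriptor. All of
the names here are hypothetical; this is not how the syslogger code is
structured today.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical state; none of these names exist in PostgreSQL. */
static int original_stderr_fd = -1;  /* copy of the stderr we started with */
static int collector_pipe_fd = -1;   /* copy of the collector pipe's write end */

/*
 * Stash private copies of both possible destinations. Call once at
 * startup, before fd 2 has been redirected anywhere.
 */
int
log_targets_init(int collector_write_fd)
{
    original_stderr_fd = dup(STDERR_FILENO);
    if (original_stderr_fd < 0)
        return -1;

    collector_pipe_fd = dup(collector_write_fd);
    if (collector_pipe_fd < 0)
    {
        close(original_stderr_fd);
        original_stderr_fd = -1;
        return -1;
    }
    return 0;
}

/*
 * Point fd 2 at whichever destination is currently selected. Safe to
 * call again whenever the setting changes, since both saved copies
 * stay open.
 */
int
log_target_select(bool use_collector)
{
    int src = use_collector ? collector_pipe_fd : original_stderr_fd;

    if (src < 0)
        return -1;
    return dup2(src, STDERR_FILENO) < 0 ? -1 : 0;
}
```

Code that writes to stderr would not need to know which destination is
active; only the function called on a config reload does.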

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
