Re: Patch proposal for log_duration

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Guillaume Smet <guillaume(dot)smet(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Patch proposal for log_duration
Date: 2006-04-11 16:42:23
Message-ID: 200604111642.k3BGgNq29904@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches


The bottom line is that while I can see reasons for making log_duration
more than boolean, adding more complexity to the logging system for this
corner case probably isn't a good idea. If we get more such requests,
we can reconsider it, but at this stage, I am thinking your using a
home-grown patch is your best approach.

---------------------------------------------------------------------------

Guillaume Smet wrote:
> Tom,
>
> On 3/30/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I really find it pretty bizarre to want to log a duration without
> > logging the statement that caused it. Seems like the
> > log_min_duration_statement parameter ought to be enough. These logging
> > options are already messy and unintuitive, and adding still more odd
> > frammishes doesn't help that ...
>
> It's not so odd as it was the default behaviour for 7.4. I agree with
> you having just the duration is not the best solution but we cannot
> log every query for performances reason.
>
> The point is we are hosting several databases for our customers and
> for these databases, we don't know the application. I'm going to take
> a real example we have here to explain why the behaviour of
> log_duration in 7.4 was perfect for us.
>
> This database is a 2 GB database receiving 13 millions queries a day.
> We usually have the same old big queries which cannot be really
> optimized in the log due to log_min_duration_statement so this setting
> doesn't give us any clue on how our server performs because these
> queries are always slow.
> Here are the cases when a global overview with all duration was useful:
> - our customer tells us their site is slower than usual. We check the
> database log reports and we can see if the database is slower (average
> duration higher) or not. If this is the case, perhaps there is a bug
> or a new feature in their application performing a lot more queries
> than before: we can tell them too as we have all the queries logged by
> their duration.
> - we have a maintenance to plan which requires the database to be down
> (namely the upgrade from 7.4 to 8.1 we did a few weeks ago) and we
> need to choose a time when the database activity is not so high as we
> will move the db to a slower server. In our case, it's not directly
> correlated to the web statistics due to an advanced cache system.
> Having every query logged, we know when the database activity is
> lower.
> - we see on the graphs the database is significantly slower after
> let's say 5pm. This can be due to a the need of a vacuum on a specific
> table because of a cronjob running. That's true that log_duration
> won't be enough but we will use log_statement='mod' during this period
> the following day to see if our hypothesis is true.
>
> So our point is:
> - log_min_duration_statement is not enough as it does not give us a
> global overview of the database activity.
> - log_statement='all' generates far too I/O for us and we cannot enable it.
> - log_duration=on in 7.4 was a good compromise allowing us to detect
> the problems on the database server while generating a not so big log
> file (1.2 GB a day) and nearly no overhead.
>
> That's why we'd like to see this behaviour back again in PostgreSQL.
>
> > Changing code without changing the associated comments isn't a good way
> > to get your ideas accepted, either, eg here: [...]
> > The patch makes that comment a lie. This is not acceptable coding practice.
>
> Sorry for that. The attached patch is fixed.
>
> I'm not sure if we should remove the log_duration log if
> log_min_duration_statement logs the query and duration for the current
> query. PostgreSQL 7.4 logged a line for the duration due to
> log_duration=on and a line for statement + duration due to
> log_min_duration_statement. I have kept this behaviour to be
> consistent with 7.4 but it's perhaps better to not log the query twice
> (the tool we use count the query only once anyway). Any feedback on
> this is welcome. I'll update the patch if needed.
>
> Thanks for your attention and comments.
>
> Regards,
>
> --
> Guillaume

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Greg Stark 2006-04-11 17:15:33 Re: Support Parallel Query Execution in Executor
Previous Message Myron Scott 2006-04-11 14:47:14 Re: Support Parallel Query Execution in Executor