Re: Better way of dealing with pgstat wait timeout during buildfarm runs?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Better way of dealing with pgstat wait timeout during buildfarm runs?
Date: 2014-12-26 21:56:36
Message-ID: 5925.1419630996@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> Yeah, I've been getting more annoyed by that too lately. I keep wondering
>> though whether there's an actual bug underneath that behavior that we're
>> failing to see.

> I think the first thing to do is reconsider usage of PGSTAT_RETRY_DELAY
> instead of PGSTAT_STAT_INTERVAL in autovacuum workers. That decreases
> the wait time 50-fold, if I recall this correctly, and causes large
> amounts of extra I/O traffic.

Yeah --- that means that instead of the normal behavior, where a stats file
no more than 500 msec old is good enough, an autovac worker insists on a stats
file no more than 10 msec old. I did some experimentation on prairiedog, and
found that it's not hard at all to see autovac workers demanding multiple
stats writes per second:

2014-12-26 16:26:52.958 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:53.128 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:53.188 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:54.903 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:26:55.058 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.022 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.285 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:00.792 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.010 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.163 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:01.193 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.595 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.673 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.839 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:03.878 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:05.878 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:06.571 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.001 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.769 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:07.950 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:10.256 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:11.039 EST 21026 LOG: sending inquiry for database 45116
2014-12-26 16:27:11.402 EST 21026 LOG: sending inquiry for database 45116
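
To put a number on log lines like these, one could bucket the inquiries by
wall-clock second. The following is a minimal sketch, not part of the original
experiment; the function name inquiries_per_second and the regular expression
are assumptions based on the log format shown above.

```python
import re
from collections import Counter

# Captures the "YYYY-MM-DD HH:MM:SS" part of the leading timestamp on an
# inquiry line. The pattern is an assumption derived from the excerpt above.
INQUIRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ .*sending inquiry for database"
)


def inquiries_per_second(lines):
    """Count 'sending inquiry' log lines per wall-clock second."""
    counts = Counter()
    for line in lines:
        match = INQUIRY_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the excerpt above.
sample = [
    "2014-12-26 16:27:03.595 EST 21026 LOG: sending inquiry for database 45116",
    "2014-12-26 16:27:03.673 EST 21026 LOG: sending inquiry for database 45116",
    "2014-12-26 16:27:03.839 EST 21026 LOG: sending inquiry for database 45116",
    "2014-12-26 16:27:03.878 EST 21026 LOG: sending inquiry for database 45116",
    "2014-12-26 16:27:05.878 EST 21026 LOG: sending inquiry for database 45116",
]

for second, count in sorted(inquiries_per_second(sample).items()):
    print(second, count)
```

On those five sample lines this reports four inquiries within the second
starting at 16:27:03 and one in the second starting at 16:27:05.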

The argument that autovac workers need fresher stats than anything else
seems pretty dubious to start with. Why shouldn't we simplify that down
to "they use PGSTAT_STAT_INTERVAL like everybody else"?
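
The difference between the two cutoffs can be modeled in a few lines. This is
a simplified sketch, not the pgstat.c code: the functions freshness_window_ms
and stats_file_is_fresh, their parameters, and the "simplified" flag are
hypothetical. Only the two values (500 msec and 10 msec) come from the
discussion above.

```python
PGSTAT_STAT_INTERVAL_MS = 500  # normal freshness window, per the discussion above
PGSTAT_RETRY_DELAY_MS = 10     # window currently demanded by autovac workers


def freshness_window_ms(is_autovac_worker, simplified=False):
    """Return how old (in msec) a stats file may be and still be accepted.

    simplified=True models the proposal that autovac workers use
    PGSTAT_STAT_INTERVAL like everybody else.
    """
    if is_autovac_worker and not simplified:
        return PGSTAT_RETRY_DELAY_MS
    return PGSTAT_STAT_INTERVAL_MS


def stats_file_is_fresh(file_ts_ms, now_ms, is_autovac_worker, simplified=False):
    """True if the stats file timestamp falls inside the freshness window."""
    window = freshness_window_ms(is_autovac_worker, simplified)
    return file_ts_ms >= now_ms - window


# A stats file written 100 msec ago (timestamps are arbitrary msec values).
now_ms = 1_000_000
file_ts_ms = now_ms - 100

print(stats_file_is_fresh(file_ts_ms, now_ms, is_autovac_worker=False))
print(stats_file_is_fresh(file_ts_ms, now_ms, is_autovac_worker=True))
print(stats_file_is_fresh(file_ts_ms, now_ms, is_autovac_worker=True, simplified=True))
```

In this model a 100-msec-old file satisfies a regular backend but not an
autovac worker, which would therefore ask for another write; with the
simplification, the worker accepts it too.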

regards, tom lane
