Re: Status of autovacuum and the sporadic stats failures ?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Status of autovacuum and the sporadic stats failures ?
Date: 2007-02-07 20:04:47
Message-ID: 21280.1170878687@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Beluga just failed:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=beluga&dt=2007-02-07%2019:30:01

Wow, that is a really interesting failure, because it implies that the
stats collector had seen the seqscan report but not the indexscan report:

  WHERE st.relname='tenk2' AND cl.relname='tenk2';
   ?column? | ?column? | ?column? | ?column?
  ----------+----------+----------+----------
!  t        | t        | t        | t
  (1 row)

  SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
--- 105,111 ----
  WHERE st.relname='tenk2' AND cl.relname='tenk2';
   ?column? | ?column? | ?column? | ?column?
  ----------+----------+----------+----------
!  t        | t        | f        | f
  (1 row)

  SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,

I haven't seen that too many times, if at all.

> The delay seems too short though:
> LOG: wait_for_stats delayed 0.000748 seconds

This indicates there wasn't any delay, i.e., on the first examination
pgstat.stat had a different size from what it had been at the "CREATE
TEMP TABLE prevfilesize" command. [ thinks about that for a while ]
Oh, I see the problem: at the instant of checking the file size the
first time, the stats collector must have already been in the process of
writing a new version of the file, which had some but not all of the
updates we want. And if that happened to be a different size from the
older version, we could fall through the wait as soon as it got
installed. So this waiting mechanism isn't good enough: it proves that
a new set of stats has been *installed* since we started waiting, but
it doesn't provide any guarantee about when the computation of that set
started. Back to the drawing board ...
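
The race described above can be replayed in a toy model. The Python
sketch below is not PostgreSQL code: ToyCollector, its methods, and the
use of an entry count as a stand-in for the size of pgstat.stat are all
assumptions made for illustration. It only shows how a size change can
be observed while the installed file still lacks the indexscan report.

```python
class ToyCollector:
    """Toy stand-in for the stats collector; not PostgreSQL code."""

    def __init__(self):
        self.pending = []        # reports received but not yet written anywhere
        self.installed = {}      # contents of the currently installed stats file
        self.in_progress = None  # new file being written, not yet visible

    def receive(self, counter):
        self.pending.append(counter)

    def begin_write(self):
        # The new file captures only the reports that have arrived so far.
        new_file = dict(self.installed)
        for counter in self.pending:
            new_file[counter] = new_file.get(counter, 0) + 1
        self.pending = []
        self.in_progress = new_file

    def finish_write(self):
        # Install the new file in one step, as a rename would.
        self.installed, self.in_progress = self.in_progress, None

    def file_size(self):
        # Hypothetical proxy for the on-disk size of the installed file.
        return len(self.installed)


def run_race():
    collector = ToyCollector()
    collector.receive("seq_scan")      # seqscan report arrives
    collector.begin_write()            # collector starts writing a new file
    collector.receive("idx_scan")      # indexscan report arrives too late for it
    prev_size = collector.file_size()  # test records the old file's size
    collector.finish_write()           # half-updated file gets installed
    size_changed = collector.file_size() != prev_size
    return size_changed, dict(collector.installed)
```

In this model run_race() reports that the size changed, so a size-based
wait would end immediately, yet the installed file contains only the
seqscan entry.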

If we had the suggested pg_stat_reset_snapshot function, then we could
wait until the indexscan count changes from the prior reading, which
would provide a more bulletproof synchronization approach. So maybe I
should just go do that. I had hoped to find a technique that was
potentially backpatchable into at least the last release or two, but
maybe there's no chance.
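
The counter-based wait can be sketched in the same toy style. This is a
minimal illustration in Python, not the regression test itself:
wait_for_counter_change, its parameters, and the list of successive
files are hypothetical, and the refresh callback merely stands in for
discarding the cached stats snapshot and pausing before the next look.

```python
def wait_for_counter_change(read_counter, previous, refresh, max_polls=300):
    """Poll until read_counter() differs from `previous`.

    Returns the number of refreshes needed, or None if max_polls is exhausted.
    """
    for polls in range(max_polls):
        if read_counter() != previous:
            return polls
        refresh()  # hypothetical: drop the cached snapshot and wait a little
    return None


def demo():
    # Successive installed stats files as the toy collector catches up.
    files = [
        {"seq_scan": 1, "idx_scan": 0},  # half-updated file from the race
        {"seq_scan": 1, "idx_scan": 1},  # next file includes the indexscan
    ]
    position = [0]

    def read_idx_scan():
        return files[position[0]]["idx_scan"]

    def refresh():
        # Move on to the next installed file, stopping at the last one.
        position[0] = min(position[0] + 1, len(files) - 1)

    polls = wait_for_counter_change(read_idx_scan, previous=0, refresh=refresh)
    return polls, read_idx_scan()
```

Because the loop keys on the indexscan count itself rather than on the
file size, the half-updated file does not end the wait; only a file
that actually contains the new count does.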

regards, tom lane
