Timing of pgstats updates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David Parker" <dparker(at)tazznetworks(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Jan Wieck <JanWieck(at)Yahoo(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Subject: Timing of pgstats updates
Date: 2004-11-18 16:43:12
Message-ID: 17519.1100796192@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

"David Parker" <dparker(at)tazznetworks(dot)com> writes:
> What I think is happening with the missing pg_statistic entries:
> The install of our application involves a lot of data importing (via
> JDBC) in one large transaction, which can take up to 30 minutes. (I
> realize I left out this key piece of info in my original post...)

> The pg_autovacuum logic is relying on data from pg_stat_all_tables to
> make the decision about running analyze. As far as I can tell, the data
> in this view gets updated outside of the transaction, because I saw the
> numbers growing while I was importing. I saw pg_autovacuum log messages
> for running analyze on several tables, but no statistics data showed up
> for these, I assume because the actual data in the table wasn't yet
> visible to pg_autovacuum because the import transaction had not finished
> yet.

> When the import finished, not all of the tables affected by the import
> were re-visited because they had not bumped up over the threshold again,
> even though the analyze run for those tables had not generated any stats
> because of the still-open transaction.

Bingo. The per-table activity stats are sent to the collector whenever
the backend waits for a client command. Given a moderately long
transaction block doing updates, it's not hard at all to imagine that
autovacuum would kick off vacuum and/or analyze while the updating
transaction is still in progress. The resulting operation is of course
a waste of time.

It'd be trivial to adjust postgres.c so that per-table stats are
only transmitted when we exit the transaction (basically move the
pgstat_report_tabstat call down a couple lines so it's not called if
IsTransactionOrTransactionBlock).

This seems like a good change to me. Does anyone not like it?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2004-11-18 17:01:16 Re: [HACKERS] pg_arch.c call to sleep()
Previous Message James Robinson 2004-11-18 16:28:13 Re: plpgsql on 8.0b4 bug?

Browse pgsql-performance by date

  From Date Subject
Next Message Joshua D. Drake 2004-11-18 16:44:11 Re: Query Performance and IOWait
Previous Message David Parker 2004-11-18 16:12:12 Re: query plan question