Skip site navigation (1) Skip section navigation (2)

Re: query plan question

From: "David Parker" <dparker(at)tazznetworks(dot)com>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jeff" <threshar(at)torgo(dot)978(dot)org>,"Russell Smith" <mr-russ(at)pws(dot)com(dot)au>,<pgsql-performance(at)postgresql(dot)org>
Subject: Re: query plan question
Date: 2004-11-18 16:12:12
Message-ID: 07FDEE0ED7455A48AC42AC2070EDFF7C26BAC3@corpsrv2.tazznetworks.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
What I think is happening with the missing pg_statistic entries:

The install of our application involves a lot of data importing (via
JDBC) in one large transaction, which can take up to 30 minutes. (I
realize I left out this key piece of info in my original post...)

The pg_autovacuum logic is relying on data from pg_stat_all_tables to
make the decision about running analyze. As far as I can tell, the data
in this view gets updated outside of the transaction, because I saw the
numbers growing while I was importing. I saw pg_autovacuum log messages
for running analyze on several tables, but no statistics data showed up
for these, I assume because the actual data in the table wasn't yet
visible to pg_autovacuum because the import transaction had not finished
yet.

When the import finished, not all of the tables affected by the import
were re-visited because they had not bumped up over the threshold again,
even though the analyze run for those tables had not generated any stats
because of the still-open transaction.

Am I making the correct assumptions about the way the various pieces
work? Does this scenario make sense?

It's easy enough for us to kick off a vacuum/analyze at the end of a
long import - but this "mysterious" behavior was bugging me!

Thanks.

- DAP 

>-----Original Message-----
>From: Matthew T. O'Connor [mailto:matthew(at)zeut(dot)net] 
>Sent: Wednesday, November 17, 2004 2:02 PM
>To: David Parker
>Cc: Tom Lane; Jeff; Russell Smith; pgsql-performance(at)postgresql(dot)org
>Subject: Re: [PERFORM] query plan question
>
>Well based on the autovacuum log that you attached, all of 
>those tables 
>are insert only (at least during the time period included in 
>the log.   
>Is that correct?  If so, autovacuum will never do a vacuum 
>(unless required by xid wraparound issues) on those tables.  
>So this doesn't appear to be an autovacuum problem.  I'm not 
>sure about the missing pg_statistic entries anyone else care 
>to field that one?
>
>Matthew
>
>
>David Parker wrote:
>
>>Thanks. The tables I'm concerned with are named: 'schema', 'usage', 
>>'usageparameter', and 'flow'. It looks like autovacuum is performing
>>analyzes:
>>
>>% grep "Performing: " logs/.db.tazz.vacuum.log
>>[2004-11-17 12:05:58 PM] Performing: ANALYZE 
>>"public"."scriptlibrary_library"
>>[2004-11-17 12:15:59 PM] Performing: ANALYZE 
>>"public"."scriptlibraryparm"
>>[2004-11-17 12:15:59 PM] Performing: ANALYZE "public"."usageparameter"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."usageproperty"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."route"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."usageparameter"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE 
>>"public"."scriptlibrary_library"
>>[2004-11-17 12:26:01 PM] Performing: ANALYZE "public"."usage"
>>[2004-11-17 12:26:01 PM] Performing: ANALYZE "public"."usageparameter"
>>[2004-11-17 12:31:04 PM] Performing: ANALYZE "public"."usageproperty"
>>[2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."route"
>>[2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."service_usage"
>>[2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."usageparameter"
>>
>>But when I run the following:
>>
>>select * from pg_statistic where starelid in (select oid from 
>pg_class 
>>where relname in
>>('schema','usageparameter','flow','usage'))
>>
>>it returns no records. Shouldn't it? It doesn't appear to be doing a 
>>vacuum anywhere, which makes sense because none of these tables have 
>>over the default threshold of 1000. Are there statistics 
>which only get 
>>generated by vacuum?
>>
>>I've attached a gzip of the pg_autovacuum log file, with -d 3.
>>
>>Thanks again.
>>
>>- DAP
>>
>>
>>  
>>
>>>-----Original Message-----
>>>From: Matthew T. O'Connor [mailto:matthew(at)zeut(dot)net]
>>>Sent: Wednesday, November 17, 2004 11:41 AM
>>>To: David Parker
>>>Cc: Tom Lane; Jeff; Russell Smith; pgsql-performance(at)postgresql(dot)org
>>>Subject: Re: [PERFORM] query plan question
>>>
>>>David Parker wrote:
>>>
>>>    
>>>
>>>>We're using postgresql 7.4.5. I've only recently put 
>pg_autovacuum in 
>>>>place as part of our installation, and I'm basically taking the 
>>>>defaults. I doubt it's a problem with autovacuum itself, but rather 
>>>>with my configuration of it. I have some reading to do, so
>>>>      
>>>>
>>>any pointers
>>>    
>>>
>>>>to existing autovacuum threads would be greatly appreciated!
>>>>
>>>>      
>>>>
>>>Well the first thing to do is increase the verbosity of the 
>>>pg_autovacuum logging output.  If you use -d2 or higher, 
>pg_autovacuum 
>>>will print out a lot of detail on what it thinks the thresholds are 
>>>and
>>>why it is or isn't performing vacuums and analyzes.   Attach 
>>>some of the
>>>log and I'll take a look at it.
>>>
>>>    
>>>
>
>

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2004-11-18 16:43:12
Subject: Timing of pgstats updates
Previous:From: Andrew JanianDate: 2004-11-18 15:57:17
Subject: Re: Query Performance and IOWait

pgsql-hackers by date

Next:From: Tom LaneDate: 2004-11-18 16:14:41
Subject: Re: plpgsql on 8.0b4 bug?
Previous:From: Tom LaneDate: 2004-11-18 15:45:53
Subject: Re: invalid page header

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group