
Re: Performance query about large tables, lots of concurrent access

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Karl Wright <kwright(at)metacarta(dot)com>
Cc: Shaun Thomas <sthomas(at)leapfrogonline(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance query about large tables, lots of concurrent access
Date: 2007-06-20 22:45:56
Message-ID: 4679AE24.9070005@g2switchworks.com
Lists: pgsql-performance
Karl Wright wrote:
> Shaun Thomas wrote:
>> On Wednesday 20 June 2007 12:55:20 pm Karl Wright wrote:
>>
>>
>>> I am afraid that I did answer this.  My largest tables
>>> are the ones continually being updated.  The smaller
>>> ones are updated only infrequently. 
>>
>>
>> You know, it actually sounds like you're getting whacked by the same 
>> problem that got us a while back.  It sounds like you weren't 
>> vacuuming frequently enough initially, and then tried vacuuming 
>> later, only after you noticed performance degrade.
>>
>> Unfortunately what that means, is for several weeks or months, 
>> Postgres has not been reusing rows on your (admittedly) active and 
>> large tables; it just appends at the end, and lets old rows slowly 
>> bloat that table larger and larger.  Indexes too, will suffer from 
>> dead pages.  As frightening/sickening as this sounds, you may need to 
>> dump/restore the really huge table, or vacuum-full to put it on a 
>> crash diet, and then maintain a strict daily or bi-daily vacuum 
>> schedule to keep it under control.
>>
>
> A nice try, but I had just completed a VACUUM on this database three 
> hours prior to starting the VACUUM that I gave up on after 27 hours.  
> So I don't see how much more frequently I could do it.  (The one I did 
> earlier finished in six hours - but to accomplish that I had to shut 
> down EVERYTHING else that machine was doing.)

So, have you ever run vacuum full or reindex on this database?

You are aware of the difference between how vacuum and vacuum full work, 
right?

vacuum := marks deleted tuples as reusable, but leaves them in the table
vacuum full := compacts the table, physically removing deleted tuples

While you should generally avoid vacuum full, if you've let your 
database get so bloated that the majority of the space in your tables is 
now empty / deleted tuples, you likely need to vacuum full / reindex it.

For instance, on my tiny little 31 Gigabyte reporting database, the main 
table takes up about 17 Gigs.  This query gives you some idea of how 
many bytes each row is taking on average:

select relname,
       relpages::float * 8192 as size,
       reltuples,
       (relpages::double precision * 8192) / reltuples::double precision
           as bytes_per_row
from pg_class
where relname = 'businessrequestsummary';
        relname         |    size     |  reltuples  |  bytes_per_row
------------------------+-------------+-------------+-----------------
 businessrequestsummary | 17560944640 | 5.49438e+07 | 319.61656229454

Note that these numbers are updated by running analyze...
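So before trusting the numbers, it may be worth refreshing the statistics first (using the table name from the example above):

```sql
-- Refreshes the reltuples/relpages estimates that the query above reads:
ANALYZE businessrequestsummary;
```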

What does it say about your DB?

