Re: Performance query about large tables, lots of concurrent access

From: Karl Wright <kwright(at)metacarta(dot)com>
To: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
Cc: Shaun Thomas <sthomas(at)leapfrogonline(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance query about large tables, lots of concurrent access
Date: 2007-06-20 23:22:47
Message-ID: 4679B6C7.3080207@metacarta.com
Lists: pgsql-performance

Scott Marlowe wrote:
> Karl Wright wrote:
>
>> Shaun Thomas wrote:
>>
>>> On Wednesday 20 June 2007 12:55:20 pm Karl Wright wrote:
>>>
>>>
>>>> I am afraid that I did answer this. My largest tables
>>>> are the ones continually being updated. The smaller
>>>> ones are updated only infrequently.
>>>
>>>
>>>
>>> You know, it actually sounds like you're getting whacked by the same
>>> problem that got us a while back. It sounds like you weren't
>>> vacuuming frequently enough initially, and then tried vacuuming
>>> later, only after you noticed performance degrading.
>>>
>>> Unfortunately, what that means is that for several weeks or months,
>>> Postgres has not been reusing rows in your (admittedly) active and
>>> large tables; it just appends at the end and lets old rows slowly
>>> bloat the table larger and larger. Indexes, too, will suffer from
>>> dead pages. As frightening/sickening as this sounds, you may need to
>>> dump/restore the really huge table, or VACUUM FULL it to put it on a
>>> crash diet, and then maintain a strict daily or twice-daily vacuum
>>> schedule to keep it under control.
>>>
>>
>> A nice try, but I had just completed a VACUUM on this database three
>> hours prior to starting the VACUUM that I gave up on after 27 hours.
>> So I don't see how much more frequently I could do it. (The one I did
>> earlier finished in six hours - but to accomplish that I had to shut
>> down EVERYTHING else that machine was doing.)
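
(One thing I have not tried yet is the cost-based vacuum delay that 8.0
and later support. Assuming we are on a release that has it, something
like the following should let a VACUUM coexist with the normal load, at
the price of running even longer; the values below are illustrative
guesses, not tested settings:

  -- Throttle this session's vacuum so it yields to other I/O.
  SET vacuum_cost_delay = 20;   -- sleep 20ms each time the cost limit is hit
  SET vacuum_cost_limit = 200;  -- page-cost budget accumulated between sleeps
  VACUUM VERBOSE ANALYZE;
)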
>
>
> So, have you ever run vacuum full or reindex on this database?
>

No. However, this database has only existed since last Thursday afternoon.

> You are aware of the difference between how vacuum and vacuum full work,
> right?
>
> vacuum := mark deleted tuples as available, leave in table
> vacuum full := compact tables to remove deleted tuples.
>
> While you should generally avoid vacuum full, if you've let your
> database get so bloated that the majority of the space in your tables
> is now empty / deleted tuples, you likely need to vacuum full /
> reindex it.
>

If the database is continually growing, should VACUUM FULL be
necessary? I would expect new rows to reuse the space that a plain
VACUUM frees.
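
(Just so we are comparing the same operations, here is my understanding
of the choices, with a placeholder table name:

  VACUUM ANALYZE sometable;       -- marks dead tuples reusable in place; no exclusive lock
  VACUUM FULL VERBOSE sometable;  -- compacts the table; takes an exclusive lock
  REINDEX TABLE sometable;        -- rebuilds the table's indexes, shedding dead pages
)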

> For instance, on my tiny little 31 Gigabyte reporting database, the main
> table takes up about 17 Gigs. This query gives you some idea how many
> bytes each row is taking on average:
>
> select relname, relpages::float*8192 as size, reltuples,
>        (relpages::double precision*8192)/reltuples::double precision
>          as bytes_per_row
> from pg_class where relname = 'businessrequestsummary';
>
>         relname         |    size     |  reltuples  |  bytes_per_row
> ------------------------+-------------+-------------+-----------------
>  businessrequestsummary | 17560944640 | 5.49438e+07 | 319.61656229454
>
> Note that these numbers are updated by running analyze...
>
> What does it say about your DB?
>

I wish I could tell you. Like I said, I had to abandon this project to
test out an upgrade procedure involving pg_dump and pg_restore. (The
upgrade also seems to take a very long time - over 6 hours so far.)
When it is back online I can provide further information.
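
In the meantime, a variant of your pg_class query that surveys the
biggest tables in one pass is probably the first thing to run once the
restore finishes; a sketch, assuming ANALYZE has been run recently so
the statistics are current:

  SELECT relname,
         relpages::float * 8192 AS size_bytes,
         reltuples,
         (relpages::float * 8192) / NULLIF(reltuples, 0) AS bytes_per_row
  FROM pg_class
  WHERE relkind = 'r'            -- ordinary tables only
  ORDER BY relpages DESC
  LIMIT 10;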

Karl
