From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: perf pb solved only after pg_dump and restore
Date: 2006-08-28 13:43:48
Message-ID: 44F2F314.5040001@logix-tt.com
Lists: pgsql-performance

Hi, Guillaume,

Guillaume Cottenceau wrote:

>> So I assume that you never UPDATE, but only INSERT new entries and
>> sometimes DELETE a big bunch of entries from the beginning.
>
> Actually, in the version of software where we have the problem,
> that's exactly the case. But in newer versions, UPDATE come into
> the picture (typically on recently inserted rows - one or two
> updates per row). Does UPDATE change anything? Row selection is
> done on the primary key (of SERIAL type).

In a MVCC database like PostgreSQL, UPDATE internally INSERTs the new
version of the row, and marks the old one as deleted.

Later transactions use the writing transaction's exit state (commit or
rollback) to decide which row version to use. VACUUM removes row
versions that are known to be obsolete (that's why long-running
transactions hold back VACUUM: they might still reference old, obsolete
versions).
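
To see this in action, here is a minimal sketch (the table "t" is a
made-up example); the hidden system columns show the row moving to a
new physical location:

    -- "t" is a hypothetical example table
    CREATE TABLE t (id SERIAL PRIMARY KEY, val INTEGER);
    INSERT INTO t (val) VALUES (1);

    -- ctid is the row's physical location, xmin its creating transaction
    SELECT ctid, xmin, xmax, id, val FROM t;

    UPDATE t SET val = 2 WHERE id = 1;

    -- the row now shows a different ctid; the old version stays on disk
    -- until VACUUM removes it
    SELECT ctid, xmin, xmax, id, val FROM t;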

So, for a few updates that are at the end of the table, normal VACUUM
with a sufficient free space map setting will work okay.

However, when updating or deleting big bunches of data (like the 90% you
spoke of), VACUUM FULL or CLUSTER does make sense.
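
For example (a sketch; "t" and its primary-key index "t_pkey" are
assumed names):

    -- routine maintenance, runs concurrently with other transactions:
    VACUUM ANALYZE t;

    -- after deleting a large fraction of the rows (takes an exclusive
    -- lock while it compacts the file):
    VACUUM FULL t;

    -- or rewrite the table in index order (7.4 syntax):
    CLUSTER t_pkey ON t;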

> Yes, that what I had in mind. But I assumed that performance
> would be reclaimed (as if VACUUM FULL was run) because the
> statistics after analyzing are accurate as to data distribution,
> only disk space would not be reclaimed (but we don't care, at
> least for the moment).

Performance is not reclaimed for anything involving a sequential scan,
as it still has to read the whole table, including the dead space.

It is partially reclaimed for index scans on UPDATEd rows, as the old
versions are removed, so the indexes have fewer versions to check for
visibility in the current transaction.

> I actually assumed that VACUUM ANALYZE would order rows
> sequentially on disk (mainly because it was taking quite some
> time and a lot of disk output activity), but obviously this was
> wrong.

It only compacts rows inside each page, but never moves them across
pages.

> "max_fsm_pages = 675000" means we also need to enlarge shared
> buffers, or the shared buffers available space for data caching
> would be reduced, right?

AFAIK, the FSM is not part of the shared buffers memory, but both
count toward the kernel's shared memory limit, which you may have to
increase.
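
As a sketch (illustrative values, not recommendations), the relevant
7.4 knobs in postgresql.conf are:

    # postgresql.conf -- illustrative values only
    max_fsm_pages = 675000      # ~6 bytes per slot, roughly 4 MB
    max_fsm_relations = 1000
    shared_buffers = 10000      # counted in 8 kB pages, i.e. ~80 MB

If the postmaster then refuses to start, raise the kernel limit (on
Linux, sysctl kernel.shmmax).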

> I guess the bottom line is that I don't understand what the Free
> Space Map behaviour really is. Is it a map containing location of
> free disk pages, free meaning that they correspond to pages
> removed with DELETE but not yet released to the OS with VACUUM
> FULL, which are used for INSERT in favor of enlarging the size of
> data used on disk?

Mostly, yes. VACUUM scans the whole table, that's why it has so much
disk IO. On every page, it first deletes obsolete rows (by checking
their transaction IDs), and compacts the rest. It then appends the page
to the free space map, if it contains free space and the FSM has a
free slot left. As it does not move valid rows between pages, it can run
concurrently with "real" transactions and does not need a table lock.

INSERT uses the FSM before enlarging the file; UPDATE first looks for
free space on the same page where the old row is, then in the FSM, and
only then enlarges the file.
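
A cheap way to watch the resulting bloat (again assuming a table "t";
reltuples is only an estimate, refreshed by ANALYZE):

    -- compare on-disk size with the live-row estimate
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 't';

If relpages keeps growing while reltuples stays flat, the FSM is
probably too small.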

> If that's correct, am I right in assuming that
> we don't care about the Free Space Map size if we perform a
> VACUUM FULL right after large bunches of DELETE?

I don't know exactly, but as far as I remember, VACUUM FULL uses the
FSM itself, as it needs free target pages to move the rows to.

So an insufficient FSM may lead to the need for several VACUUM FULL
runs before the table is cleaned up, or might even make it fail
completely.

Tom & co, please correct me if that statement above is imprecise.

> We actually are obliged to use 7.4.5 :/

I URGE you to update at least to 7.4.13 (which can be done in place,
without dump/restore). For a list of the urgent bug fixes, see
http://www.postgresql.org/docs/7.4/static/release.html#RELEASE-7-4-13
which also contains hints for a smooth upgrade.
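
After installing the new binaries and restarting, a quick sanity check:

    -- should now report PostgreSQL 7.4.13
    SELECT version();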

> Am I correct in assuming that regularly running REINDEX would cut
> this bloat? (daily)

Yes, a regular REINDEX will cut index bloat (but not table bloat).

If you have a maintenance window every night, but very high traffic
during the daytime, it might make sense to have a cron script issue a
bunch of VACUUM FULL / REINDEX / CLUSTER commands every night.
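
A sketch of such a nightly batch (the table name "orders" is made up;
adjust to your schema):

    -- nightly_maintenance.sql -- illustrative sketch
    VACUUM FULL orders;    -- reclaim table bloat (takes an exclusive lock)
    REINDEX TABLE orders;  -- cut index bloat
    ANALYZE orders;        -- refresh planner statistics

Run it from cron with something like
"psql -f nightly_maintenance.sql yourdb".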

Btw, CLUSTERing a table includes the effects of VACUUM FULL and REINDEX,
but not ANALYZE.

> (also, is there any way to REINDEX all index of all tables
> easily? as when we do just "VACUUM ANALYZE" for the whole
> database)

For 7.4, you'll need a script to do that (current versions have improved
in this area). You might recycle the idea from the pgsql-sql list some
days ago:

http://archives.postgresql.org/pgsql-sql/2006-08/msg00184.php

Simply query the system catalogs for the list of all schema.table
names, and generate the corresponding VACUUM FULL / REINDEX commands,
as in the sketch below.
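
A minimal generator query (skipping the system schemas):

    -- emit one REINDEX statement per user table
    SELECT 'REINDEX TABLE ' || schemaname || '.' || tablename || ';'
    FROM pg_tables
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema');

Run it with psql -At and pipe the output back into a second psql to
execute the generated commands.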

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
