Re: optimization ideas for frequent, large(ish) updates in frequently accessed DB?

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: "Marinos J(dot) Yannikos" <mjy(at)geizhals(dot)at>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: optimization ideas for frequent, large(ish) updates in frequently accessed DB?
Date: 2004-02-13 15:21:29
Message-ID: 2aop205msqcfhpgr1su0fvkm99nolq7t77@email.aon.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Marinos, while you are busy answering my first set of questions :-),
here is an idea that might help even out resource consumption.

On Fri, 13 Feb 2004 01:58:34 +0100, "Marinos J. Yannikos"
<mjy(at)geizhals(dot)at> wrote:
>begin transaction;
> delete from t where id=5;
> insert into t (id,...) values (5,...);
> ... [1-200k rows]
>end;
>
>The problem is, that a large update of this kind can delay SELECT
>queries running in parallel for several seconds, so the web interface
>used by several people will be unusable for a short while.

CREATE TABLE idmap (
internalid int NOT NULL PRIMARY KEY,
visibleid int NOT NULL,
active bool NOT NULL
);
CREATE INDEX ipmap_visible ON idmap(visibleid);

Populate this table with
INSERT INTO idmap
SELECT id, id, true
FROM t;

Change
SELECT ...
FROM t
WHERE t.id = 5;

to
SELECT ...
FROM t INNER JOIN idmap ON (idmap.internalid = t.id AND
idmap.active)
WHERE idmap.visibleid = 5;

When you have to replace the rows in t for id=5, start by

INSERT INTO idmap VALUES (12345, 5, false);

Then repeatedly
INSERT INTO t (id, ...) VALUES (12345, ...);
at a rate as slow as you can accept. You don't have to wrap all INSERTs
into a single transaction, but batching together a few hundred to a few
thousand INSERTs will improve performance.

When all the new values are in the database, you switch to the new id in
one short transaction:
BEGIN;
UPDATE idmap SET active = false WHERE visibleid = 5 AND active;
UPDATE idmap SET active = true WHERE internalid = 12345;
COMMIT;

Do the cleanup in off-peak hours (pseudocode):

FOR delid IN (SELECT internalid FROM idmap WHERE NOT active)
BEGIN
DELETE FROM t WHERE id = delid;
DELETE FROM idmap WHERE internalid = delid;
END;
VACUUM ANALYSE t;
VACUUM ANALYSE idmap;

To prevent this cleanup from interfering with INSERTs in progress, you
might want to add a "beinginserted" flag to idmap.

HTH.
Servus
Manfred

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Manfred Koizar 2004-02-13 16:47:07 Re: optimization ideas for frequent, large(ish) updates in frequently accessed DB?
Previous Message Andrew Sullivan 2004-02-13 12:38:57 Re: Disappointing performance in db migrated from MS SQL Server