DELETE, INSERT vs SELECT, UPDATE || INSERT

From: Anton Maksimenkov <engineer(at)hlebprom(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: DELETE, INSERT vs SELECT, UPDATE || INSERT
Date: 2005-12-23 09:02:05
Message-ID: 287988431.20051223140205@hlebprom.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi, all.

While working on algorithm of my project I came to question. Let it
be table like this (user+cookie pair is the primary key).

INT user
INT cookie
INT count

Periodically (with period 10 minutes) this PostgreSQL table
updated with my information.
The main problem that some of pairs (user, cookie) may be already
exists in PostgreSQL and must be updated, some not exists and must be
inserted.

My first way was to DELETE row with (user, cookie) pair which I'm
going to update then INSERT new. This guarantees that there will not
be an error when (user, cookie) pair already exists in table. And
currently it works by this way.
But I think that it lead to highly fragmentation of table and it need
to be VACUUMED and ANALYZED far more frequently...

Second idea was to try to SELECT (user, cookie) pair and then UPDATE
it if it exists or INSERT if not. I has thought that if UPDATE will
rewrite same place in file with new count it may lead to more compact
table (file not grow and information about actual rows in file will
not changed). And, if actual file blocks containing (user, cookie)
pair will not be moved to new place in file, table need to be ANALYZED
less frequently.
But if UPDATE will actually insert new row in file, marking as 'free
to use' previous block in file which was contain previous version of
row, then again, table need to be VACUUMED and ANALYZED far more
frequently...
And this second idea will be completely waste of time and code.
Because write on C code which "DELETE and INSERT" is more portably
than "SELECT than UPDATE if there are rows, or INSERT if there are
not".

So, can anyone explain me is the actual mechanism of UPDATE can save
resources and tables from been highly fragmented? Or it gives same
results and problems and "DELETE then INSERT" is the best way?
--
engineer

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Carlos Benkendorf 2005-12-23 12:34:39 Order by behaviour
Previous Message Qingqing Zhou 2005-12-23 04:29:42 Re: CPU and RAM