row reuse while UPDATE and vacuum analyze problem

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, vadim(at)krs(dot)ru
Subject: row reuse while UPDATE and vacuum analyze problem
Date: 1999-07-28 10:39:21
Message-ID: Pine.GSO.3.96.SK.990728141058.27569K-100000@ra
Lists: pgsql-hackers

Hi,

While testing DBI logging to Postgres I ran into a serious problem
with performance degradation during updates.
In my case I get 15-16 req/sec for the first 1000 updates, which drops
to 1-2 req/sec after 20000 updates. This is quite unusable even for a
medium Web site. As Tom Lane noticed, an update is just an insert, so if
a table has only one row which is updated by several processes, the table
will grow until 'vacuum analyze'. Indeed 'vacuum analyze' helps a lot,
but the index file isn't affected; it remains big!
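
For reference, the sizes can be checked from SQL as well as with ls;
a minimal sketch, assuming the stock 8k block size (so relpages * 8192
gives bytes):

-- relpages (in disk blocks) is refreshed by 'vacuum analyze'
select relname, relpages from pg_class
where relname in ('hits', 'hits_pkey');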

After 20190 updates and several runs of 'vacuum analyze':

-rw------- 1 postgres users 1810432 Jul 28 14:22 hits
-rw------- 1 postgres users 1368064 Jul 28 14:22 hits_pkey
om:/usr/local/pgsql/data/base/discovery$ psql discovery -c 'select count(*) from hits'
count
-----
10000
(1 row)

om:/usr/local/pgsql/data/base/discovery$ psql discovery -c 'select sum(count) from hits'
sum
-----
20190
(1 row)

I inserted 10,000 rows into the table hits just to test how the number of
rows affects performance while only 2 rows are updated. I didn't notice
any difference.

After 'vacuum analyze':
om:/usr/local/pgsql/data/base/discovery$ l hits*
-rw------- 1 postgres users 606208 Jul 28 14:27 hits
-rw------- 1 postgres users 1368064 Jul 28 14:27 hits_pkey
om:/usr/local/pgsql/data/base/discovery$

The index file isn't shrunk: the modification date changed, but the
size remains big.

Update performance could be increased if:
1. 'vacuum analyze' would also shrink the index file
2. rows were reused instead of inserted (a workaround sketch for the
index follows below)
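
Until then, the only way I see to get the index size back down is to
drop and recreate it by hand; a minimal sketch, assuming the handler
can be paused while the index is rebuilt (drop/create takes a lock):

-- shrink the primary key index by recreating it after vacuum
vacuum analyze hits;
drop index hits_pkey;
create unique index hits_pkey on hits (msg_id);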

I found in the TODO only:

* Allow row re-use without vacuum(Vadim)

My site isn't in production yet, so I'd like to know whether there is some
chance the update problem will be solved. I think this is a rather general
problem, and many Web developers would appreciate a solution, just as Jan's
feature patch for LIMIT inspired many people to use Postgres in real
applications, as did the great new MVCC feature.

Regards,

Oleg

PS.

For those who are interested, here is my handler for logging accumulated
hits into Postgres:

In httpd.conf:

PerlModule Apache::HitsDBI0
<Location /db/pubs.html>
    PerlCleanupHandler Apache::HitsDBI0
</Location>

Table schema:
create table hits (
    msg_id       int4 not null primary key,
    count        int4 not null,
    first_access datetime default now(),
    last_access  datetime
);
-- grant information

GRANT SELECT ON hits to PUBLIC;
GRANT INSERT,UPDATE ON hits to httpd;

package Apache::HitsDBI0;

use strict;

# preloaded in startup.pl
use Apache::Constants qw(:common);
#use DBI ();

sub handler {
    my $orig = shift;    # the Apache request object
    # grab the message id from the query string, e.g. ?msg_id=42
    if ( $orig->args() =~ /msg_id=(\d+)/ ) {
        my $msg_id = $1;    # save the capture before it can be clobbered
        my $dbh = DBI->connect("dbi:Pg:dbname=discovery") || die DBI->errstr;
        $dbh->{AutoCommit} = 0;
        # serialize concurrent update/insert attempts on the same msg_id
        $dbh->do("LOCK TABLE hits IN SHARE ROW EXCLUSIVE MODE")
            || die $dbh->errstr;
        my $rows_affected = $dbh->do("UPDATE hits SET count = count + 1, last_access = now() WHERE msg_id = $msg_id")
            || die $dbh->errstr;
        ## postgres specific !!!
        # DBI returns '0E0' (zero but true) when no row matched,
        # so this is the first hit for this msg_id: insert it instead
        $dbh->do("INSERT INTO hits (msg_id, count) VALUES ($msg_id, 1)")
            if ($rows_affected eq '0E0');
        $dbh->commit || die $dbh->errstr;
    }
    return OK;
}

1;
__END__
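
Until row reuse is implemented I keep the table small by vacuuming from
cron; a minimal sketch, assuming psql is in cron's PATH and local access
needs no password:

# crontab entry: vacuum the hits table every 10 minutes
0,10,20,30,40,50 * * * * psql discovery -c 'vacuum analyze hits'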

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
