Skip site navigation (1) Skip section navigation (2)

Re: [SQL] Yet Another (Simple) Case of Index not used

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Kevin Brown <kevin(at)sysexperts(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: [SQL] Yet Another (Simple) Case of Index not used
Date: 2003-04-21 16:14:43
Message-ID: 200304210914.43552.josh@agliodbs.com (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-performancepgsql-sql
Kevin,

> > Josh already mentioned this as a problem with user-trigger-based
> > counting.
>
> Right, but the trigger based mechanism probably magnifies the issue by
> orders of magnitude, and thus can't necessarily be used as an argument
> against an internally-implemented method.

I'm not sure about that, Kevin.   The production trigger test was written in C 
(by Joe Conway), using some of the best memory/efficiency management he could 
devise.   I could buy that the trigger mechanism adds a certain fixed 
overhead to the process, but not the contention that we were seeing ... 
especially not the geometric progression of inefficiency as the transaction 
count went up.   We'll talk about this offlist; I may be able to get the 
client to authorize letting you examine the database.

For further detail, our setup was sort of a "destruction test"; including:
1) a slightly underpowered server running too many processes;
2) a very high disk contention environment, with multiple applications 
fighting for I/O.
3) running COUNT(*), GROUP BY x on a table with 1.4 million rows, which was 
being updated in batches of 10,000 rows to 40,000 rows every few minutes.

As I said before, the overhead for c-trigger based accounting, within the MVCC 
framework, was quite tolerable with small update batches, only 9-11% penalty 
to the updates overall for batches of 100-300 updates.   However, as we 
increased the application activity, the update penalty increased, up to 
40-45% with the full production load.

It's not hard to figure out why; like most user's servers, the aggregate 
caching table was on the same disk as the table(s) being updated.    The 
resut was a huge amount of disk-head-skipping between the updated table and 
the aggregate caching table every time a commit hit the database, with random 
seek times increasing the longer the time since the last VACUUM.

Now, on a better server with these tables on fast RAID or on different 
spindles, I expect the result would be somewhat better.   However, I also 
suspect that many of the users who complain the loudest about slow count(*) 
are operating in single-spindle environments.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


In response to

pgsql-performance by date

Next:From: Jim C. NasbyDate: 2003-04-22 08:23:39
Subject: Re: [SQL] Yet Another (Simple) Case of Index not used
Previous:From: Tom LaneDate: 2003-04-21 01:53:20
Subject: Re: [SQL] Yet Another (Simple) Case of Index not used

pgsql-sql by date

Next:From: Mario Alberto Soto CordonesDate: 2003-04-21 20:48:59
Subject: functions problems
Previous:From: Lorraine DeweyDate: 2003-04-21 15:14:04
Subject: Optomizing left outer joins

pgsql-general by date

Next:From: Lamar OwenDate: 2003-04-21 16:18:57
Subject: Re: Using TEXT columns for binary content
Previous:From: Murthy KambhampatyDate: 2003-04-21 16:05:12
Subject: Re: stddev returns 0 when there is one row

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group