Re: strategies for dealing with frequently updated tables

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Dave Vitek <dvitek(at)grammatech(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: strategies for dealing with frequently updated tables
Date: 2012-02-28 03:29:14
Message-ID: 4F4C4A0A.4050505@squeakycode.net
Lists: pgsql-general

On 02/27/2012 06:55 PM, Dave Vitek wrote:
> Hi all,
>
> I have a relation where a tuple typically undergoes a lifecycle something like:
> 1) Created
> 2) Updated maybe thousands of times (no updates to indexed columns though)
> 3) Rarely or never modified again
>
> The following query takes about 100 minutes (3 seconds per tuple):
> SELECT count(id) from T
>
> (2) is causing a lot of auto-vacuum/analyze activity (which is OK). HOT seems to get used for about 90% of the updates, but there are enough updates that don't use it to cause issues. I'm using pg version 9.0.3 on a 32-bit Windows XP machine with 3 GB of RAM. The .conf file is using default settings.
>
> Table Info:
> n_live_tup 1799
> n_dead_tup 191
> pg_relation_size 2343 MB
> indexsize 10 MB
> toastsize 552 kB
> toastindexsize 16 kB
>
> This reports under 10 kB for most tuples:
> psql -A -c "select * from T where id=123" | wc -c
>
> auto-vacuum and auto-analyze both ran yesterday with default settings. There are only one or two new tuples since yesterday. The database is fairly old (was probably created using pg_restore about when 9.0.3 came out).
>
> Here is the output from VACUUM VERBOSE:
> INFO: vacuuming "public.T"
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.56 sec.
> VACUUM
>
>
> I imagine CLUSTERing the table would make things happier, but I'm hoping for a permanent solution that avoids periodic downtime.
>
> One thought was to partition so rows that are still changing live in a separate table from the more stable rows. I imagine the number of rows in stage (2) at any given time rarely exceeds 10. Can I still get into performance trouble with a table that small after enough updates? Anyone have other ideas?
>
> - Dave
>

I'm a little confused... what's the problem?

> The following query takes about 100 minutes (3 seconds per tuple):
> SELECT count(id) from T

so table T has about 2,000 rows? (100 min * 60 / 3 sec per tuple)

> Table Info:
> n_live_tup 1799

Oh, so the table has 1,800 rows?

> This reports under 10kb for most tuples:
> psql -A -c "select * from T where id=123" | wc -c

10 kB? What? So that's one row? And that one row is about 10,000 bytes?

So you have large rows, who knows how many of them, and the problem is that "SELECT count(id) from T" is slow?
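
Because here's the thing: your own numbers already hint at the answer. pg_relation_size says 2343 MB of heap for ~1,800 live rows, so the table is nearly all dead space, and count() has to seq-scan every one of those pages. Just a sketch (I'm assuming the table really is named T; adjust the quoting/case to match yours):

    -- pages and row estimate as of the last VACUUM/ANALYZE
    SELECT relpages,      -- heap pages, 8 kB each
           reltuples,     -- planner's live-row estimate
           pg_size_pretty(pg_relation_size('"T"')) AS heap_size
    FROM pg_class
    WHERE relname = 'T';

If relpages is up in the hundreds of thousands while reltuples is around 1,800, that's your 100 minutes right there.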

You say:

> 2) Updated maybe thousands of times (no updates to indexed columns though)

Are the updates slow? How slow, and how fast do you need them to be? Is ID a unique key? Or does "update ... where id=123" update more than one row?
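
The stats collector can show you how those updates are landing, and whether id is really unique. Again just a sketch, same table-name assumption as above:

    -- total updates vs. HOT updates since stats were last reset
    SELECT n_tup_upd,      -- all updates
           n_tup_hot_upd,  -- updates that stayed on the same page (HOT)
           n_dead_tup      -- dead tuples waiting for vacuum
    FROM pg_stat_user_tables
    WHERE relname = 'T';

    -- quick check that id is actually unique
    SELECT id, count(*) FROM "T" GROUP BY id HAVING count(*) > 1;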

What does the actual table look like? Looks like it has lots of indexes, but I'm just guessing! Are you sure you need all those indexes? Updating a row (at least one that can't go HOT) has to update every one of those indexes... I assume, but you give no details.
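
Here's a quick way to see what indexes you actually have, how big they are, and whether anything even uses them (sketch again):

    SELECT indexrelname,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
           idx_scan       -- number of scans that used this index
    FROM pg_stat_user_indexes
    WHERE relname = 'T';

Anything with idx_scan = 0 and a nontrivial size is a candidate for dropping.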

Also... in the end, you're on Windows, so you probably can't tell me if you are I/O bound, but you probably need a better I/O subsystem. And stop running on RAID 5 (which I'm assuming, since you don't mention anything).
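
One more knob worth knowing about: a lower fillfactor leaves free space on every heap page, which makes it much more likely an update can stay HOT instead of migrating to a new page, and you can make autovacuum more aggressive for just this one table. Sketch only; the numbers are guesses you'd want to tune:

    -- leave roughly half of each page free for HOT updates
    ALTER TABLE "T" SET (fillfactor = 50);

    -- vacuum after ~1% of the table changes instead of the default 20%
    ALTER TABLE "T" SET (autovacuum_vacuum_scale_factor = 0.01);

    -- fillfactor only affects newly written pages, so rewrite the table once
    -- (takes an exclusive lock, so schedule it)
    VACUUM FULL "T";

That one-time VACUUM FULL (or CLUSTER) is still the downtime you were hoping to avoid, but after it the fillfactor and autovacuum settings should keep the bloat from coming back.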

Wow. Re-reading that, I seem to be angry. Please don't take it that way. Blood sugar is high; it makes me an angry person. Others will be along to scold me. Bad Andy! Bad! Go to your room until you find your happy face.

-Andy

ps: based on you're email gramma, I almost changed every "your" to "you're", just to see if it annoyed you. Sounds like you work for grammar technology.
