Re: Derived columns / denormalization

From: "Jamie Tufnell" <diesql(at)googlemail(dot)com>
To: pgsql-sql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Derived columns / denormalization
Date: 2009-01-19 00:33:18
Message-ID: b0a4f3350901181633x12714f15te5ea65adee5f7617@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 1/17/09, Erik Jones <ejones(at)engineyard(dot)com> wrote:
> On Jan 15, 2009, at 8:06 PM, Tom Lane wrote:
>> "Jamie Tufnell" <diesql(at)googlemail(dot)com> writes:
>>> item_count int -- this is derived from (select count(*) from items
>>> where group_id = id)
>>> ...
>>
>>> item_count would be updated by insert/update/delete triggers on the
>>> items table, hopefully that would ensure it is always correct?
>>
>> Concurrent updates to the items table make this much harder than
>> it might first appear. If you're willing to serialize all your
>> updating
>> transactions then you can make it work, but ...
>
> That was exactly the caveat I was about to point out. That being
> said, keeping COUNT() values and other computed statistics based on
> other data in the database *is* a fairly common "tactic". On method
> that I've used to great success to avoid the serialization problem is
> to have your triggers actually insert the necessary information for
> the update into a separate "update queue" table. You then have
> separate process that routinely sweeps that update queue, aggregates
> the updates and then updates your count values in the groups table
> with the total update values for each groups entry with updates.

Fortunately our items table rarely sees concurrent writes. It's over
99% reads and is typically updated by just one user. We are already
caching these aggregates and other data in a separate layer and my
goal is to see if I can get rid of that layer.

In light of your advice though, I will think things through a bit more first.

Thanks for your help!
Jamie

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Bryce Nesbitt 2009-01-20 19:27:15 Re: Way to eliminate pg_dump activity from pg_stat_all ?
Previous Message Erik Jones 2009-01-16 16:42:11 Re: Derived columns / denormalization