Re: *sigh*

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: *sigh*
Date: 2003-12-03 05:55:16
Message-ID: m31xrmwim3.fsf@wolfe.cbbrowne.com
Lists: pgsql-hackers

Martha Stewart called it a Good Thing when Randolf Richardson <rr(at)8x(dot)ca> wrote:
>>> The count(*) information can be versioned too, am I wrong? I'm able
>>> to create a trigger that stores the count(*) information in a special
>>> table, so why not implement the same thing as a built-in?
>>
>> Then every insert or delete would have to lock that count. Nobody else
>> would be able to insert or delete any records until you either commit or
>> roll back.
>>
>> That would lead to much lower concurrency, much more contention for
>> locks, and tons of deadlocks.
>
> What about queueing all these updates for a separate
> low-priority thread? The thread would be the only one with access
> to update this field.

If updates are "queued," then how do you get to use them if the
"update thread" isn't running because it's not high enough in
priority?

I am not being facetious.

The one approach that can be expected to work is a trigger that, upon
seeing an insert of 5 rows into table "ABC", puts into table
"count_detail" something like:

insert into count_detail (table_name, value) values ('ABC', 5);
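
On a reasonably modern PostgreSQL, such a trigger could be sketched in
PL/pgSQL as follows. The count_detail layout and the function and
trigger names are my assumptions, and it records one +1/-1 delta row
per affected row rather than one row per statement (the sums come out
the same):

create table count_detail (
    table_name text not null,
    value      bigint not null
);

create function track_count() returns trigger as $$
begin
    if tg_op = 'INSERT' then
        -- one +1 delta row per inserted row
        insert into count_detail (table_name, value)
        values (tg_table_name, 1);
    elsif tg_op = 'DELETE' then
        -- one -1 delta row per deleted row
        insert into count_detail (table_name, value)
        values (tg_table_name, -1);
    end if;
    return null;  -- the return value of an AFTER row trigger is ignored
end;
$$ language plpgsql;

create trigger abc_count
after insert or delete on abc
for each row execute procedure track_count();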

You then replace

    select count(*) from abc;

with

    select sum(value) from count_detail where table_name = 'ABC';
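
If callers should keep issuing one simple query, the rewritten sum
could be hidden behind a view (the view name here is hypothetical):

    create view abc_rowcount as
        select sum(value) as rowcount
          from count_detail
         where table_name = 'ABC';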

The "low priority" thread would be a process that does something akin
to vacuuming, where it would replace the contents of the table every
so often...

for curr_table in select distinct table_name from count_detail loop
    new_total := (select sum(value) from count_detail
                   where table_name = curr_table);
    delete from count_detail where table_name = curr_table;
    insert into count_detail (table_name, value)
        values (curr_table, new_total);
end loop;

The point of this is to keep count_detail down to one row per table.
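
On a recent PostgreSQL, that maintenance pass could be written as a
PL/pgSQL function, sketched below. The function name is mine, and the
single delete-and-resummarize statement is a deliberate change from the
loose pseudocode above, so that delta rows committed between a separate
select and delete cannot be lost:

create function compact_count_detail() returns void as $$
declare
    curr_table text;
begin
    for curr_table in select distinct table_name from count_detail loop
        -- collapse this table's delta rows into a single summary row
        with gone as (
            delete from count_detail
             where table_name = curr_table
         returning value
        )
        insert into count_detail (table_name, value)
        select curr_table, coalesce(sum(value), 0) from gone;
    end loop;
end;
$$ language plpgsql;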

Note that this gets _real_ expensive for tables that see lots of
single-row inserts and deletes. There isn't a cheaper way that will
actually account for the true number of records that have been
committed.

For a small table, it will be cheaper to walk through and calculate
count(*) directly from the tuples themselves.

The situation where it may be worthwhile to do this is a table which
is rather large (thus count(*) is expensive) where there is some
special reason to truly care how many rows there are in the table.
For _most_ tables, it seems unlikely that this will be true. For
_most_ tables, it is absolutely not worth the cost of tracking the
information.
--
(format nil "~S(at)~S" "cbbrowne" "acm.org")
http://cbbrowne.com/info/spreadsheets.html
Predestination was doomed from the start.
