Re: **SPAM** Faster count(*)?

From: Keith Worthington <KeithW(at)NarrowPathInc(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, dracula007(at)atlas(dot)cz
Subject: Re: **SPAM** Faster count(*)?
Date: 2005-08-12 04:41:31
Message-ID: 42FC287B.7030609@NarrowPathInc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Tom Lane wrote:
> dracula007(at)atlas(dot)cz writes:
>
>>I believe running count(*) means fulltable scan, and there's no way
>>to do it without it. But what about some "intermediate" table, with
>>the necessary counts?
>
>
> There's a fairly complete discussion in the PG list archives of a
> reasonably-efficient scheme for maintaining such counts via triggers.
> It wasn't efficient enough that we were willing to impose the overhead
> on every application ... but if you really NEED a fast count(*) you
> could implement it. I'd like to see someone actually do it and put
> up working code on pgfoundry; AFAIK it's only a paper design so far.

I was kicking this around and came up with the following. I have hit a
couple of snags.

In the function I attempt to count the number of rows in a table being
checked for the first time. I wanted to use 'FROM TG_RELNAME' but as
you can see I had to hard code my test table and comment out the trigger
parameter. FROM tbl_demo--TG_RELNAME Can someone tell me why that won't
work?

Also the function doesn't seem to be getting ROW_COUNT properly. The
end result is that for this test the table is properly inserted into the
monitoring table after it's forth insert but it is never updated after
that. Can someone help me see the forest through the trees?

-- Clean up the environment.
DROP TABLE tbl_row_count;
DROP TABLE tbl_demo;
DROP FUNCTION tf_update_row_count();

-- Build the table for holding the row counts.
CREATE TABLE tbl_row_count
(
relid oid NOT NULL,
row_count int8 NOT NULL DEFAULT 0
)
WITHOUT OIDS;
ALTER TABLE tbl_row_count OWNER TO postgres;
COMMENT ON COLUMN tbl_row_count.relid IS 'Contains relation id number.';
COMMENT ON COLUMN tbl_row_count.row_count IS 'Contains the number of
rows in a relation.';

-- Build a table to test the trigger on.
CREATE TABLE tbl_demo
(
first_name varchar(30) NOT NULL
)
WITHOUT OIDS;
ALTER TABLE tbl_demo OWNER TO postgres;
COMMENT ON TABLE tbl_demo IS 'Table used for demonstrating a trigger.';

-- Create the trigger function to maintain the row counts.
CREATE OR REPLACE FUNCTION public.tf_update_row_count()
RETURNS "trigger" AS
$BODY$
DECLARE
v_row_count int8;
BEGIN
-- Store the row count before it disappears.
GET DIAGNOSTICS v_row_count = ROW_COUNT;
-- Check if this is a new table.
PERFORM relid
FROM public.tbl_row_count
WHERE relid = TG_RELID;
IF FOUND THEN
-- Data for this table is already in the row count table.
IF TG_OP = 'INSERT' THEN
UPDATE public.tbl_row_count
SET row_count = row_count + v_row_count
WHERE relid = TG_RELID;
ELSIF TG_OP = 'DELETE' THEN
UPDATE public.tbl_row_count
SET row_count = row_count - v_row_count
WHERE relid = TG_RELID;
END IF;
ELSE
-- This is a new table so it needs to be counted.
SELECT count(*)
FROM tbl_demo--TG_RELNAME
INTO v_row_count;
INSERT INTO public.tbl_row_count ( relid, row_count )
VALUES ( TG_RELID,
v_row_count
);
END IF;
RETURN NULL;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION public.tf_update_row_count() OWNER TO postgres;
GRANT EXECUTE ON FUNCTION public.tf_update_row_count() TO postgres;
GRANT EXECUTE ON FUNCTION public.tf_update_row_count() TO public;

-- Insert some initial data into the demo table.
INSERT INTO public.tbl_demo
( first_name )
VALUES ( 'Keith' );
INSERT INTO public.tbl_demo
( first_name )
VALUES ( 'Ed' );
INSERT INTO public.tbl_demo
( first_name )
VALUES ( 'Kryss' );

-- Create the trigger on the demo table.
CREATE TRIGGER tgr_update_row_count
AFTER INSERT OR DELETE
ON public.tbl_demo
FOR EACH STATEMENT
EXECUTE PROCEDURE public.tf_update_row_count();

-- Examine the starting state of the tables.
SELECT *
FROM public.tbl_demo;
SELECT *
FROM public.tbl_row_count;
SELECT relid,
relname,
row_count
FROM public.tbl_row_count
LEFT JOIN pg_class
ON ( tbl_row_count.relid = pg_class.oid );

-- Insert a row.
INSERT INTO public.tbl_demo
( first_name )
VALUES ( 'Jarus' );

-- Examine the new state of the tables.
SELECT *
FROM public.tbl_demo;
SELECT relid,
relname,
row_count
FROM public.tbl_row_count
LEFT JOIN pg_class
ON ( tbl_row_count.relid = pg_class.oid );

-- Insert two more rows.
INSERT INTO public.tbl_demo
( first_name )
VALUES ( 'Dani' );
INSERT INTO public.tbl_demo
( first_name )
VALUES ( 'Mary' );

-- Examine the final state of the tables.
SELECT *
FROM public.tbl_demo;
SELECT relid,
relname,
row_count
FROM public.tbl_row_count
LEFT JOIN pg_class
ON ( tbl_row_count.relid = pg_class.oid );

--
Kind Regards,
Keith

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Jack Tiger 2005-08-12 07:24:24 unsubscribe
Previous Message Louise Catherine 2005-08-12 04:17:18 Re: about subselect