UPDATE with correlated aggregates

From: Joe Abbate <jma(at)freedomcircle(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: UPDATE with correlated aggregates
Date: 2006-01-27 03:17:05
Message-ID: 43D990B1.6090608@freedomcircle.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi,

I have a database with one "main" table with three columns that
self-reference the primary key, plus four other tables that have (in
total) seven foreign key columns referencing "main". I want to create a
table (or view) that summarizes the various reference counts, per row in
"main". I have an approach that works, for the much simpler situation
described below, but wanted feedback on possible improvements or
simplifications before implementing it on the actual tables.

Here are the simplified "main" table and one of the ancillary tables:

dev=> \d main
Table "public.main"
Column | Type | Modifiers
--------+---------+-----------
id | integer | not null
parent | integer |
data | text |
Indexes:
"main_pkey" PRIMARY KEY, btree (id)

dev=> \d items
Table "public.items"
Column | Type | Modifiers
--------+---------+-----------
id | integer | not null
main | integer |
data | text |
Indexes:
"items_pkey" PRIMARY KEY, btree (id)

dev=> select * from main;
id | parent | data
----+--------+--------
1 | | ABC
2 | 1 | ABCDEF
3 | 1 | ABCGHI
4 | | PQR
5 | 4 | PQRSTU
6 | 4 | PQRUVW
7 | 4 | PQRXYZ
(7 rows)

dev=> select * from items;
id | main | data
----+------+------
1 | 2 | asdf
2 | 2 | jkl;
3 | 2 | qwer
4 | 3 | uiop
5 | 3 | m,./
6 | 4 | zxcv
7 | 4 | rtyu
8 | 4 | fghj
9 | 4 | vbnm
10 | 6 | asl;
11 | 7 | qwop
12 | 7 | zx./
(12 rows)

This is the summary table:

dev=> \d summ
Table "public.summ"
Column | Type | Modifiers
--------+---------+-----------
id | integer | not null
subs | bigint |
items | bigint |
Indexes:
"summ_pkey" PRIMARY KEY, btree (id)

I first populate "summ" with 'select id from main'. Since an UPDATE
apparently cannot include aggregates, i.e., subs = count(*), I chose to
create two views as follows:

CREATE VIEW main_summ AS
SELECT parent AS id, count(*) AS subs
FROM main
WHERE parent IS NOT NULL
GROUP BY parent;
CREATE VIEW items_summ AS
SELECT items.main AS id, count(*) AS items
FROM items
GROUP BY items.main;

Then I update "summ" as follows:

update summ set subs = ms.subs
from main_summ ms where ms.id = summ.id;
update summ set items = its.items
from items_summ its where its.id = summ.id;

The end result is:

dev=> select * from summ order by id;
id | subs | items
----+------+-------
1 | 2 |
2 | | 3
3 | | 2
4 | 3 | 4
5 | |
6 | | 1
7 | | 2
(7 rows)

In the real database, I'd probably have to create 10 views and do 10
updates so any simpler approach would be appreciated. I played with
creating an intermediate table that was populated from UNION SELECTs of
the various GROUP BYs, each with an additional column that coded the
type of relationship, but couldn't figure out how to create the summary
since the coded column has to be used to add the count to either the
"subs" or "items" columns (or subcolumns by type).

Thanks for any suggestions and comments.

Joe

Browse pgsql-sql by date

  From Date Subject
Next Message andrew 2006-01-27 09:10:31 Re: filtering after join
Previous Message Kashmira Patel (kupatel) 2006-01-27 01:01:26 Question about check constraints