Aggregating over nodes in hierarchical trees

From: "McGehee, Robert" <Robert(dot)McGehee(at)geodecapital(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Aggregating over nodes in hierarchical trees
Date: 2014-09-09 15:44:31
Message-ID: 97A7C8EACC6CE247AAE690059B45D178981DA98125@MSGRTPCCRN2WIN.dmn1.fmr.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,
I have data in a hierarchical tree structure in one table (tree_tbl) containing the number of students enrolled in a college course. The college course name is stored as a tree giving the college name, department, and course name, and perhaps other descendent nodes such as section name. I have another table (node_tbl) that contains the particular nodes I am interested in such as the name of the college, department or course. I'd like to join the two tables so I can aggregate the total students in all courses containing the node names in the node_tbl. Unfortunately, I've been unable to solve this problem and was hoping for help. I had tried using the ltree data type for this purpose, but am open to other suggestions if a better data type exists.

Here is an example of what I'm trying to do:

CREATE TABLE tree_tbl (course LTREE, students INTEGER);
INSERT INTO tree_tbl VALUES ('Arts.English.Shakespeare', 10);
INSERT INTO tree_tbl VALUES ('Arts.Music.Composition', 15);
INSERT INTO tree_tbl VALUES ('Arts.Music.Theory', 11);
INSERT INTO tree_tbl VALUES ('Science.Math.Algebra', 21);
INSERT INTO tree_tbl VALUES ('Science.Biology.IntroBio.SectionA', 20);
INSERT INTO tree_tbl VALUES ('Science.Biology.IntroBio.SectionB', 30);
INSERT INTO tree_tbl VALUES ('Science.Biology.Genetics', 3);

CREATE TABLE node_tbl (node TEXT);
INSERT INTO node_tbl VALUES ('Arts');
INSERT INTO node_tbl VALUES ('English');
INSERT INTO node_tbl VALUES ('Biology');

I would then like to join these two tables so that I can sum the total number of students in each area. Something like this:

SELECT n.node, sum(students) as students
FROM tree_tbl t, node_tbl n
WHERE t.course ~ '.*' || n.node || '.*'
GROUP BY n.node;

My hope that this query would return this:
node students
------- --------
Arts 36
English 10
Biology 53

Unfortunately, this query gives the error:
ERROR: syntax error at position 0
LINE 3: WHERE t.course ~ '.*' || n.node || '.*'
^

Any help with this problem would be greatly appreciated.

Thanks,
Robert McGehee

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2014-09-09 16:39:14 Re: Issue with to_timestamp function
Previous Message John R Pierce 2014-09-09 15:39:48 Re: stackbuilder