Re: Recursive Arrays 101

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: David Blomstrom <david(dot)blomstrom(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Recursive Arrays 101
Date: 2015-10-25 21:07:55
Message-ID: 99B37359-1539-4C0C-8EB3-38AE84B2585A@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 25 Oct 2015, at 19:38, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> On 10/25/2015 11:12 AM, David Blomstrom wrote:
>> I'm sorry, I don't know exactly what you mean by "definitions." The
>> fields Taxon and Parent are both varchar, with a 50-character limit.
>> ParentID is int(1).
>
> By definition I meant the schema, so from the below:
>
> CREATE TABLE t (
> N INT(6) default None auto_increment,
> Taxon varchar(50) default NULL,
> Parent varchar(25) default NULL,
> NameCommon varchar(50) default NULL,
> Rank smallint(2) default 0
> PRIMARY KEY (N)
> ) ENGINE=MyISAM

That can indeed be solved using a hierarchical query (provided you have a suitable table in PG); something akin to:

WITH RECURSIVE taxons AS (
-- Hierarchical root nodes
SELECT N AS id, Taxon, Rank, 1 AS level, '' || N AS Path -- A useful addition explained further down
FROM t
WHERE ParentID IS NULL

-- Child nodes
UNION ALL
SELECT N AS id, Taxon, Rank, taxons.level +1 AS level, taxons.Path || ':' || N AS Path
FROM taxons
JOIN t ON taxons.id = t.ParentID
)
SELECT id, Taxon, Rank, level
FROM taxons
ORDER BY Path
;

The Path-bit looks complicated, but basically that just appends ID's within the same hierarchy such that, when sorted on that field, you get the hierarchy in their hierarchical order. What the hierarchy would look like if it were shown as a file hierarchy with sub-directories expanded, for example. That's pretty much the only viable alternative (alternatives vary on the column used to create the hierarchy), which is why I added it to the example.

The fun thing with hierarchical queries is that you can add all kinds of extra information and make it trickle down to the child nodes, such as the items that make up the root of the hierarchy (pretty useful for grouping), for example or a field that calculates a string to prepend for indentation, etc. Or a computation that depends on values in parent items (I used this successfully in a bill of materials to calculate absolute quantities by volume, quantities by weight and cost of components in the end product where they were given relative to 1 kg of their parent, for example).

It's highly flexible and powerful (and standard SQL), but it takes a bit of time to get in the right mindset.

PS. I usually write my hierarchical queries in Oracle, which isn't quite as good at them as Postgres is, but it's what we have @work. Hence, I'm not sure I got the syntax 100% correct. We're working on getting PG in for a project upgrade (replacing RDB on OpenVMS, which will go EOL in <10 years!) - fingers crossed.

Cheers!

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rikard Pavelic 2015-10-25 21:31:03 function null composite behavior
Previous Message Adrian Klaver 2015-10-25 18:38:09 Re: Recursive Arrays 101