Skip site navigation (1) Skip section navigation (2)

Re: Querying multiple tables

From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Paul Malherbe <paul(at)tartan(dot)co(dot)za>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Querying multiple tables
Date: 2007-08-01 20:49:31
Message-ID: 63CC1079-5011-4EF2-B1F8-2626A5BAEE05@seespotcode.net (view raw or flat)
Thread:
Lists: pgsql-novice
On Aug 1, 2007, at 14:04 , Paul Malherbe wrote:

> Sorry about that, here is an example of what I am doing and what I  
> would like to achieve.

Thanks. Providing an example of the results you're seeking is very  
useful, as is providing example data.

 From your definitions (and the fact that you're posting to the - 
novice list), I suspect you're quite new to SQL. I'll point out some  
things I think you should do to improve your code.

> Create table rcaowm (rom_owner varchar(7), rom_name varchar(30));
> Create unique index rcaowm_key1 on rcaowm (rom_owner);

SQL syntax is case insensitive (except for identifiers if they're  
quoted). Capitalizing the first letter of the statement is  
unnecessary. It doesn't matter, but it looks a little odd.

Also, I think you should take some time to learn about database  
normalization constraints such as NOT NULL, PRIMARY KEY and FOREIGN  
KEY. These are really fundamental to using a relational database  
server. I would rewrite the above statements as:

CREATE TABLE rcaowm
(
     rom_owner TEXT PRIMARY KEY
     , rom_name TEXT NOT NULL
);

For the most part, PRIMARY KEY is shorthand for UNIQUE NOT NULL. I  
recommend setting at NOT NULL constraint on your columns by default  
as well, as it will prevent perhaps unexpected query results,  
especially if you're just starting out.

As for using TEXT instead of varchar(7), there's no performance  
advantage to specifying the length of the varchar column. Unless you  
have a compelling business reason to restrict the number of  
characters, I'd leave it as TEXT.

> Insert into rcaowm values ('OOOOOOO', 'Owners Name');

Be explicit when naming columns in an INSERT statement. This  
increases readability and prevents unexpected bugs caused by changes  
in schema definition:

INSERT INTO rcaowm (rom_owner, rom_name)  VALUES  ('OOOOOOO', 'Owners  
Name');

> Create table rcaowt (rot_owner varchar(7), rot_date int4, rot_tramt  
> decimal);

CREATE TABLE rcaowt (
     rot_owner TEXT NOT NULL
         REFERENCES rcaowm
     , rot_date DATE NOT NULL
     , rot_tramt DECIMAL NOT NULL
);

I assume that the value of the rot_owner column must appear in the  
rcaowm table in the rom_owner column. If this is so, set a foreign  
key constraint so the database server can enforce it. That's what the  
"REFERENCES rcaowm" clause does. If no column is listed in the  
REFERENCES clause, it references the primary key of the referenced  
table, in this case rom_owner. To be explicit, it could have been  
written REFERENCES rcaowm (rom_owner).

Also, use appropriate datatypes. rot_date is a date, so use a DATE  
column. This adds an additional constraint to be sure only valid date  
values can be submitted. Right now, there's nothing to prevent  
someone from doing something like:

INSERT INTO rcaowt (rot_owner, rot_date, rot_tramt) VALUES  
('OOOOOOO', '20074533', 5000);

Using a date column will prevent that. Also, there are a lot of  
functions that are available for handling dates which you might find  
useful.

> Create index rcaowt_key1 on rcaowt (rot_owner);

A plain index doesn't really add anything to the schema definition,  
though it can be used by the database server to look up information  
in the table.

> Create table rcatnm (rtm_owner varchar(7), rtm_tenant varchar(7),  
> rtm_name varchar(30));
> Create unique index rcatnm_key1 on rcatnm (rtm_owner, rtm_tenant);
> Insert into rcatnm values ('OOOOOOO', 'TTTTTTT', 'Tenants Name');

Same as above: use TEXT instead of VARCHAR(n), use NOT NULL, PRIMARY  
KEY and REFERENCES, list columns in INSERT statements.

> Create table rcatnt (rtt_owner varchar(7), rtt_tenant varchar(7),  
> rtt_date int4, rtt_tramt decimal);

This is a question I had in my first response: can rtt_owner be  
anything other than the rot_owner for rtt_tenant? If so, your table  
definition should probably be

CREATE TABLE rcatnt
(
     rtt_owner TEXT NOT NULL
         REFERENCES rcaowm
     , rtt_tenant TEXT NOT NULL
         REFERENCES rcaowt
     , rtt_date DATE NOT NULL
     , rtt_tramt DECIMAL NOT NULL
);

If the rtt_owner must be the same as rot_owner, just leave it out.  
You can always look up the owner by joining through the rtt_tenant  
table.

CREATE TABLE rcatnt
(
     rtt_tenant TEXT NOT NULL
         REFERENCES rcaowt
     , rtt_date DATE NOT NULL
     , rtt_tramt DECIMAL NOT NULL
);

In either case, it's not obvious to me what the primary key on this  
table should be. You need some way to uniquely identify each row in  
the table.

> Create index rcatnt_key1 on rcatnt (rtt_owner, rtt_tenant);

Again, this index doesn't do anything for your schema definition.

> Select rom_name as "Owner_Name", rtm_name as "Tenant_Name", sum 
> (rot_tramt) as "Sum_Owner_Value" , sum(rtt_tramt) as  
> "Sum_Tenant_Value" from rcaowm join rcaowt on (rom_owner =  
> rot_owner) join rcatnt on (rom_owner = rtt_owner) join (select  
> rtm_tenant, rtm_name from rcatnm) as tenants on (rtm_tenant =  
> rtt_tenant) group by rom_name, rtm_name;

Do you find this easy to read? Personally, I don't. It helps if you  
can format the query to make it easy for others (and yourself!) to read.

> This query returns:
>
> Owner_Name  | Tenant_Name  | Sum_Owner_Value | Sum_Tenant_Value
> -------------+--------------+-----------------+------------------
> Owners Name | Tenants Name |         -7202.0 |          12000.0
>
>
> I would like to return:
>
> Owner_Name  | Tenant_Name  | Sum_Owner_Value | Sum_Tenant_Value
> -------------+--------------+-----------------+------------------
> Owners Name | Tenants Name |         -3601.0 |           4000.0

This should get you most of the way there, as it gives the amounts  
you want. However, it's not clear to me that this is *really* the  
result you're looking for because of the ambiguity as to what owner  
you're looking for (see my questions about the rcatnt table).

SELECT rot_owner, sum_rot_tramt, rtt_tenant, sum_rtt_tramt
FROM rcatnm
JOIN (
     SELECT rot_owner,  sum(rot_tramt) as sum_rot_tramt
     FROM rcaowt
     GROUP BY rot_owner) o ON (rtm_owner = rot_owner)
JOIN (
     SELECT rtt_tenant, sum(rtt_tramt) as sum_rtt_tramt
     FROM rcatnt
     GROUP BY rtt_tenant ) t ON (rtm_tenant = rtt_tenant);
  rot_owner | sum_rot_tramt | rtt_tenant | sum_rtt_tramt
-----------+---------------+------------+---------------
OOOOOOO   |       -3601.0 | TTTTTTT    |        4000.0
(1 row)

Hope this gets you pointed in the right direction. And I still  
recommend naming your tables and columns with less cryptic names. I  
know I had to repeatedly look up the table and column names when  
working on this, and that's not time well spent. Letters are cheap!

Good luck!

Michael Glaesemann
grzm seespotcode net



In response to

Responses

pgsql-novice by date

Next:From: Paul MalherbeDate: 2007-08-02 03:30:25
Subject: Re: Querying multiple tables
Previous:From: Paul MalherbeDate: 2007-08-01 19:04:31
Subject: Re: Querying multiple tables

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group