Skip site navigation (1) Skip section navigation (2)

Re: Querying multiple tables

From: Paul Malherbe <paul(at)tartan(dot)co(dot)za>
To: Michael Glaesemann <grzm(at)seespotcode(dot)net>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Querying multiple tables
Date: 2007-08-02 03:30:25
Message-ID: 46B14FD1.6020209@tartan.co.za (view raw or flat)
Thread:
Lists: pgsql-novice
Michael Glaesemann wrote:
>
> On Aug 1, 2007, at 14:04 , Paul Malherbe wrote:
>
>> Sorry about that, here is an example of what I am doing and what I 
>> would like to achieve.
>
> Thanks. Providing an example of the results you're seeking is very 
> useful, as is providing example data.
>
> From your definitions (and the fact that you're posting to the -novice 
> list), I suspect you're quite new to SQL. I'll point out some things I 
> think you should do to improve your code.
>
>> Create table rcaowm (rom_owner varchar(7), rom_name varchar(30));
>> Create unique index rcaowm_key1 on rcaowm (rom_owner);
>
> SQL syntax is case insensitive (except for identifiers if they're 
> quoted). Capitalizing the first letter of the statement is 
> unnecessary. It doesn't matter, but it looks a little odd.
>
> Also, I think you should take some time to learn about database 
> normalization constraints such as NOT NULL, PRIMARY KEY and FOREIGN 
> KEY. These are really fundamental to using a relational database 
> server. I would rewrite the above statements as:
>
> CREATE TABLE rcaowm
> (
>     rom_owner TEXT PRIMARY KEY
>     , rom_name TEXT NOT NULL
> );
>
> For the most part, PRIMARY KEY is shorthand for UNIQUE NOT NULL. I 
> recommend setting at NOT NULL constraint on your columns by default as 
> well, as it will prevent perhaps unexpected query results, especially 
> if you're just starting out.
>
> As for using TEXT instead of varchar(7), there's no performance 
> advantage to specifying the length of the varchar column. Unless you 
> have a compelling business reason to restrict the number of 
> characters, I'd leave it as TEXT.
>
>> Insert into rcaowm values ('OOOOOOO', 'Owners Name');
>
> Be explicit when naming columns in an INSERT statement. This increases 
> readability and prevents unexpected bugs caused by changes in schema 
> definition:
>
> INSERT INTO rcaowm (rom_owner, rom_name)  VALUES  ('OOOOOOO', 'Owners 
> Name');
>
>> Create table rcaowt (rot_owner varchar(7), rot_date int4, rot_tramt 
>> decimal);
>
> CREATE TABLE rcaowt (
>     rot_owner TEXT NOT NULL
>         REFERENCES rcaowm
>     , rot_date DATE NOT NULL
>     , rot_tramt DECIMAL NOT NULL
> );
>
> I assume that the value of the rot_owner column must appear in the 
> rcaowm table in the rom_owner column. If this is so, set a foreign key 
> constraint so the database server can enforce it. That's what the 
> "REFERENCES rcaowm" clause does. If no column is listed in the 
> REFERENCES clause, it references the primary key of the referenced 
> table, in this case rom_owner. To be explicit, it could have been 
> written REFERENCES rcaowm (rom_owner).
>
> Also, use appropriate datatypes. rot_date is a date, so use a DATE 
> column. This adds an additional constraint to be sure only valid date 
> values can be submitted. Right now, there's nothing to prevent someone 
> from doing something like:
>
> INSERT INTO rcaowt (rot_owner, rot_date, rot_tramt) VALUES ('OOOOOOO', 
> '20074533', 5000);
>
> Using a date column will prevent that. Also, there are a lot of 
> functions that are available for handling dates which you might find 
> useful.
>
>> Create index rcaowt_key1 on rcaowt (rot_owner);
>
> A plain index doesn't really add anything to the schema definition, 
> though it can be used by the database server to look up information in 
> the table.
>
>> Create table rcatnm (rtm_owner varchar(7), rtm_tenant varchar(7), 
>> rtm_name varchar(30));
>> Create unique index rcatnm_key1 on rcatnm (rtm_owner, rtm_tenant);
>> Insert into rcatnm values ('OOOOOOO', 'TTTTTTT', 'Tenants Name');
>
> Same as above: use TEXT instead of VARCHAR(n), use NOT NULL, PRIMARY 
> KEY and REFERENCES, list columns in INSERT statements.
>
>> Create table rcatnt (rtt_owner varchar(7), rtt_tenant varchar(7), 
>> rtt_date int4, rtt_tramt decimal);
>
> This is a question I had in my first response: can rtt_owner be 
> anything other than the rot_owner for rtt_tenant? If so, your table 
> definition should probably be
>
> CREATE TABLE rcatnt
> (
>     rtt_owner TEXT NOT NULL
>         REFERENCES rcaowm
>     , rtt_tenant TEXT NOT NULL
>         REFERENCES rcaowt
>     , rtt_date DATE NOT NULL
>     , rtt_tramt DECIMAL NOT NULL
> );
>
> If the rtt_owner must be the same as rot_owner, just leave it out. You 
> can always look up the owner by joining through the rtt_tenant table.
>
> CREATE TABLE rcatnt
> (
>     rtt_tenant TEXT NOT NULL
>         REFERENCES rcaowt
>     , rtt_date DATE NOT NULL
>     , rtt_tramt DECIMAL NOT NULL
> );
>
> In either case, it's not obvious to me what the primary key on this 
> table should be. You need some way to uniquely identify each row in 
> the table.
>
>> Create index rcatnt_key1 on rcatnt (rtt_owner, rtt_tenant);
>
> Again, this index doesn't do anything for your schema definition.
>
>> Select rom_name as "Owner_Name", rtm_name as "Tenant_Name", 
>> sum(rot_tramt) as "Sum_Owner_Value" , sum(rtt_tramt) as 
>> "Sum_Tenant_Value" from rcaowm join rcaowt on (rom_owner = rot_owner) 
>> join rcatnt on (rom_owner = rtt_owner) join (select rtm_tenant, 
>> rtm_name from rcatnm) as tenants on (rtm_tenant = rtt_tenant) group 
>> by rom_name, rtm_name;
>
> Do you find this easy to read? Personally, I don't. It helps if you 
> can format the query to make it easy for others (and yourself!) to read.
>
>> This query returns:
>>
>> Owner_Name  | Tenant_Name  | Sum_Owner_Value | Sum_Tenant_Value
>> -------------+--------------+-----------------+------------------
>> Owners Name | Tenants Name |         -7202.0 |          12000.0
>>
>>
>> I would like to return:
>>
>> Owner_Name  | Tenant_Name  | Sum_Owner_Value | Sum_Tenant_Value
>> -------------+--------------+-----------------+------------------
>> Owners Name | Tenants Name |         -3601.0 |           4000.0
>
> This should get you most of the way there, as it gives the amounts you 
> want. However, it's not clear to me that this is *really* the result 
> you're looking for because of the ambiguity as to what owner you're 
> looking for (see my questions about the rcatnt table).
>
> SELECT rot_owner, sum_rot_tramt, rtt_tenant, sum_rtt_tramt
> FROM rcatnm
> JOIN (
>     SELECT rot_owner,  sum(rot_tramt) as sum_rot_tramt
>     FROM rcaowt
>     GROUP BY rot_owner) o ON (rtm_owner = rot_owner)
> JOIN (
>     SELECT rtt_tenant, sum(rtt_tramt) as sum_rtt_tramt
>     FROM rcatnt
>     GROUP BY rtt_tenant ) t ON (rtm_tenant = rtt_tenant);
>  rot_owner | sum_rot_tramt | rtt_tenant | sum_rtt_tramt
> -----------+---------------+------------+---------------
> OOOOOOO   |       -3601.0 | TTTTTTT    |        4000.0
> (1 row)
>
> Hope this gets you pointed in the right direction. And I still 
> recommend naming your tables and columns with less cryptic names. I 
> know I had to repeatedly look up the table and column names when 
> working on this, and that's not time well spent. Letters are cheap!
>
> Good luck!
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>
Thanks for all your suggestions and explanations, they are very helpfull!

Regards, Paul

In response to

pgsql-novice by date

Next:From: Ralf StrandellDate: 2007-08-02 12:01:28
Subject: Postgresql on 64bit Debian 4.0: libz.so.1 => not found
Previous:From: Michael GlaesemannDate: 2007-08-01 20:49:31
Subject: Re: Querying multiple tables

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group