Quick Links

Re: [SQL] Beginner Join question

From:	Herouth Maoz <herouth(at)oumail(dot)openu(dot)ac(dot)il>
To:	calvin(at)improtech(dot)co(dot)za (Calvin Browne)
Cc:	pgsql-sql(at)postgreSQL(dot)org
Subject:	Re: [SQL] Beginner Join question
Date:	1998-06-16 15:47:25
Message-ID:	l03110706b1ac40b852be@[147.233.159.109]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-sql

At 18:17 +0300 on 16/6/98, Calvin Browne wrote:

> TABLE A ( TABLE B(
> record_id record_id
> field1 field3
> field2) field4)
>
> record_id in both tables are the same.
> The record definitely exists in TABLE A, but not
> necessarily in TABLE B.
>
> I need to get the following:
>
> record_id | A.field1 | A.field2 | B.field3
> -------------------------------------------------------------------
> rec1 | xxxx | xxxxxx | Field displayed if there is
> | | | a record in B, such that A.record_id
> | | | is equal to B.record.id, otherwise
> | | | leave it blank.
> -------------------------------------------------------------------
>
> I'm going to have about a million records in each table, so
> efficiency of the join is a priority. Disk space is not.

Seems you need an outer join. Not supported yet in PostgreSQL. You have one
of two options:

Make a union between a "normal" join and a NOT EXISTS query.

Define an SQL function that returns the value of field3 based on given
record_id. I like this one better, because I think the union in the first
solution, in addition to the NOT EXISTS query, may make the overhead of the
function negligible.

However, if you need to have more than just one field in the query, you'll
have to define two functions, and since each of them starts its own query
whenever invoked, efficiency will deteriorate.

So:

-- Given the following two tables:

testing=> SELECT * FROM example1;
record_id|field1|field2
---------+------+------
1| 10| 100
2| 20| 200
3| 30| 300
4| 40| 400
5| 50| 500
(5 rows)

testing=> SELECT * FROM example2;
record_id|field3|field4
---------+------+------
1| 1000| 10000
3| 3000| 30000
5| 5000| 50000
(3 rows)

-- We create the following function

testing=> CREATE FUNCTION ex2_fld3( int4 ) RETURNS int4
testing-> AS 'SELECT field3 FROM example2 WHERE record_id = $1'
testing-> LANGUAGE 'sql';
CREATE

-- And here is your query:

testing=> SELECT record_id, field1, field2, ex2_fld3( record_id )
testing-> FROM example1;
record_id|field1|field2|ex2_fld3
---------+------+------+--------
1| 10| 100| 1000
2| 20| 200|
3| 30| 300| 3000
4| 40| 400|
5| 50| 500| 5000
(5 rows)

Hope this is efficient enough for you. Don't forget to create an index on
record_id in your TABLE B.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma

In response to

Beginner Join question at 1998-06-16 15:17:54 from Calvin Browne

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Sebastien Flacher	1998-06-16 17:05:04	troubles with refint.c
Previous Message	Jose' Soares Da Silva	1998-06-16 15:24:47	Re: [SQL] handle the MONEY type