Re: performance advice needed: join vs explicit subselect

From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: performance advice needed: join vs explicit subselect
Date: 2009-01-27 19:04:02
Message-ID: 68646613-DE83-4BF7-99A4-A73EBDD5E860@solfertje.student.utwente.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Jan 27, 2009, at 7:12 PM, Karsten Hilbert wrote:

> Hello all,
>
> maybe some general advice can be had on this:
>
> table test_results
> modified_by integer foreign key staff(pk),
> intended_reviewer integer foreign key staff(pk),
> actual_reviewer integer foreign key staff(pk)
>
> (this table will contain millions of rows)
>
> table staff
> pk integer
> name text
>
> (this table will contain at most 50 rows)
>
> Now I want to set up a view which aggregates test results
> with staff names for all three foreign keys. This would mean
> I would either have to
>
> - join test_results to staff three times, once for each
> of the foreign keys, this is going to be messy with
> tracking table aliases, duplicate column names etc
>
> - write three explicit sub-selects for the columns I want
> to denormalize into the view definition
>
> Is there general advice as to which of the alternatives is
> worse under most if not all circumstances ?

I did something similar once using expression logic for my aggregates:

SELECT
SUM(CASE WHEN modified_by = pk THEN 1 ELSE 0 END) AS modified_by_count,
SUM(CASE WHEN intended_reviewer = pk THEN 1 ELSE 0 END) AS
intended_reviewer_count,
SUM(CASE WHEN actual_reviewer = pk THEN 1 ELSE 0 END) AS
actual_reviewer_count
FROM test_results, staff
WHERE pk IN (modified_by, intended_reviewer, actual_reviewer)

Mind, this will very probably do a sequential scan over the product of
both tables, but at least now the staff table is in that product only
once.

In actuality I didn't use CASE statements but cast the boolean results
of the expressions directly to integer, something like
SUM((modified_by = pk)::int), but that cast may no longer work since
8.3.

I no longer have access to the project that I used this on, so I can't
verify unfortunately.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

!DSPAM:737,497f5aa8747035160810079!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Gabi Julien 2009-01-27 19:28:50 Fwd: Re: New 8.4 hot standby feature
Previous Message Sam Mason 2009-01-27 18:48:11 Re: performance advice needed: join vs explicit subselect