Skip site navigation (1) Skip section navigation (2)

Re: Q on views and performance

From: Matthew <matthew(at)flymine(dot)org>
To: Kynn Jones <kynnjo(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Q on views and performance
Date: 2008-02-25 13:45:34
Message-ID: Pine.LNX.4.64.0802251326010.20402@aragorn.flymine.org (view raw or flat)
Thread:
Lists: pgsql-performance
On Fri, 22 Feb 2008, Kynn Jones wrote:
> Hi.  I'm trying to optimize...
>
> (Q1)   SELECT a1.word, a2.word
>         FROM T a1 JOIN T a2 USING ( zipk )
>        WHERE a1.type = <int1>
>          AND a2.type = <int2>;

Okay, try this:

Create an index on T(type, zipk), and then CLUSTER on that index. That 
will effectively group all the data for one type together and sort it by 
zipk, making a merge join very quick indeed. I'm not sure whether Postgres 
will notice that, but it's worth a try.

> More specifically, how can I go about building table T and the views
> V<int?>'s to maximize the performance of (Q1)?  For example, I'm thinking
> that if T had an additional id column and were built in such a way that all
> the records belonging to each V<int?> were physically contiguous, and (say)
> had contiguous values in the id column, then I could define each view like
> this

The above index and CLUSTER will effectively do this - you don't need to 
introduce another field.

Alternatively, you could go *really evil* and pre-join the table. 
Something like this:

CREATE TABLE evilJoin AS SELECT a1.type AS type1, a2.type AS type2,
     a1.zipk, a1.word AS word1, a2.word AS word2
   FROM T AS a1, T AS a2
   WHERE a1.zipk = a2.zipk
   ORDER BY a1.type, a2.type, a1.zipk;
CREATE INDEX evilIndex1 ON evilJoin(type1, type2, zipk);

Then your query becomes:

SELECT word1, word2
    FROM evilJoin
    WHERE type1 = <int1>
      AND type2 = <int2>

which should run quick. However, your cache usefulness will be reduced 
because of the extra volume of data.

Matthew

-- 
[About NP-completeness] These are the problems that make efficient use of
the Fairy Godmother.                    -- Computer Science Lecturer

In response to

Responses

pgsql-performance by date

Next:From: MatthewDate: 2008-02-25 14:08:06
Subject: Re: Q on views and performance
Previous:From: MatthewDate: 2008-02-25 13:11:04
Subject: Re: CORRECTION to msg 'loading same instance of dump to two different servers simultaneously'

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group