Re: I feel the need for speed. What am I doing wrong?

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
Cc: <pgsql-hackers(at)postgresql(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: I feel the need for speed. What am I doing wrong?
Date: 2003-01-07 03:08:56
Message-ID: D90A5A6C612A39408103E6ECDD77B829408A25@voyager.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Monday, January 06, 2003 5:26 PM
> To: Nigel J. Andrews
> Cc: Dann Corbit; pgsql-hackers(at)postgresql(dot)org;
> pgsql-general(at)postgresql(dot)org
> Subject: Re: [HACKERS] I feel the need for speed. What am I
> doing wrong?
>
>
> "Nigel J. Andrews" <nandrews(at)investsystems(dot)co(dot)uk> writes:
> >> select a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" from
> >> "CNX_DS_53_SIS_STU_OPT_FEE_TB" a left outer join
> >> "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b on ( a."RT_REC_KEY" =
> >> b."RT_REC_KEY" and a."cnxarraycolumn" = b."cnxarraycolumn") where
> >> b.oid is null ;
>
> > I suspect you get no results because it's unlikely b.oid
> will be null.
>
> Try "it's impossible for b.oid to be null --- unless a dummy
> b row is being provided by the LEFT JOIN". I interpret the
> purpose of the query to be to look for "a" rows that have no
> matching "b" row.
>
> Using OID for this is kind of cute, I guess, since it is
> guaranteed not-null in a real row; he doesn't have to think
> about whether any of his regular columns are not-null.

I am very happy to report that PostgreSQL now easily beats MS Access for
speed!
:-)

Here are the MS Access results, where I create on index and then a
second:
Access Outer Join 2 column index on 1st table
02:29.9
2 column index on 2nd table + 2 column index on 1st
table 02:18.3

For PostgreSQL, I originally created a 3 column index (because I have
also 3 column joins in other places) and got this result:
Postgres Outer Join
enable_seqscan = 1
3 column Index 1 12:43.9

Showing the plan showed that the indexes were being ignored.

Yikes! Five times slower! But then I took Tom's incredibly helpful
suggestion to disable the sequential scan:

Postgres Outer Join
enable_seqscan = 0
3 column Index 0 05:17.5

Changed to a 2 column index:
2 column index 0 04:58.3

Added an index to the second table:
2 column index on 2nd table + 2 column index on 1st
table 0 01:53.6

PostgreSQL is now 22% faster than Access (HAPPY DAYS)!

For my application, I happen to know that the data will be approximately
physically clustered, and that the indexes will always be very near
matches for data sets big enough to matter. Benchmarking also showed
that adding the 3rd column to the index was counter productive, even
when the join criteria was a.f1 = b.f1, a.f2=b.f2, a.f3 = b.f3.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2003-01-07 03:15:33 Re: [HACKERS] I feel the need for speed. What am I doing wrong?
Previous Message Christopher Kings-Lynne 2003-01-07 01:32:25 Re: [GENERAL] www.postgresql.org

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-01-07 03:15:33 Re: [HACKERS] I feel the need for speed. What am I doing wrong?
Previous Message Tom Lane 2003-01-07 03:00:23 Re: OS/400 support?