Join performance

From: "David Fury" <davida(at)teleord(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Join performance
Date: 1999-09-02 13:04:56
Message-ID: 199909021158.LAA00462@wombat.teleordering.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I need to find a way of forcing a more efficient execution plan for
the following scenario:

CREATE TABLE live_biblio (
isbn text NOT NULL UNIQUE PRIMARY KEY,
author text,
binding text,
classif text,
editor text,
rrp money,
imprint text,
pub_date date,
title text,
bic1 text);

loaded using COPY, then vacuum analyzed, c. 550,000 rows

CREATE TABLE charts
(
isbn TEXT NOT NULL PRIMARY KEY,
quantity INT4 NOT NULL,
value MONEY NOT NULL,
prev_chart_pos INT4,
weeks_in_chart INT4,
current_pos INT4 NOT NULL
)

loaded using COPY then vacuum analyzed, 5000 rows

PostgreSQL 6.5.1 under RedHat on a twin 450MHz Intel box with 512Mb
memory (nothing else running apart from apache)

The idea of this setup is that we keep a central Bibliographic
database keyed on ISBN and use it as a central reference for various
types of book-sales related data. In this case we have a bestsellers
list (the 'charts' table). To retrieve the top 100 chart with
bibliographic attributes attached, I would do the following.

select cht.current_pos, bib.title, bib.... etc.
from live_biblio bib, charts cht
where bib.isbn = cht.isbn
and current_pos < 101
order by current_pos

However the resulting execution plan (involving a scan on the
live_biblio table) makes use of this technique unworkable (response
timeof c. 15-20 seconds with only one user)

NOTICE: QUERY PLAN:

Sort (cost=1165023.88 rows=184422 width=40)
-> Nested Loop (cost=1165023.88 rows=184422 width=40)
-> Seq Scan on live_biblio bib (cost=31056.12 rows=553155
width=24) -> Index Scan using charts_pkey on charts cht
(cost=2.05 rows=1667 width=16)

What I would really like to see is the restriction on the charts table
being actioned first, with the resulting list of key values driving an
indexed access to the live_biblio table.

Is there any way that I could rearrange things so that Postgres
carries out the join using the primary key indexes of the two tables?
I don't want to have to de-normalise the data to replicate
bibliographic attributes around the various tables of the database.

Any ideas gratefully received.

David.

David Fury
Software Developer

Whitaker BookTrack

d(dot)fury(at)teleord(dot)co(dot)uk Tel: 01252 742578

Browse pgsql-general by date

  From Date Subject
Next Message Christian Denning 1999-09-02 13:08:28 Re: Linux/Postgres 6.5 problems using jdbc w/jdk1.2
Previous Message The Hermit Hacker 1999-09-02 12:45:31 Re: [GENERAL] GEQO and KSQO problem.