Re: FTI Queries and Explain

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gordan Bobic <gordan(at)bobich(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: FTI Queries and Explain
Date: 2001-10-16 14:52:10
Message-ID: 9759.1003243930@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Gordan Bobic <gordan(at)bobich(dot)net> writes:
> [ why is this slow? ]
> explain select jobs.title from jobs, jobs_description_fti,
> jobs_title_fti where (jobs_description_fti.string = 'linux' or
> jobs_title_fti.string = 'linux') and (jobs_description_fti.id = jobs.oid and
> jobs_title_fti.id = jobs.oid);

Because the query is wrong. The way you wrote the WHERE, given a
match in jobs_description_fti and jobs, *any* jobs_title_fti row
with a matching ID will result in an output row. Similarly, given
a match in jobs_title_fti and jobs, *any* jobs_description_fti row
with a matching ID will produce output. So the system generates
what's essentially a doubly nested loop over the insufficiently
constrained tables.

A correct and practical form of the query would be something like

select jobs.title from jobs, jobs_description_fti where
jobs_description_fti.string = 'linux' and jobs_description_fti.id = jobs.oid
union
select jobs.title from jobs, jobs_title_fti where
jobs_title_fti.string = 'linux' and jobs_title_fti.id = jobs.oid;

One of the not-so-pleasant aspects of SQL is that erroneous queries
frequently look like performance problems, because no one waits around
for the enormous result set that the query actually generates ... they
try to debug the performance problem instead of looking to see if the
query requests what they want ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message gravity 2001-10-16 14:54:17 Re: WWW interface for postgres
Previous Message Tom Lane 2001-10-16 14:11:50 Re: writing & flushing C extensions