Re: SeqScan with full text search

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: SeqScan with full text search
Date: 2012-04-16 16:19:16
Message-ID: 4F8C4684.5040609@fuzzy.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 16.4.2012 16:02, Tomek Walkuski wrote:
> Hello group!
>
> I have query like this:
>
> SELECT
> employments.candidate_id AS candidate_id,
> SUM(TS_RANK(employers.search_vector, TO_TSQUERY('simple', 'One:* |
> Two:* | Three:* | Four:*'), 2)) AS ts_rank
> FROM
> employments
> INNER JOIN
> employers ON employments.employer_id = employers.id
> AND
> employers.search_vector @@ TO_TSQUERY('simple', 'One:* | Two:* |
> Three:* | Four:*')
> GROUP BY
> candidate_id;
>
> And it results with this:
>
> http://explain.depesz.com/s/jLM
>
> The JOIN between employments and employers is the culprit. I'm unable
> to get rid of the seq scan, and setting enable_seqscan to off makes
> things even worse.
>
> Is there any way to get rid of this JOIN?

Well, it's clearly the seqscan that takes most time, and it seems that
you really need to scan the whole table because you're asking 'for each
employment of all the candidates ...'

So you really need to scan all 1.6 million rows to get the result. And
seqscan is the best way to do that.

I don't see a way to remove the join and/or seqscan, unless you want to
keep a 'materialized view' maintained by a trigger or something ...

Another option is to make the employment table as small as possible
(e.g. removing columns that are not needed etc.) so that the seqscan is
faster.

>
> What info should I post to debug this easier?

1) structures of the tables
2) what amount of data are we talking about
3) was this the first run (with cold caches) or have you run that
several times?
4) basic system info (RAM, CPU, shared_buffers etc.)

Tomas

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Glyn Astill 2012-04-16 16:31:47 Re: H800 + md1200 Performance problem
Previous Message Scott Marlowe 2012-04-16 16:18:14 Re: H800 + md1200 Performance problem