Re: Full text search with ORDER BY performance issue

From: Krade <krade(at)krade(dot)com>
To: Devin Ben-Hur <dbenhur(at)whitepages(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Full text search with ORDER BY performance issue
Date: 2009-07-21 03:35:11
Message-ID: 4A65376F.40508@krade.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 7/21/2009 2:13, Devin Ben-Hur wrote:
> Have you tried make the full-text condition in a subselect with
> "offset 0" to stop the plan reordering?
>
> eg:
>
> select *
> from (
> select * from a where comment_tsv @@ plainto_tsquery('love')
> offset 0
> ) xx
> order by timestamp DESC
> limit 24
> offset 0;
>
>
> See http://blog.endpoint.com/2009/04/offset-0-ftw.html
Yes, that does force the planner to always pick the full text index
first rather than the timestamp index. I managed to force that by doing
something a lot more drastic, I just dropped my timestamp index
altogether, since I never used it for anything else. (I mentioned this
in my original post)

Though, that comment did make me try to readd it. I was pretty
surprised, the planner was only doing backward searches on the timestamp
index for very common words (therefore turning multi-minute queries into
very fast ones), as opposed to trying to use the timestamp index for all
queries. I wonder if this is related to tweaks to the planner in 8.4 or
if it was just my statistics that got balanced out.

I'm not entirely happy, because I still easily get minute long queries
on common words, but where the planner choses to not use the timestamp
index. The planner can't guess right all the time.

But I think I might just do:
select * from a where comment_tsv @@ plainto_tsquery('query') and
timestamp > cast(floor(extract(epoch from CURRENT_TIMESTAMP) - 864000)
as integer) order by timestamp desc limit 24 offset 0;

And if I get less than 24 rows, issue the regular query:

select * from a where comment_tsv @@ plainto_tsquery('query') order by
timestamp desc limit 24 offset 0;

I pay the price of doing two queries when I could have done just one,
and it does make almost all queries about 200 ms slower, but it does so
at the expense of turning the few very slow queries into quick ones.

Thanks for all the help.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2009-07-21 04:06:49 Re: Full text search with ORDER BY performance issue
Previous Message Devin Ben-Hur 2009-07-21 01:13:43 Re: Full text search with ORDER BY performance issue