Re: planner row-estimates for tsvector seems horribly wrong

From: Sushant Sinha <sushant354(at)gmail(dot)com>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Sushant Sinha <sushant354(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: planner row-estimates for tsvector seems horribly wrong
Date: 2010-10-24 15:26:28
Message-ID: 1287933988.1694.3.camel@yoffice
Lists: pgsql-hackers

Thanks a ton, Jan! That works correctly. But many tsearch tutorials
suggest placing the tsquery in the FROM clause, and that can cause a bad
plan. Isn't it possible to return a correct row estimate for a join with
the query as well?

-Sushant.

On Sun, 2010-10-24 at 15:07 +0200, Jan Urbański wrote:
> On 24/10/10 14:44, Sushant Sinha wrote:
> > I am using a GIN index on a tsvector and doing a basic search. The
> > planner's row estimate seems horribly wrong: it is 4843 for every
> > query, whether the query matches zero rows, a medium number of rows
> > (88,000) or a large number of rows (726,000).
> >
> > The table has roughly a million docs.
>
> > explain analyze select count(*) from docmeta,
> > plainto_tsquery('english', 'dyfdfdf') as qdoc where docvector @@ qdoc;
>
> OK, forget my previous message. The problem is that you are doing a join
> using @@ as the operator for the join condition, so the planner uses the
> operator's join selectivity estimate. For @@ the tsmatchjoinsel function
> simply returns 0.005.
>
> Try doing:
>
> explain analyze select count(*) from docmeta where docvector @@
> plainto_tsquery('english', 'dyfdfdf');
>
> It should help.
>
> Cheers,
> Jan
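
The constant 4843 estimate is consistent with the fixed join selectivity
Jan describes. Below is a minimal sketch of the arithmetic, not the
planner's actual code. It assumes the planner treats
plainto_tsquery(...) in the FROM clause as a one-row relation and that
its row count for docmeta is about 968,600. That figure is back-derived
from 4843 / 0.005 and only roughly matches the "million docs" mentioned
above. The function name join_row_estimate is hypothetical.

```python
# Selectivity that tsmatchjoinsel returns for @@ in a join condition,
# as stated in the quoted message.
TSMATCH_JOIN_SELECTIVITY = 0.005

# Assumptions (not from the thread): the function scan in FROM is
# estimated at one row, and the planner's row count for docmeta is
# back-derived from the reported estimate of 4843.
ASSUMED_QUERY_ROWS = 1
ASSUMED_DOCMETA_ROWS = 968_600


def join_row_estimate(outer_rows, inner_rows,
                      selectivity=TSMATCH_JOIN_SELECTIVITY):
    """Simplified join size: cross product scaled by a fixed selectivity."""
    return round(outer_rows * inner_rows * selectivity)


# The search term never enters the formula, so the estimate is the same
# for a term matching zero rows and a term matching 726,000 rows.
estimate = join_row_estimate(ASSUMED_DOCMETA_ROWS, ASSUMED_QUERY_ROWS)
print(estimate)
```

Because the selectivity is a constant, nothing about the query text can
change the result. With the tsquery in the WHERE clause instead, the
planner can use the restriction selectivity for @@, which is why the
rewritten query gets a better estimate.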
