Re: procost for to_tsvector

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: procost for to_tsvector
Date: 2015-05-01 14:01:45
Message-ID: 20150501140145.GJ6342@momjian.us
Lists: pgsql-hackers

On Fri, May 1, 2015 at 09:39:43AM -0400, Robert Haas wrote:
> On Fri, May 1, 2015 at 9:13 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > On Fri, May 1, 2015 at 07:57:27AM -0400, Robert Haas wrote:
> >> On Thu, Apr 30, 2015 at 9:34 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> > On Wed, Mar 11, 2015 at 02:40:16PM +0000, Andrew Gierth wrote:
> >> >> An issue that comes up regularly on IRC is that text search queries,
> >> >> especially on relatively modest size tables or for relatively
> >> >> non-selective words, often misplan as a seqscan based on the fact that
> >> >> to_tsvector has procost=1.
> >> >>
> >> >> Clearly this cost number is ludicrous.
> >> >>
> >> >> Getting the right cost estimate would obviously mean taking the cost of
> >> >> detoasting into account, but even without doing that, there's a strong
> >> >> argument that it should be increased to at least the order of 100.
> >> >> (With the default cpu_operator_cost that would make each to_tsvector
> >> >> call cost 0.25.)
> >> >>
> >> >> (The guy I was just helping on IRC was seeing a slowdown of 100x from a
> >> >> seqscan in a query that selected about 50 rows from about 500.)
> >> >
> >> > Where are we on increasing procost for to_tsvector?
> >>
> >> We're waiting for you to commit the patch.
> >
> > OK, I have to write the patch first, so patch attached, using the cost
> > of 10. I assume to_tsvector() is the only one needing changes. The
> > patch will require a catalog bump too.
>
> Andrew did the research to support a higher value, but even 10 should
> be an improvement over what we have now.

Yes, I saw that, but I didn't see him recommend an actual number. Can
someone recommend a number now? Tom initially recommended 10, but
Andrew's tests suggest something > 100. Tom didn't do any tests so I
tend to favor Andrew's suggestion, if he has one.
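
For reference, the arithmetic behind the candidate values can be sketched as below. This is only an illustration, not PostgreSQL source: it assumes the documented rule that procost is expressed in units of cpu_operator_cost (default 0.0025) and that the function is called once per scanned row. The helper names are hypothetical, and the 500-row figure is borrowed from Andrew's IRC example.

```python
# Sketch of the planner arithmetic discussed above; not PostgreSQL source code.
CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default cpu_operator_cost


def per_call_cost(procost, cpu_operator_cost=CPU_OPERATOR_COST):
    """Planner cost charged for one call of a function with this procost."""
    return procost * cpu_operator_cost


def function_cost_for_scan(rows, procost, cpu_operator_cost=CPU_OPERATOR_COST):
    """Function-evaluation cost when the function is called once per row.

    Ignores I/O, per-tuple costs, and detoasting.
    """
    return rows * per_call_cost(procost, cpu_operator_cost)


if __name__ == "__main__":
    # Compare the current value (1) with the two proposals (10 and 100)
    # for a seqscan over a 500-row table.
    for procost in (1, 10, 100):
        print(procost, per_call_cost(procost), function_cost_for_scan(500, procost))
```

With procost=100 the per-call figure matches the 0.25 Andrew quoted. The sketch covers only the function-evaluation component and says nothing about where the seqscan/index crossover actually falls.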

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
