Re: Custom operator class costs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ben Beecher <benbeecher(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Custom operator class costs
Date: 2011-04-18 16:24:03
Message-ID: BANLkTim+L++hVHwvYRLPkuHsHptvBWKLow@mail.gmail.com
Lists: pgsql-performance

On Wed, Mar 16, 2011 at 10:10 AM, Ben Beecher <benbeecher(at)gmail(dot)com> wrote:
> Hey!
> I'm having some trouble optimizing a query that uses a custom operator class.
> #Postgres has given me a solution for natural sort -
> http://www.rhodiumtoad.org.uk/junk/naturalsort.sql
>
> I'm trying to run it over a huge table - when running it on demand,
> the data needs to be dumped to memory and sorted.
>
> Sort  (cost=31299.83..31668.83 rows=369 width=31)
>  Sort Key: name
>  ->  Seq Scan on solutions_textbookpage  (cost=0.00..25006.55
> rows=369 width=31)
>        Filter: (active AND (textbook_id = 263))
>
> That's obviously too slow. I've created an index using the custom
> operator class, so I don't have to do the sort every time I try to
> sort.
>
>  Index Scan Backward using natural_page_name_textbook on
> solutions_textbookpage  (cost=0.00..650.56 rows=371 width=31) (actual
> time=0.061..0.962 rows=369 loops=1)
>   Index Cond: (textbook_id = 263)
>   Filter: active
>
> Obviously a little faster!

Not totally obvious: the first plan is cost-only EXPLAIN output, so it
doesn't show how long the sort actually took.
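
For an apples-to-apples comparison, both plans would need to be run under
EXPLAIN ANALYZE, which reports actual times. A sketch, reusing the table
and filter from the plans quoted above:

```sql
-- Show actual (not just estimated) run time for the sort-based plan.
-- Table and column names are taken from the plans quoted above.
EXPLAIN ANALYZE
SELECT name
FROM solutions_textbookpage
WHERE active AND textbook_id = 263
ORDER BY name;
```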

> The problem I'm having is that because operator classes have a low
> cost estimation, pg misestimates and tries to do the sort on demand
> rather than using the index.
>
> I can get pg to use the index by either jacking up cpu_operator_cost
> or lowering random_page_cost. Is this the best way to do that, or is
> there a smarter way to ensure that pg uses this index when I need it?

It's pretty often necessary to lower random_page_cost, and sometimes
seq_page_cost, too. If, for example, the database is fully cached,
you might try 0.1/0.1 rather than the default 4/1. Raising the cpu_*
costs is equivalent, but I think it's easier to keep in your head if
you think about 1 as the nominal cost of reading a page sequentially
from disk, and then lower the value you actually assign to reflect the
fact that you'll normally be reading from the OS cache or perhaps even
hitting shared_buffers.

You might also need to tune effective_cache_size.
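
As a concrete sketch of the tuning described above (the values are
illustrative, not a recommendation; the right numbers depend on how much
of the data is actually cached):

```sql
-- Fully cached database: lower the page-cost parameters so the planner
-- stops penalizing index scans relative to CPU work.
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;

-- Tell the planner roughly how much data the OS cache plus
-- shared_buffers can hold (hypothetical size for this machine).
SET effective_cache_size = '8GB';
```

These can be SET per-session for experimentation and then made
persistent in postgresql.conf once the right values are found.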

Is your operator class function unusually expensive? Are you having
trouble with PG not using other indexes it should be picking up, or
just your custom one?
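
If the comparison function behind the operator class really is
expensive, one more targeted option than the global cpu_* knobs is to
raise that function's declared cost. A hedged sketch, where
"natural_cmp" is a made-up name standing in for whichever support
function the natural-sort opclass actually uses:

```sql
-- Hypothetical function name; substitute the opclass's real comparison
-- support function. Raising COST makes plans that call it repeatedly
-- (like an explicit sort) look more expensive, nudging the planner
-- toward the pre-sorted index. The default COST for a C function is 1.
ALTER FUNCTION natural_cmp(text, text) COST 100;
```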

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
