Re: Bad row estimates

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Alex Adriaanse <alex(at)innovacomputing(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Bad row estimates
Date: 2006-03-04 07:01:35
Message-ID: 87wtfasng0.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alex Adriaanse <alex(at)innovacomputing(dot)com> writes:

> SELECT count(*) FROM test_table_1
> INNER JOIN test_table_2 ON
> (test_table_2.s_id = 13300613 AND test_table_1.id = test_table_2.n_id)
> WHERE now() BETWEEN test_table_1.start_ts AND test_table_1.end_ts
> AND test_table_1.id = test_table_1.g_id;

I don't know if this is the entire answer but this query is touching on two of
Postgres's specific difficulties in analyzing statistics:

The "now() BETWEEN test_table_1.start_ts AND test_table_1.end_ts can't be
answered completely using a btree index. You could try using a GIST index here
but I'm not clear how much it would help you (or how much work it would be).

The "test_table_1.id = test_table_1.g_id" clause depends on intercolumn
"correlation" which Postgres doesn't make any attempt at analyzing. That's why
you've found that no matter how much you increase the statitics goal it can't
come up with a better estimate.

Actually the "now() between ..." clause also suffers from the inter-column
dependency issue which is why the estimates for it are off as well.

> However, if I add a boolean column named "equal_ids" to test_table_1 with
> the value (test_table_1.id = test_table_1.g_id), and use that in the query
> instead of the equality it does make a much better row estimate.

One thing you could try is making an expression index on that expression. You
don't need to actually have a redundant column bloating your table. In 8.1 I
believe Postgres will even calculate statistics for these expression indexes.

In fact you could go one step further and try a partial index like:

CREATE INDEX eq_start ON test_table (start_ts) WHERE id = g_id

The ideal combination might be to create a partial GIST index :)

(I don't think the end_ts in the index is buying you much, despite its
appearance in the Index Cond in the plan.)

--
greg

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Stark 2006-03-04 08:15:02 Re: Bad row estimates
Previous Message Alex Adriaanse 2006-03-04 01:10:52 Bad row estimates