Re: Partial index on date column

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Dave Page <dpage(at)vale-housing(dot)co(dot)uk>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partial index on date column
Date: 2003-03-06 18:03:17
Message-ID: Pine.LNX.4.33.0303061054240.5572-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 6 Mar 2003, Dave Page wrote:

> I have a table that is likely to grow over the next few years at a rate
> of 1K-2K rows/day. As the vast majority of the activity on the table
> (other than the inserts) will be selects of data for the current day, I
> have a cron job that drops and recreates a partial index just after
> midnight. It also vacuum analyzes the table.
>
> -- Index: public.pbx_log_today_idx
> CREATE INDEX pbx_log_today_idx ON pbx_log USING btree (pbx_time,
> pbx_call_type, pbx_digits_source, pbx_digits_destination) WHERE
> (pbx_date = '2003-03-06'::date);
>
> I'm surprised by the following behaviour:
>
> EXPLAIN SELECT * FROM pbx_log WHERE pbx_date = CURRENT_DATE;
>
> Seq Scan on pbx_log (cost=0.00..286.20 rows=1274 width=384)
> Filter: (pbx_date = ('now'::text)::date)
>
>
> EXPLAIN SELECT * FROM pbx_log WHERE pbx_date = '2003-03-06';
>
> Index Scan using pbx_log_today_idx on pbx_log (cost=0.00..5.00 rows=0
> width=384)
> Filter: (pbx_date = '2003-03-06'::date)
>
> Is this just an oddity because I don't have masses of data yet (4500
> rows right now), or is this something the optimizer cannot handle?

It's not an oddity, it's the planner trying to decide which is the better
choice, a seq scan or an index scan. If you had 150,000,000 rows and
asked for 149,999,000 of them, it would be counterproductive to use an
index, since you're gonna visit nearly every page of the table anyway.

If you were gonna get 1,000 rows out of 150,000,000 then an index scan
makes more sense,

Somewhere in between is the switch point where going from one to the other
makes sense.

Since you've got 4500 rows and asked for 1274 of them it's likely that the
database will have to read the whole table anyway, so it goes ahead and
does it.

Look for random_page_cost and a few other settings near it in the
$PGDATA/postgresql.conf file.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-03-06 18:12:35 Re: Partial index on date column
Previous Message Bruce Momjian 2003-03-06 18:01:17 Re: TODO: DROP COLUMN .. CASCADE