Re: How index are running and how to optimise ?

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Hervé Piedvache <herve(at)elma(dot)fr>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: How index are running and how to optimise ?
Date: 2004-03-03 18:52:26
Message-ID: Pine.LNX.4.33.0403031148480.8436-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 3 Mar 2004, [iso-8859-15] Hervé Piedvache wrote:

> Hi,
>
> I have may be a stupid question, but I'm a little surprised with some explains
> I have, using date fields ...
>
> I would like to understand exactly when index are used ...
> I'm using PostgresQL 7.4.1
>
> I have a table with 351 000 records.
> I have about 300 to 600 new records by day
> I have an index like this :
> ix_contracts_start_stop_date btree (start_date, stop_date)
>
> I want to simply do something like this :
>
> select o.id_contract
> from contracts o
> where o.start_date <= '2001-10-31'
> and (o.stop_date > '2001-11-06' or stop_date is null);
>
> OK I get an explain like this :
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
> Seq Scan on contracts o (cost=0.00..12021.80 rows=160823 width=4)
> Filter: ((start_date <= '2001-10-31'::date) AND ((stop_date >
> '2001-11-06'::date) OR (stop_date IS NULL)))

Notice the planner is expecting to get back 160823 rows here. How many
does it actually return?

> I understand that the OR could make the no use of the stop_date index ..., but
> why I'm not using the index for the start_date part ?
>
> Index are used only if I use an egality like this :
>
> select o.id_contract
> from contracts o
> where o.start_date = '2001-10-31'
> and o.stop_date = '2001-11-06';

No, you don't have to do that. You should be able to use a range and get
an index scan IF said index scan will be faster (in the query planner's
estimate.)

explain select * from test where dt>'2004-01-01 00:00:00' and
dt<'2004-01-02 00:00:00';
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_dt on test (cost=0.00..118628.84 rows=29793
width=51)
Index Cond: ((dt > '2004-01-01 00:00:00'::timestamp without time zone)
AND (dt < '2004-01-02 00:00:00'::timestamp without time zone))
(2 rows)

Notice the use of an index there.

> QUERY PLAN
> ------------------------------------------------------------------------------------------------
> Index Scan using ix_contracts_start_stop_date on contracts o
> (cost=0.00..6.00 rows=1 width=4)
> Index Cond: ((start_date = '2001-10-31'::date) AND (stop_date =
> '2001-11-06'::date))

Here the planner expects ONE row. Of course it's using an index.

> Could you please explain me why index are not used with <, > and how I can
> optimise my request ... I have no idea but I'm using this request to do
> insert in another table and this segmentation take 13 hours for making the
> insert ! :o((

It may well be the inserts that are slow and not the selects. how long
does the select, by itself, take to run?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Eric Davies 2004-03-03 19:34:13 gist index build produces corrupt result on first access to table.
Previous Message Gunjeet Singh 2004-03-03 18:51:45 Question on stored functions