Re: Strange Index behavior

From: Együd Csaba <csegyud(at)vnet(dot)hu>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Együd Csaba (Freemail)' <csegyud(at)freemail(dot)hu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Strange Index behavior
Date: 2004-12-22 19:12:46
Message-ID: 0I9500H0N1D746@mail.vnet.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Yes, I vacuum analyze it once in every hour.

Thank you for the suggestions, now I'm getting understand the point...

With default_statistics_target=500 and random_page_cost=0.1 and reanalyzing
the db the planner always uses the index. Certainly if there were more rows
in the table the planner somewhere would switch to seq scan.

Some exlpain results:
------------------------
#explain analyze select tstamp, pp, pm, status from measured_1 where tstamp
>= '2004.12.22 00:00' and tstamp <= '2004.12.22 23:59' order by tstamp;
"Index Scan using measured_1_pkey on measured_1 (cost=0.00..2258.91
rows=26769 width=38) (actual time=0.000..271.000 rows=26743 loops=1)"
" Index Cond: ((tstamp >= '2004-12-22 00:00:00'::timestamp without time
zone) AND (tstamp <= '2004-12-22 23:59:00'::timestamp without time zone))"
"Total runtime: 361.000 ms"

#explain analyze select tstamp, pp, pm, status from measured_1 where tstamp
>= '2004.12.22 13:00' and tstamp <= '2004.12.22 23:59' order by tstamp;
"Index Scan using measured_1_pkey on measured_1 (cost=0.00..1560.35
rows=18377 width=38) (actual time=0.000..140.000 rows=18424 loops=1)"
" Index Cond: ((tstamp >= '2004-12-22 13:00:00'::timestamp without time
zone) AND (tstamp <= '2004-12-22 23:59:00'::timestamp without time zone))"
"Total runtime: 200.000 ms"

Tom and Martijn,
many thanks.

-- Csaba

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Wednesday, December 22, 2004 7:01 PM
To: Együd Csaba (Freemail)
Cc: 'Együd Csaba'; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Strange Index behavior

=?iso-8859-2?Q?Egy=FCd_Csaba_=28Freemail=29?= <csegyud(at)freemail(dot)hu> writes:
> The difference between the result times is 16x. I can't understand why
> the planner thinks it is the better way...

The planner thinks that because it thinks the second query will fetch 1500
times as many rows as the first. Now that I look more carefully I see that
both of those row estimates are off, the first by a lot.
Have you ANALYZEd this table lately? If you have, it might be a good idea
to increase the statistics target for the tstamp column.

You might also need to play around with random_page_cost, but it will be
counterproductive to touch that until the rowcount estimates are in the
vicinity of reality.

regards, tom lane

--
No virus found in this incoming message.
Checked by AVG Anti-Virus.
Version: 7.0.296 / Virus Database: 265.6.3 - Release Date: 2004.12.21.

--
No virus found in this incoming message.
Checked by AVG Anti-Virus.
Version: 7.0.296 / Virus Database: 265.6.3 - Release Date: 2004.12.21.

--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.296 / Virus Database: 265.6.3 - Release Date: 2004.12.21.

--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.296 / Virus Database: 265.6.3 - Release Date: 2004.12.21.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Rick Morris 2004-12-22 19:20:16 Re: nice work on the new site
Previous Message Joshua D. Drake 2004-12-22 19:02:30 Re: nice work on the new site