Re: index on timestamp performance

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Eric Cholet <cholet(at)logilune(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: index on timestamp performance
Date: 2003-01-29 18:45:34
Message-ID: 20030129104117.V12532-100000@megazone23.bigpanda.com
Lists: pgsql-general


On Wed, 29 Jan 2003, Eric Cholet wrote:

> --On Wednesday, January 29, 2003 08:51:30 -0800 Stephan Szabo
> <sszabo(at)megazone23(dot)bigpanda(dot)com> wrote:
>
> > On Wed, 29 Jan 2003, Eric Cholet wrote:
> >
> >> I have this schema:
> >>
> >>
> >> motid | integer | not null
> >> objid | integer | not null
> >> date | timestamp without time zone | not null
> >> Indexes: dico_frs_motid_date btree (motid, date)
> >> dico_frs_objid btree (objid)
> >>
> >> The performance I'm getting from the index that contains
> >> 'date' is much slower than when using the objid index
> >> (different queries of course). This is a 10 million row
> >> table. Am I right to assume that postgres needs to do
> >> more work because it has to convert the dates to some
> >> internal (integer?) format?
> >
> > What does explain (analyze if possible) show for the two queries?
>
> I can't really run two equivalent queries that will each use
> a different index. Here's a query that uses the index with 'date':

Well, I was wondering, for example, whether the objid queries you were
comparing against were also returning around 3000 rows from the index
scan. I think part of the difference may be that here the plan grabs all
the matching rows from the index and then sorts them, so the limit doesn't
actually save you significant time. For better speed on this particular
sort of query, you might be better off with an order by of:
order by motid desc, date desc
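
For example (a sketch against your dico_frs table, using the same motid
and date range as the query below; untested, but it should let the
planner walk the (motid, date) index backwards instead of sorting):

  -- same predicate as your query, only the order by changed to match the index
  select objid from dico_frs
  where motid = 1247
    and date between '2002-10-29 17:55:17' and '2003-01-29 17:55:17'
  order by motid desc, date desc
  limit 11;

With the order by matching the index column order, the limit can stop the
scan after 11 rows rather than fetching and sorting all ~3000 matches.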

> => explain analyze select objid from dico_frs where motid=1247
> and date <= '2003-01-29 17:55:17' and date >= '2002-10-29 17:55:17'
> order by date desc limit 11;
>
> Limit  (cost=4752.14..4752.17 rows=11 width=12) (actual time=63.20..63.37 rows=11 loops=1)
>   ->  Sort  (cost=4752.14..4755.11 rows=1187 width=12) (actual time=63.17..63.23 rows=12 loops=1)
>         Sort Key: date
>         ->  Index Scan using dico_frs_motid_date on dico_frs  (cost=0.00..4691.50 rows=1187 width=12) (actual time=0.08..41.88 rows=2924 loops=1)
>               Index Cond: ((motid = 1247) AND (date <= '2003-01-29 17:55:17'::timestamp without time zone) AND (date >= '2002-10-29 17:55:17'::timestamp without time zone))
> Total runtime: 63.93 msec
> (6 rows)
>
>
> > It could just be a difference in plans or estimates.
>
> Right, but still I'd like to know whether the timestamp datatype in the
> index results in more work than an integer datatype.

Almost certainly a little, but not enough that I'd expect to see
order-of-magnitude differences or anything like that.
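
If you want to put a rough number on it, one sketch (the t_int/t_ts names
and the test ranges here are just made up for illustration) is to copy the
two columns into throwaway single-column tables, index them, and compare
equivalent range scans:

  -- throwaway test tables; names and ranges are only illustrative
  create table t_int (i integer);
  create table t_ts  (d timestamp without time zone);
  insert into t_int select motid from dico_frs;
  insert into t_ts  select date  from dico_frs;
  create index t_int_i_idx on t_int (i);
  create index t_ts_d_idx  on t_ts (d);
  analyze t_int;
  analyze t_ts;

  explain analyze select count(*) from t_int
    where i between 1000 and 2000;
  explain analyze select count(*) from t_ts
    where d between '2002-10-29' and '2003-01-29';

Pick ranges that return roughly the same number of rows so the two scans
do comparable work; whatever gap is left is about the per-comparison cost
of timestamp versus integer, without the plan and row-count differences
you get on the real table.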
