Re: overzealous sorting?

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: overzealous sorting?
Date: 2011-09-26 09:39:01
Message-ID: 20110926113901.38fdf892@marco-dalibo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Le Mon, 26 Sep 2011 16:28:15 +1000,
anthony(dot)shipman(at)symstream(dot)com a écrit :

> In Mammoth Replicator (PG 8.3) I have a table described as
>
> Table "public.tevent_cdr"
> Column | Type |
> Modifiers
> ----------------+--------------------------+------------------------------------------------------------
> event_id | integer | not null default
> nextval(('event_id_seq'::text)::regclass) timestamp | timestamp
> with time zone | not null classification | character varying |
> not null area | character varying | not null
> kind | character varying |
> device_id | integer |
> device_name | character varying |
> fleet_id | integer |
> fleet_name | character varying |
> customer_id | integer |
> customer_name | character varying |
> event | text |
> Indexes:
> "tevent_cdr_event_id" UNIQUE, btree (event_id)
> "tevent_cdr_timestamp" btree ("timestamp")
> Check constraints:
> "tevent_cdr_classification_check" CHECK (classification::text
> = 'cdr'::text)
> Inherits: tevent
>
>
> This simple query puzzles me. Why does it need to sort the records?
> Don't they come from the index in order?
>
> "explain analyze select * from tevent_cdr where timestamp >=
> '2011-09-09 12:00:00.000000+0' and timestamp < '2011-09-09
> 13:00:00.000000+0' and classification = 'cdr' order by timestamp;"
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Sort (cost=9270.93..9277.12 rows=2477 width=588) (actual
> time=9.219..11.489 rows=2480 loops=1)
> Sort Key: "timestamp"
> Sort Method: quicksort Memory: 2564kB
> -> Bitmap Heap Scan on tevent_cdr (cost=57.93..9131.30
> rows=2477 width=588) (actual time=0.440..3.923 rows=2480 loops=1)
> Recheck Cond: (("timestamp" >= '2011-09-09
> 22:00:00+10'::timestamp with time zone) AND ("timestamp" <
> '2011-09-09 23:00:00+10'::timestamp with time zone))
> Filter: ((classification)::text = 'cdr'::text)
> -> Bitmap Index Scan on tevent_cdr_timestamp
> (cost=0.00..57.31 rows=2477 width=0) (actual time=0.404..0.404
> rows=2480 loops=1)
> Index Cond: (("timestamp" >= '2011-09-09
> 22:00:00+10'::timestamp with time zone) AND ("timestamp" <
> '2011-09-09 23:00:00+10'::timestamp with time zone))
> Total runtime: 13.847 ms
> (9 rows)

Because Index Scans are sorted, not Bitmap Index Scans, which builds a
list of pages to visit, to be then visited by the Bitmap Heap Scan step.

Marc.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Antonio Rodriges 2011-09-26 12:51:12 Re: [PERFORMANCE] Insights: fseek OR read_cluster?
Previous Message Craig Ringer 2011-09-26 07:59:39 Re: [PERFORMANCE] Insights: fseek OR read_cluster?