Re: [Devel 9.2] Index-only scan in count aggregation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Emanuel Calvo <postgres(dot)arg(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Devel 9.2] Index-only scan in count aggregation
Date: 2012-04-18 16:17:20
Message-ID: CA+TgmobdBewJobPnWzXzaLX2wvQRhoMs+sfBbjbD4UqyJ9ObJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 18, 2012 at 12:13 PM, Emanuel Calvo <postgres(dot)arg(at)gmail(dot)com> wrote:
> Hi guys,
>
> I'm one of the nightly sources of 9.2devel. I was trying some simple
> queries and I realized something:
>
> stuff=# explain (analyze true, costs true, buffers true, timing true,
> verbose true) select count(i) from lot_of_values;
>                                                                QUERY
> PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=213496.00..213496.01 rows=1 width=4) (actual
> time=60400.788..60400.791 rows=1 loops=1)
>   Output: count(i)
>   Buffers: shared hit=2400 read=86096
>   ->  Seq Scan on public.lot_of_values  (cost=0.00..188496.00
> rows=10000000 width=4) (actual time=0.371..32227.791 rows=10000000
> loops=1)
>         Output: i, t1, r1, r2, r3, d1
>         Buffers: shared hit=2400 read=86096
>  Total runtime: 60402.460 ms
> (7 rows)
>
> stuff=# set enable_seqscan=off;
> SET
> stuff=# explain (analyze true, costs true, buffers true, timing true,
> verbose true) select count(i) from lot_of_values;
>
>          QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=351292.03..351292.04 rows=1 width=4) (actual
> time=63278.472..63278.475 rows=1 loops=1)
>   Output: count(i)
>   Buffers: shared hit=1 read=110379
>   ->  Index Only Scan using lot_of_values_pkey on
> public.lot_of_values  (cost=0.00..326292.03 rows=10000000 width=4)
> (actual time=42.028..35217.460 rows=10000000 loops=1)
>         Output: i
>         Heap Fetches: 10000000
>         Buffers: shared hit=1 read=110379
>  Total runtime: 63278.720 ms
> (8 rows)
>
>
> I know, still development. Just wanna know if there will be an
> improvement for this in the next patches or the idea is to maintain
> this behaviour.
>
> Cheers and thanks for the amazing work you all had done!

I'm not sure what you're unhappy about. It seems that the query
planner picked the fastest plan (a sequential scan) and then when you
disabled that it picked the second-fastest plan (an index-only scan).

The index-only scan would have a chance of beating the sequential scan
if the table had been recently vacuumed, but not in the case where
every row is going to require a heap fetch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Emanuel Calvo 2012-04-18 16:40:17 Re: [Devel 9.2] Index-only scan in count aggregation
Previous Message Emanuel Calvo 2012-04-18 16:13:41 [Devel 9.2] Index-only scan in count aggregation