Re: [PATCH] Prefetch index pages for B-Tree index scans

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Prefetch index pages for B-Tree index scans
Date: 2012-10-18 22:42:55
Message-ID: CAGTBQpYvzLunDsCVsQadWi6eYfUf8QKN8062tMjBC7T1hFyM6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 18, 2012 at 5:30 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> Backward:
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual
> time=28.190..157708.405 rows=90000001 loops=1)
> -> Index Only Scan Backward using pgbench_accounts_pkey on
> pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4)
> (actual time=28.178..135282.317 rows=90000001 loops=1)
> Index Cond: ((aid >= 10000000) AND (aid <= 200000000))
> Heap Fetches: 0
> Total runtime: 160735.539 ms
> I/O thoughput averages 12MB/s (a small increase), and the 3-second
> difference seems related to it (it's consistent).
> I/O utilization averages 88% (important increase)
>
> This last result makes me think deeper prefetching could be
> potentially beneficial (it would result in read merges), but it's
> rather hard to implement without a skiplist of leaf pages. Maybe the
> backward-sequential pattern could be detected. I'll have to tinker
> with that.

Fun. That didn't take long.

With the attached anti-sequential scan patch, and effective_io_concurrency=8:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual
time=26.964..84299.789 rows=90000001 loops=1)
-> Index Only Scan Backward using pgbench_accounts_pkey on
pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4)
(actual time=26.955..62761.774 rows=90000001 loops=1)
Index Cond: ((aid >= 10000000) AND (aid <= 200000000))
Heap Fetches: 0
Total runtime: 87170.355 ms
I/O thoughput 22MB/s (twice as fast)
I/O utilization 95% (I was expecting 100% but... hey... good enough)

With e_i_c=24, it gets to 100% utilization and 30MB/s (that's 3 times
faster). So, I'd like to know what you think, but maybe for
back-sequential scans, prefetch should be set to a multiple (ie: x24)
of e_i_c, in order to exploit read request merges.

Attachment Content-Type Size
postgresql-git-bt_prefetch_backseq.diff application/octet-stream 8.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2012-10-19 00:21:00 Re: Deprecations in authentication
Previous Message Simon Riggs 2012-10-18 22:19:02 Re: Skip checkpoint on promoting from streaming replication