Re: PATCH: index-only scans with partial indexes

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: index-only scans with partial indexes
Date: 2015-09-04 21:03:31
Message-ID: 55EA0723.4040300@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 09/04/2015 06:10 PM, Jeff Janes wrote:
>
> How can we evaluate Tom's performance concerns? I tried
> turning log_planner_stats on and using the regression test as a load
> generator, but I don't think that that is very demanding of a test.

I've done a bit of benchmarking today, trying to measure how expensive
the additional checks are.

Using a simple table with just 4 columns and 1M rows

CREATE TABLE t AS SELECT i AS a, i AS b, i AS c, i AS d
FROM generate_series(1,1000000) s(i);

with three different index sets:

- no indexes
- 40 regular indexes (indexes-1.sql)
- 40 partial indexes (indexes-2.sql)

and two different query sets:

- matching the partial indexes (queries-1.sql)
- not matching the partial indexes (queries-2.sql)

which means 6 combinations:

A: no indexes / queries-1
B: no indexes / queries-2
C: indexes-1 / queries-1
D: indexes-1 / queries-2
E: indexes-2 / queries-1
F: indexes-2 / queries-2

A summary of 100 EXPLAIN timings looks like this:

master A B C D E F
-------------------------------------------------------------------------
min 0.10 0.10 0.30 0.29 0.66 0.23
max 1.07 1.00 2.13 1.98 4.52 1.59
median 0.49 0.52 0.31 0.33 0.68 1.12
average 0.43 0.35 0.62 0.49 1.01 0.89

patched A B C D E F
-------------------------------------------------------------------------
min 0.11 0.11 0.29 0.29 0.70 0.22
max 0.99 1.05 0.55 1.93 3.79 1.12
median 0.19 0.55 0.32 0.34 0.74 0.24
average 0.42 0.52 0.34 0.55 0.95 0.27

A-D should be exactly the same, because there are no partial indexes,
and the results match that expectation.

E and F should be different, depending on how expensive the additional
checks are. But in this benchmark that's not true - the patched version
is actually a bit faster, thanks to noise.

I find that a bit strange, but I repeated the benchmark about 3x just to
verify it really behaves like this. Maybe I did some stupid mistake and
the results are useless, or maybe it needs to be more complex (e.g. the
conditions must not be exactly the same). So if someone could rerun the
benchmark and review it, that'd be nice.

Judging the cost/benefit ratio is a bit tricky. We need to identify the
cases where additional planning complexity makes it measurably slower,
without getting better performance at execution. And then we need to
somehow argue whether those cases are frequent enough or not.

ISTM that the worst case would be a data set with many partial indexes,
that however don't allow IOS. And the amount of data would have to be
small, so that the queries don't take too much time (which would make
the additional planning time noise).

However that was the idea of the benchmark, and I got no difference.

regards
Tomas

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
queries-1.sql application/sql 5.2 KB
queries-2.sql application/sql 5.2 KB
indexes-2.sql application/sql 2.2 KB
indexes-1.sql application/sql 1.1 KB
table.sql application/sql 148 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2015-09-04 22:25:31 Re: Spurious standby query cancellations
Previous Message Stephen Frost 2015-09-04 20:51:33 Re: WIP: SCRAM authentication