Re: [HACKERS] Secondary index access optimizations

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Secondary index access optimizations
Date: 2018-10-07 21:16:16
Message-ID: CAKJS1f_cDESjcDR-_E9NLrRpo7TQGiJ-aguw5XEwSBD1S+AyaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5 October 2018 at 04:45, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
> Will the following test be enough:
>
> -- check that columns for parent table are correctly mapped to child
> partition of their order doesn't match
> create table paren (a int, b text) partition by range(a);
> create table child_1 partition of paren for values from (0) to (10);
> create table child_2 (b text, a int);
> alter table paren attach partition child_2 for values from (10) to (20);
> insert into paren values (generate_series(0,19), generate_series(100,119));
>
> explain (costs off) select * from paren where a between 0 and 9;
> explain (costs off) select * from paren where a between 10 and 20;
> explain (costs off) select * from paren where a >= 5;
> explain (costs off) select * from paren where a <= 15;
>
> select count(*) from paren where a >= 5;
> select count(*) from paren where a < 15;
>
> drop table paren cascade;

I started looking at this to see if this test would cause a crash with
the original code, but it does not. The closest I can get is:

drop table parent;
create table parent (a bytea, b int) partition by range(a);
create table child_1 (b int, a bytea);
alter table parent attach partition child_1 for values from ('a') to ('z');
explain (costs off) select * from parent where b = 1;

But despite the varattnos of the bytea and int accidentally matching,
there's no crash due to the way operator_predicate_proof() requires
more than just the varno and varattno to match. It requires the Vars
to be equal(), which includes vartype, and those are not the same. So
the proof just fails.

In short, probably this test is doing nothing in addition to what
various other tests are doing. So given the test is unable to crash
the unfixed code, then I think it's likely not a worthwhile test to
add.

I wrote:
> create table listp (a int, b int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> create index listp_a_b_idx on listp (a,b);
>
> and a query:
>
> select * from listp where a = 1 order by b;
>
> if we remove the "a = 1" qual, then listp_a_b_idx can't be used.

I had a look at what allows this query still to use the index and it's
down to pathkey_is_redundant() returning true because there's still an
equivalence class containing {a,1}. I don't quite see any reason why
it would not be okay to rely on that working, but that only works for
pathkeys. If you have a case like:

set max_parallel_workers_per_gather=0;
create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
insert into listp select 1,x from generate_Series(1,1000000) x;
create index listp_a_b_idx on listp (a,b);
vacuum analyze listp;
explain analyze select * from listp where a = 1 and b = 1;

the "a = 1" will be dropped and the index on (a,b) does not get used.

Patched results in:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Append (cost=0.00..16925.01 rows=1 width=8) (actual
time=0.019..169.231 rows=1 loops=1)
-> Seq Scan on listp1 (cost=0.00..16925.00 rows=1 width=8)
(actual time=0.017..169.228 rows=1 loops=1)
Filter: (b = 1)
Rows Removed by Filter: 999999
Planning Time: 0.351 ms
Execution Time: 169.257 ms
(6 rows)

Whereas unpatched gets:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660
rows=1 loops=1)
-> Index Only Scan using listp1_a_b_idx on listp1
(cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1
loops=1)
Index Cond: ((a = 1) AND (b = 1))
Heap Fetches: 0
Planning Time: 32.303 ms
Execution Time: 0.826 ms
(6 rows)

so I was probably wrong about suggesting set_append_rel_size() as a
good place to remove these quals. It should perhaps be done later, or
maybe we can add some sort of marker to the qual to say it does not
need to be enforced during execution. Probably the former would be
best as we don't want to show these in EXPLAIN.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-10-07 23:18:15 Re: executor relation handling
Previous Message Tom Lane 2018-10-07 19:40:51 Re: Performance improvements for src/port/snprintf.c