Re: between not propagated into a simple equality join

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Benedikt Grundmann <bgrundmann(at)janestreet(dot)com>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: between not propagated into a simple equality join
Date: 2016-05-10 04:34:57
Message-ID: CAKFQuwbHAD_KgaWbq6q6_RvnV0FEjGKhYCS28BBXSkGOo=Fjyg@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 9, 2016 at 8:53 AM, Benedikt Grundmann <
bgrundmann(at)janestreet(dot)com> wrote:

> We just ran into a very simple query that the planner does much worse on
> than we expected (in production the table in question is ~ 100
> GB). It surprised us, given that the planner is generally quite good, so I
> thought I'd share our surprise.
>
> Setup:
>
> postgres_prod(at)proddb_testing=# select version();
> version
>
>
> ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> PostgreSQL 9.2.16 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
> 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
> (1 row)
>
> Time: 69.246 ms
>
> postgres_prod(at)proddb_testing=# create table toy_data3 (the_date date, i
> int);
> CREATE TABLE
> Time: 67.096 ms
> postgres_prod(at)proddb_testing=# insert into toy_data3
>
> (select current_date-(s.idx/1000), s.idx from generate_series(1,1000000)
> as s(idx));
> INSERT 0 1000000
> Time: 1617.483 ms
> postgres_prod(at)proddb_testing=# create index toy_data_date3 on
> toy_data3(the_date);
> CREATE INDEX
> Time: 660.166 ms
> postgres_prod(at)proddb_testing=# analyze toy_data3;
> ANALYZE
> Time: 294.984 ms
>
> The bad behavior:
>
> postgres_prod(at)proddb_testing=# explain analyze
> select * from (
> select td1.the_date, td1.i
> from toy_data3 td1, toy_data3 td2 where td1.the_date = td2.the_date
> and td1.i = td2.i
> ) foo
> where the_date between current_date and current_date;
> QUERY
> PLAN
>
> ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Hash Join (cost=55.49..21980.50 rows=1 width=8) (actual
> time=0.336..179.374 rows=999 loops=1)
> Hash Cond: ((td2.the_date = td1.the_date) AND (td2.i = td1.i))
> -> Seq Scan on toy_data3 td2 (cost=0.00..14425.00 rows=1000000
> width=8) (actual time=0.007..72.510 rows=1000000 loops=1)
> -> Hash (cost=40.44..40.44 rows=1003 width=8) (actual
> time=0.321..0.321 rows=999 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 40kB
> -> Index Scan using toy_data_date3 on toy_data3 td1
> (cost=0.01..40.44 rows=1003 width=8) (actual time=0.018.
> Index Cond: ((the_date >= ('now'::cstring)::date) AND
> (the_date <= ('now'::cstring)::date))
> Total runtime: 179.440 ms
> (8 rows)
>
> Time: 246.094 ms
>
> Notice the sequential scan on td2 (highlighted in red in the original
> message). Which is sad, because one would like the planner to realize that
> it could propagate the index constraint onto td2; that is, apply the index
> condition (highlighted in green) on both sides of the join.
>
>
FWIW, this is my plan result:
version
PostgreSQL 9.5.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit
All default settings

using "BETWEEN"


QUERY PLAN
Nested Loop (cost=0.86..48.91 rows=1 width=8) (actual time=0.042..168.512
rows=999 loops=1)
-> Index Scan using toy_data_date3 on toy_data3 td1 (cost=0.43..8.46
rows=1 width=8) (actual time=0.022..1.388 rows=999 loops=1)
Index Cond: ((the_date >= ('now'::cstring)::date) AND (the_date <=
('now'::cstring)::date))
-> Index Scan using toy_data_date3 on toy_data3 td2 (cost=0.42..40.44
rows=1 width=8) (actual time=0.078..0.160 rows=1 loops=999)
Index Cond: (the_date = td1.the_date)
Filter: (td1.i = i)
Rows Removed by Filter: 998
Planning time: 0.353 ms
Execution time: 169.692 ms

using "="

QUERY PLAN
Hash Join (cost=49.89..90.46 rows=1 width=8) (actual time=2.320..5.652
rows=999 loops=1)
Hash Cond: (td1.i = td2.i)
-> Index Scan using toy_data_date3 on toy_data3 td1 (cost=0.43..37.37
rows=967 width=8) (actual time=0.014..1.168 rows=999 loops=1)
Index Cond: (the_date = ('now'::cstring)::date)
-> Hash (cost=37.37..37.37 rows=967 width=8) (actual time=2.292..2.292
rows=999 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 48kB
-> Index Scan using toy_data_date3 on toy_data3 td2
(cost=0.43..37.37 rows=967 width=8) (actual time=0.008..1.183 rows=999
loops=1)
Index Cond: (the_date = ('now'::cstring)::date)
Planning time: 0.326 ms
Execution time: 6.673 ms
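
For reference, the "=" plan above was presumably produced by replacing the
BETWEEN with a plain equality - my reconstruction, not quoted from the
thread:

```sql
-- Reconstructed variant: with a plain equality on the_date, the planner
-- propagates the restriction to both td1 and td2, as the identical
-- Index Cond on both index scans in the plan above shows.
explain analyze
select * from (
  select td1.the_date, td1.i
  from toy_data3 td1, toy_data3 td2
  where td1.the_date = td2.the_date
    and td1.i = td2.i
) foo
where the_date = current_date;
```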

I was hoping to be able to say more but alas cannot find the words.

I'm surprised by the estimate of 1 row for the td1 index scan in my 9.5
query - and also by the 9.2 planner's choice of a sequential-scan hash join
over what seems to be a superior nested loop of index scans touching only a
fraction of the table.

The fact that the BETWEEN doesn't get transitively applied to td2 through
the td1.the_date = td2.the_date condition surprises me less, though whether
the limitation is one of theory or of implementation I do not know.
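
Until the planner does this itself, the obvious workaround (a sketch against
the same toy schema) is to write the range condition against both sides
explicitly:

```sql
-- Manually propagate the range restriction to td2 as well, so both
-- sides of the join can use the toy_data_date3 index.
explain analyze
select td1.the_date, td1.i
from toy_data3 td1, toy_data3 td2
where td1.the_date = td2.the_date
  and td1.i = td2.i
  and td1.the_date between current_date and current_date
  and td2.the_date between current_date and current_date;
```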

I do suspect that at least part of the issue is that the computation of
"the_date" makes the two columns highly correlated while the planner
assumes independence.
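
Neither 9.2 nor 9.5 offers a way to tell the planner about that correlation,
but for what it's worth, later releases (PostgreSQL 10 and up) added extended
statistics for exactly this situation:

```sql
-- PostgreSQL 10+ only (newer than the versions discussed here):
-- extended statistics let ANALYZE record the functional dependency
-- between the_date and i instead of assuming independence.
create statistics toy_data3_dep (dependencies) on the_date, i from toy_data3;
analyze toy_data3;
```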

David J.
