From: | Luca Ferrari <fluca1978(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | help understanding create statistic |
Date: | 2018-06-28 09:38:51 |
Message-ID: | CAKoxK+6C8CKdbYbbyNeYnc5aiDk=G-k-iDyDZMcmjJATqkLM9w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi all,
in order to better understand this feature of 10, I've created a table
like this:
CREATE TABLE expenses(
...
day date,
year int,
CHECK( year = EXTRACT( year FROM day ) )
);
so that I can ensure 'year' and 'day' are tied together:
SELECT
count(*) FILTER( WHERE year = 2016 ) AS by_year,
count(*) FILTER( WHERE EXTRACT( year FROM day ) = 2016 ) AS by_day
FROM expenses;
-[ RECORD 1 ]-
by_year | 8784
by_day | 8784
Then I created a statistic:
CREATE STATISTICS stat_day_year ( dependencies )
ON day, year
FROM expenses;
select * from pg_statistic_ext ;
-[ RECORD 1 ]---+---------------------
stxrelid | 42833
stxname | stat_day_year
stxnamespace | 2200
stxowner | 16384
stxkeys | 3 5
stxkind | {f}
stxndistinct |
stxdependencies | {"3 => 5": 1.000000}
Now, having an index on the extract year of day as follows:
CREATE INDEX idx_year
ON expenses ( EXTRACT( year FROM day ) );
why is the planner not choosing to use such index on a 'year' raw query?
EXPLAIN SELECT * FROM expenses
WHERE year = 2016;
QUERY PLAN
-------------------------------------------------------------------------------
Gather (cost=1000.00..92782.34 rows=8465 width=32)
Workers Planned: 2
-> Parallel Seq Scan on expenses (cost=0.00..90935.84 rows=3527 width=32)
Filter: (year = 2016)
The number of rows are correct, but I was expecting it to use the same
index as a query like "WHERE EXTRACT( year FROM day) = 2016" triggers.
Even altering the year column to not null does show any change, and
this is the plan obtained turning off seq_scan (to see the costs):
EXPLAIN ANALYZE SELECT * FROM expenses
WHERE year = 2016;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on expenses (cost=10000000000.00..10000127402.44 rows=8451
width=32) (actual time=972.734..2189.300 rows=8784 loops=1)
Filter: (year = 2016)
Rows Removed by Filter: 4991216
Am I misunderstaing this functional dependency?
Thanks,
Luca
From | Date | Subject | |
---|---|---|---|
Next Message | Mathieu PUJOL | 2018-06-28 09:45:26 | Analyze plan of foreign data wrapper |
Previous Message | Pablo Hendrickx | 2018-06-28 07:45:30 | Re: Example setup for Odyssey connection pooler? |