Re: help understanding create statistic

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Luca Ferrari <fluca1978(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: help understanding create statistic
Date: 2018-06-28 10:06:32
Message-ID: CAKJS1f-Zw24J9FEk=4UnpneMkXd4LBed=ayW8XTH-sxLRoE11g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 28 June 2018 at 21:38, Luca Ferrari <fluca1978(at)gmail(dot)com> wrote:
> CREATE INDEX idx_year
> ON expenses ( EXTRACT( year FROM day ) );
>
> why is the planner not choosing to use such index on a 'year' raw query?
>
> EXPLAIN SELECT * FROM expenses
> WHERE year = 2016;

The expression in the where clause must match the indexed expression.
You'd need to add an index on just (year) for that to work.

> Am I misunderstaing this functional dependency?

Yeah, the statistics are just there to drive the planner's costing.
They won't serve as proof for anything else.

All you've done by creating those stats is to allow better estimates
for queries such as:

SELECT * FROM expenses WHERE day = '2018-06-28' and year = 2018;

> stxdependencies | {"3 => 5": 1.000000}

It would appear that "3" is the attnum for day and "5" is year. All
that tells the planner is that on the records sampled during analyze
is that each "day" had about exactly 1 year.

There's nothing then to stop you going and adding a record with the
day '2017-01-01' and the year 2018. The stats will remain the same
until you analyze the table again.

If those stats didn't exist, the planner would have multiplied the
selectivity estimates of each item in the WHERE clause individually.
So if about 10% of records had year=2018, and 0.01% had '2018-06-28',
then the selectivity would have been 0.1 * 0.001. With a functional
dependency of 1, the selectivity just becomes 0.001.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-general by date

  From Date Subject
Next Message joby.john@nccgroup.trust 2018-06-28 10:10:28 Re: Database name with semicolon
Previous Message Luca Ferrari 2018-06-28 09:48:13 Re: plperl and plperlu language extentsions