Re: multivariate statistics (v25)

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: multivariate statistics (v25)
Date: 2017-03-31 15:25:12
Message-ID: CAKJS1f8MqodC1JCzTKAyyx9OVo31VH_a9cHkTdKHifw-gVSrMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 31 March 2017 at 21:18, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> > When adding these two parameters I had 2nd thoughts that the
> "tryextstats"
> > was required at all. We could just have this controlled by if the rel is
> a
> > base rel of kind RTE_RELATION. I ended up having to pass these parameters
> > further, down to clauselist_selectivity's singleton couterpart,
> > clause_selectivity(). This was due to clause_selectivity() calling
> > clauselist_selectivity() for some clause types. I'm not entirely sure if
> > this is actually required, but I can't see any reason for it to cause
> > problems.
>
> I understand that the reason for tryextstats is that the two are
> perfectly correlating but caluse_selectivity requires the
> RelOptInfo anyway. Some comment about that may be reuiqred in the
> function comment.

hmm, you could say one is functionally dependant on the other. I did
consider removing it, but it seemed weird to pass a NULL relation when we
dont want to attempt to use extended stats.

> Some random comments by just looking on the patch:
>
> ======
> The name of the function "collect_ext_attnums", and
> "clause_is_ext_compatible" seems odd since "ext" doesn't seem to
> be a part of "extended statistics". Some other names looks the
> same, too.
>

I agree. I've made some changes to the patch to change how the functional
dependency estimations are applied. I've removed most of the code from
clausesel.c and put it into dependencies.c. In doing so I've removed some
of the inefficiencies that were in the patch. For example
clause_is_ext_compatible() was being called many times on the same clause
at different times. I've now nailed that down to just once per clause.

> Something like "collect_e(xt)stat_compatible_attnums" and
> "clause_is_e(xt)stat_compatible" seem better to me.
>
>
Changed to dependency_compatible_clause(), since this was searching for
equality clauses in the form Var = Const, or Const = Var. This seems
specific to the functional depdencies checking. A multivariate histogram
won't want the same.

> ======
> The following comment seems something wrong.
>
> + * When applying functional dependencies, we start with the strongest ones
> + * strongest dependencies. That is, we select the dependency that:
>
> ======
> dependency_is_fully_matched() is not found. Maybe some other
> patches are assumed?
>
> ======
> + /* see if it actually has the right */
> + ok = (NumRelids((Node *) expr) == 1) &&
> + (is_pseudo_constant_clause(lsecond(expr->args)) ||
> + (varonleft = false,
> + is_pseudo_constant_clause(
> linitial(expr->args))));
> +
> + /* unsupported structure (two variables or so) */
> + if (!ok)
> + return true;
>
> Ok is used only here. I don't think seeming-expressions with side
> effect is not good idea here.
>
>
I thought the same, but I happened to notice that Tomas must have taken it
from clauselist_selectivity().

> ======
> + switch (get_oprrest(expr->opno))
> + {
> + case F_EQSEL:
> +
> + /* equality conditions are compatible with
> all statistics */
> + break;
> +
> + default:
> +
> + /* unknown estimator */
> + return true;
> + }
>
> This seems somewhat stupid..
>

I agree. Changed.

I've attached an updated patch.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
mv_functional-deps_2017-04-01.patch application/octet-stream 98.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-03-31 15:29:24 Re: REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)
Previous Message Tom Lane 2017-03-31 15:25:07 Re: Variable substitution in psql backtick expansion